Network data from OmniPath

Marton Olbei

10/12/2020

In this tutorial we show how to query interactions from the many resources included in OmniPath, restrict the data to specific interaction types, and apply further quality filters.

We start by importing the libraries: omnipath to query the database, and pandas for data wrangling.

In [2]:
import omnipath as op
import pandas as pd

OmniPath contains many resources from which we can collate our desired network. To browse the list of available resources, call the op.interactions.AllInteractions.resources() function.

In [4]:
op.interactions.AllInteractions.resources()
Out[4]:
('ABS',
 'ACSN',
 'ACSN_SignaLink3',
 'ARACNe-GTEx_DoRothEA',
 'ARN',
 'Adhesome',
 'AlzPathway',
 'BEL-Large-Corpus_ProtMapper',
 'Baccin2019',
 'BioGRID',
 'BioGRID_ICELLNET',
 'CA1',
 'CancerCellMap',
 'CellPhoneDB',
 'CellPhoneDB_ICELLNET',
 'DEPOD',
 'DIP',
 'DOMINO',
 'DeathDomain',
 'Dinarello2013_ICELLNET',
 'DoRothEA',
 'DoRothEA-reviews_DoRothEA',
 'ELM',
 'EMBRACE',
 'ENCODE-distal',
 'ENCODE-proximal',
 'ENCODE_tf-mirna',
 'FANTOM4_DoRothEA',
 'Fantom5_LRdb',
 'GO-lig-rec_ICELLNET',
 'Guide2Pharma',
 'Guide2Pharma_CellPhoneDB',
 'Guide2Pharma_ICELLNET',
 'Guide2Pharma_LRdb',
 'HOCOMOCO_DoRothEA',
 'HPMR',
 'HPMR_ICELLNET',
 'HPMR_LRdb',
 'HPRD',
 'HPRD-phos',
 'HPRD_KEA',
 'HPRD_LRdb',
 'HPRD_MIMP',
 'HTRIdb',
 'HTRIdb_DoRothEA',
 'HuRI',
 'I2D_CellPhoneDB',
 'ICELLNET',
 'IMEx_CellPhoneDB',
 'InnateDB',
 'InnateDB-All_CellPhoneDB',
 'InnateDB_CellPhoneDB',
 'InnateDB_ICELLNET',
 'InnateDB_SignaLink3',
 'IntAct',
 'IntAct_CellPhoneDB',
 'IntAct_DoRothEA',
 'JASPAR_DoRothEA',
 'KEA',
 'KEGG-MEDICUS',
 'Kinexus_KEA',
 'Kirouac2010',
 'Kirouac2010_ICELLNET',
 'LMPID',
 'LRdb',
 'Li2012',
 'Lit-BM-17',
 'LncRNADisease',
 'MIMP',
 'MINT_CellPhoneDB',
 'MPPI',
 'Macrophage',
 'Macrophage_ICELLNET',
 'MatrixDB',
 'MatrixDB_CellPhoneDB',
 'NCI-PID_ProtMapper',
 'NFIRegulomeDB_DoRothEA',
 'NRF2ome',
 'NetPath',
 'NetworKIN_KEA',
 'ORegAnno',
 'ORegAnno_DoRothEA',
 'PAZAR',
 'PAZAR_DoRothEA',
 'PhosphoNetworks',
 'PhosphoPoint',
 'PhosphoSite',
 'PhosphoSite_KEA',
 'PhosphoSite_MIMP',
 'PhosphoSite_ProtMapper',
 'PhosphoSite_noref',
 'ProtMapper',
 'REACH_ProtMapper',
 'RLIMS-P_ProtMapper',
 'Ramilowski2015',
 'Ramilowski2015_Baccin2019',
 'Ramilowski2015_ICELLNET',
 'ReMap_DoRothEA',
 'Reactome_ICELLNET',
 'Reactome_LRdb',
 'Reactome_ProtMapper',
 'Reactome_SignaLink3',
 'RegNetwork_DoRothEA',
 'SIGNOR',
 'SIGNOR_ICELLNET',
 'SIGNOR_ProtMapper',
 'SPIKE',
 'SPIKE_ICELLNET',
 'STRING_ICELLNET',
 'SignaLink3',
 'SignaLink3_ICELLNET',
 'Sparser_ProtMapper',
 'TCRcuration_SignaLink3',
 'TFactS_DoRothEA',
 'TFe_DoRothEA',
 'TRED_DoRothEA',
 'TRIP',
 'TRRD_DoRothEA',
 'TRRUST_DoRothEA',
 'TransmiR',
 'UniProt_CellPhoneDB',
 'UniProt_LRdb',
 'Wang',
 'dbPTM',
 'iPTMnet',
 'iTALK',
 'lncrnadb',
 'miR2Disease',
 'miRDeathDB',
 'miRTarBase',
 'miRecords',
 'ncRDeathDB',
 'phosphoELM',
 'phosphoELM_KEA',
 'phosphoELM_MIMP')
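
With well over a hundred entries, the resource list is easier to work with once it is filtered. The sketch below uses plain Python on a small hand-copied subset of the names printed above, so it runs without a network connection; in a live session the tuple would come from op.interactions.AllInteractions.resources(). The helper name resources_matching is made up for this illustration.

```python
# A few names copied from the output above; in a live session use
# resources = op.interactions.AllInteractions.resources()
resources = (
    "BioGRID",
    "DoRothEA",
    "HTRIdb_DoRothEA",
    "JASPAR_DoRothEA",
    "PhosphoSite",
    "PhosphoSite_KEA",
    "SIGNOR",
    "SignaLink3",
)


def resources_matching(names, keyword):
    """Return the names that contain `keyword`, ignoring case."""
    keyword = keyword.lower()
    return [name for name in names if keyword in name.lower()]


dorothea_resources = resources_matching(resources, "dorothea")
print(dorothea_resources)
```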

OmniPath can serve multiple kinds of interactions, distinguished by the nature of the interactors or of the interactions themselves:

  • post_translational i.e. physical interactions of proteins, protein-protein interactions (or PPIs)
  • transcriptional i.e. gene regulatory interactions
  • post_transcriptional i.e. miRNA-mRNA interactions
  • mirna_transcriptional i.e. transcriptional regulation of miRNA genes
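
For orientation, the four interaction types can be matched to the classes of the Python client that this tutorial calls. The mapping below is inferred from the calls and outputs in this notebook rather than taken from the client's documentation, so treat it as an assumption; the dictionary QUERY_CLASS_BY_TYPE is purely illustrative.

```python
# Interaction type -> name of the class under op.interactions that this
# tutorial uses to query it (inferred from the notebook, not authoritative).
QUERY_CLASS_BY_TYPE = {
    "post_translational": "PostTranslational",
    "transcriptional": "Transcriptional",
    "post_transcriptional": "miRNA",
    "mirna_transcriptional": "TFmiRNA",
}

for interaction_type, class_name in QUERY_CLASS_BY_TYPE.items():
    print(f"{interaction_type:<22} op.interactions.{class_name}.get()")
```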

In the following code blocks we query each of them and show the URLs these queries generate; the same data can also be accessed through these URLs in a browser.

First, let's take a look at PPIs.

URL: https://omnipathdb.org/interactions?genesymbols=yes&datasets=omnipath,pathwayextra,kinaseextra,ligrecextra&organisms=9606&fields=sources,references,curation_effort&license=academic

By default, this query returns data from the omnipath dataset, i.e. literature-curated activity flow interactions, which are in most cases directed and signed and carry a curation effort value.

In [20]:
interactions = op.interactions.PostTranslational.get()
interactions
#
#RecursionError: maximum recursion depth exceeded
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-20-69bc68e6b6bc> in <module>
----> 1 interactions = op.interactions.PostTranslational.get()
      2 interactions
      3 #
      4 #RecursionError: maximum recursion depth exceeded

~/miniconda3/lib/python3.7/site-packages/omnipath/_core/requests/_utils.py in wrapper(wrapped, _instance, args, kwargs)
    102     @wrapt.decorator(adapter=wrapt.adapter_factory(argspec_factory))
    103     def wrapper(wrapped, _instance, args, kwargs):
--> 104         return wrapped(*args, **kwargs)
    105 
    106     if hasattr(clazz, "get") and not hasattr(clazz.get, "__wrapped__"):

~/miniconda3/lib/python3.7/site-packages/omnipath/_core/requests/interactions/_interactions.py in get(cls, exclude, **kwargs)
    432         %(get.returns)s
    433         """
--> 434         return cls(exclude=exclude).get(**kwargs)
    435 
    436 

... last 2 frames repeated, from the frame below ...

~/miniconda3/lib/python3.7/site-packages/omnipath/_core/requests/_utils.py in wrapper(wrapped, _instance, args, kwargs)
    102     @wrapt.decorator(adapter=wrapt.adapter_factory(argspec_factory))
    103     def wrapper(wrapped, _instance, args, kwargs):
--> 104         return wrapped(*args, **kwargs)
    105 
    106     if hasattr(clazz, "get") and not hasattr(clazz.get, "__wrapped__"):

RecursionError: maximum recursion depth exceeded
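
The query URL shown above is an ordinary set of key-value parameters appended to the interactions endpoint, so it can be assembled with the standard library alone. This is a minimal sketch: the function name build_interactions_url and its argument names are hypothetical, and the parameter values are copied from the URL in this section.

```python
from urllib.parse import urlencode

BASE_URL = "https://omnipathdb.org/interactions"


def build_interactions_url(datasets, fields, organism="9606",
                           license_type="academic"):
    """Assemble a query URL like the ones printed in this tutorial."""
    params = {
        "genesymbols": "yes",
        "datasets": ",".join(datasets),
        "organisms": organism,
        "fields": ",".join(fields),
        "license": license_type,
    }
    # safe="," keeps the commas readable instead of encoding them as %2C
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = build_interactions_url(
    datasets=["omnipath", "pathwayextra", "kinaseextra", "ligrecextra"],
    fields=["sources", "references", "curation_effort"],
)
print(url)
```

Pasting the printed URL into a browser should show the same table. Assuming the service answers with tab-separated text, it could presumably also be loaded with pd.read_csv(url, sep="\t").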

We can include interactions without explicit literature references as well, by including the extra datasets pathwayextra, kinaseextra, or ligrecextra.

To get just one of these extra sets, call the dedicated class for it:

URL: https://omnipathdb.org/interactions?genesymbols=yes&datasets=pathwayextra&organisms=9606&fields=sources,references,curation_effort&license=academic

In [24]:
interactions_pathwayextra = op.interactions.PathwayExtra.get()
interactions_pathwayextra
Out[24]:
source target is_directed is_stimulation is_inhibition consensus_direction consensus_stimulation consensus_inhibition dip_url curation_effort references sources references_stripped n_references n_sources n_primary_sources
0 P48995 Q13255 True False True False False False None 0 NaN Wang None None 1 1
1 Q13255 P48995 True True False True True False None 1 TRIP:14614461 TRIP 14614461 1 1 1
2 P20591 Q9Y210 True True False True True False None 3 HPRD:15757897;Lit-BM-17:15757897;TRIP:15757897 HPRD;Lit-BM-17;TRIP;Wang 15757897 1 4 4
3 O60500 Q9Y210 True True False True True False None 2 TRIP:15924139;TRIP:22155451 TRIP;Wang 15924139;22155451 2 2 2
4 Q13976 Q9Y210 True False True True False True None 5 PhosphoSite:19961855;PhosphoSite:23645677;Prot... MIMP;PhosphoSite;PhosphoSite_MIMP;PhosphoSite_... 18617565;19961855;21402151;23645677;24740790 5 9 4
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
41812 P52789 Q07812 True False True True False True None 0 NaN KEGG-MEDICUS None None 1 1
41813 O15264 Q96BA8 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1
41814 COMPLEX:P49023_Q05397_Q13418 COMPLEX:O43707_P18206_Q9Y4G6 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1
41815 COMPLEX:P29466_Q96P20_Q9ULZ3 P01584 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1
41816 Q02750 O15264 True True False False False False None 0 NaN KEGG-MEDICUS None None 1 1

41817 rows × 16 columns

In [26]:
interactions_pathwayextra_citations = interactions_pathwayextra[
    interactions_pathwayextra['curation_effort'] >= 7
]
interactions_pathwayextra_citations
Out[26]:
source target is_directed is_stimulation is_inhibition consensus_direction consensus_stimulation consensus_inhibition dip_url curation_effort references sources references_stripped n_references n_sources n_primary_sources
7 P12931 Q8NER1 True True False True True False None 8 PhosphoSite:16319926;ProtMapper:16319926;ProtM... MIMP;NCI-PID_ProtMapper;PhosphoSite;PhosphoSit... 15084474;16319926;17582331;24717323;25970319;3... 6 10 5
14 P49137 Q16539 True True False False False False None 24 BioGRID:17395714;ELM:23047924;ELM:25255283;HPR... BioGRID;ELM;HPRD;InnateDB;IntAct;Lit-BM-17;Pho... 10581204;10922375;11042204;17255097;17395714;2... 18 8 8
15 Q16539 P49137 True True False True True False None 60 ACSN:11274345;ACSN:12738796;ACSN:15187187;ACSN... ACSN;BEL-Large-Corpus_ProtMapper;BioGRID;CA1;E... 10581204;10922375;11042204;11274345;11551945;1... 34 35 23
16 O60674 P19235 True True False True True False None 19 BioGRID:8343951;HPRD-phos:12441334;HPRD:117795... BEL-Large-Corpus_ProtMapper;BioGRID;HPRD;HPRD-... 10579919;10660611;11443118;11779507;12027890;1... 12 20 12
17 P19235 O60674 True True False False False False None 8 BioGRID:8343951;HPRD:11779507;HPRD:12441334;HP... BioGRID;HPRD;SignaLink3;Wang 11779507;12441334;18160720;23331499;8343951 5 4 4
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
40511 Q16665 P17948 True True False True True False None 15 ACSN:10403805;ACSN:11528470;ACSN:11566883;ACSN... ACSN;Wang 10403805;11528470;11566883;12080085;12829734;1... 15 2 2
40573 P27540 Q15672 True True False True True False None 21 ACSN:10403805;ACSN:11566883;ACSN:12080085;ACSN... ACSN;Wang 10403805;11566883;12080085;12829734;13130303;1... 21 2 2
40743 Q9UJX5 O95997 True True True True True False None 19 ACSN:10477750;ACSN:11402067;ACSN:11535616;ACSN... ACSN;Wang 10477750;11402067;11535616;12640463;15024386;1... 19 2 2
41009 Q9UJU2 O43623 True True False True True False None 19 ACSN:11955436;ACSN:11967263;ACSN:12051714;ACSN... ACSN;Wang 11955436;11967263;12051714;12490555;14623871;1... 19 2 2
41050 P49841 Q00534 True False True True False True None 13 ACSN:10385618;ACSN:10486203;ACSN:11124803;ACSN... ACSN;Wang 10385618;10486203;11124803;11152665;12459251;1... 13 2 2

3426 rows × 16 columns

We can use the columns of this table to narrow down the result further, for example by curation effort, as in the filter above. The curation_effort value counts the unique database-citation pairs of an interaction, i.e. how many times the interaction was described in a paper and recorded in a database.
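
To make the definition concrete, the sketch below recomputes the counts from the references strings printed in the table. It illustrates the definition only and is not OmniPath's own implementation; the two function names are chosen here for readability, and missing references are counted as zero.

```python
def curation_effort(references):
    """Count unique database:citation pairs in a references string."""
    if not isinstance(references, str) or not references:
        return 0  # missing references (NaN/None) count as zero here
    return len(set(references.split(";")))


def n_references(references):
    """Count unique citations, regardless of which database reported them."""
    if not isinstance(references, str) or not references:
        return 0
    # Split on the first colon only; the part after it is the citation ID.
    return len({pair.split(":", 1)[1] for pair in references.split(";")})


# Reference strings copied from rows 2 and 3 of the pathwayextra table above
row2 = "HPRD:15757897;Lit-BM-17:15757897;TRIP:15757897"
row3 = "TRIP:15924139;TRIP:22155451"

print(curation_effort(row2), n_references(row2))
print(curation_effort(row3), n_references(row3))
```

For row 2, one paper reported by three databases gives three pairs but a single reference; for row 3, two papers from one database give two of each. Both agree with the curation_effort and n_references columns shown for those rows.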

To get all interactions, not only PPIs, call op.interactions.AllInteractions.get(). By default only directed interactions are included, but we can pass a flag to import undirected interactions as well.

URL: https://omnipathdb.org/interactions?genesymbols=yes&fields=sources,references&datasets=omnipath,pathwayextra,kinaseextra,ligrecextra&directed=no

In [35]:
op.interactions.AllInteractions.get(directed=False, organism='human')

/Users/olbeim/miniconda3/lib/python3.7/site-packages/omnipath/_core/requests/interactions/_interactions.py:377: DtypeWarning: Columns (8) have mixed types.Specify dtype option on import or set low_memory=False.
  return cls(include, exclude=exclude)._get(**kwargs)
Out[35]:
source target is_directed is_stimulation is_inhibition consensus_direction consensus_stimulation consensus_inhibition dip_url curation_effort references sources type references_stripped n_references n_sources n_primary_sources
0 P0DP24 P48995 True False True True False True None 3 TRIP:11290752;TRIP:11983166;TRIP:12601176 TRIP post_translational 11290752;11983166;12601176 3 1 1
1 Q03135 P48995 True True False True True False http://dip.doe-mbi.ucla.edu/dip/DIPview.cgi?IK... 13 DIP:19897728;HPRD:12732636;IntAct:19897728;Lit... DIP;HPRD;IntAct;Lit-BM-17;TRIP post_translational 10980191;12732636;14551243;16822931;18430726;1... 8 5 5
2 P14416 P48995 True True False True True False None 1 TRIP:18261457 TRIP post_translational 18261457 1 1 1
3 Q02790 P48995 True False True True False True None 3 TRIP:15199065;TRIP:19945390;TRIP:23228564 TRIP post_translational 15199065;19945390;23228564 3 1 1
4 P48995 Q86YM7 False False False False False False None 4 HPRD:14505576;TRIP:14505576;TRIP:16905188;TRIP... HPRD;TRIP post_translational 14505576;16905188;22506990 3 2 2
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
176859 UCA1 P49715 True False False False False False None 1 ncRDeathDB:24648007 ncRDeathDB lncrna_post_transcriptional 24648007 1 1 1
176860 UCA1 Q16665 True False False False False False None 1 ncRDeathDB:24737584 ncRDeathDB lncrna_post_transcriptional 24737584 1 1 1
176861 URH Q9NYL2 True False False False False False None 1 ncRDeathDB:25013376 ncRDeathDB lncrna_post_transcriptional 25013376 1 1 1
176862 Xist P26358 True False False False False False None 1 ncRDeathDB:8769643 ncRDeathDB lncrna_post_transcriptional 8769643 1 1 1
176863 Xist Q01860 True False False False False False None 1 ncRDeathDB:24945968 ncRDeathDB lncrna_post_transcriptional 24945968 1 1 1

176864 rows × 17 columns
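
The effect of the flag can be checked in the returned table through the is_directed column, and the type column shows which kinds of interactions came back. The sketch below does this in plain Python on three rows abridged from the output above, so it runs without querying the server; the variable names are illustrative.

```python
from collections import Counter

# Three rows abridged from the output above (only the columns used here)
rows = [
    {"source": "P0DP24", "target": "P48995", "is_directed": True,
     "type": "post_translational"},
    {"source": "P48995", "target": "Q86YM7", "is_directed": False,
     "type": "post_translational"},
    {"source": "UCA1", "target": "P49715", "is_directed": True,
     "type": "lncrna_post_transcriptional"},
]

# Rows that are only present because undirected interactions were requested
undirected = [row for row in rows if not row["is_directed"]]
# Number of rows per interaction type
type_counts = Counter(row["type"] for row in rows)

print(len(undirected))
print(dict(type_counts))
```

On the full DataFrame, here called df, the same checks would be df[~df["is_directed"]] and df["type"].value_counts(), assuming is_directed is stored as a boolean column.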

The other interaction types have their own query classes as well. The following query accesses transcriptional interactions from DoRothEA, which come with confidence levels from A (highest) to D (lowest). By default only levels A and B are returned, but we can extend the selection.

URL: https://omnipathdb.org/interactions?genesymbols=yes&fields=sources,references&datasets=dorothea,tf_target&dorothea_levels=A,B,C,D

In [38]:
op.interactions.Transcriptional.get(dorothea_levels=["A", "B", "C"], organism='human')

Out[38]:
source target is_directed is_stimulation is_inhibition consensus_direction consensus_stimulation consensus_inhibition dip_url curation_effort references sources references_stripped n_references n_sources n_primary_sources
0 Q9H2P0 Q6VMQ6 True False False False False False None 0 NaN ARACNe-GTEx_DoRothEA;ReMap_DoRothEA None None 2 0
1 Q9H2P0 Q13627 True False False False False False None 0 NaN ARACNe-GTEx_DoRothEA;ReMap_DoRothEA None None 2 0
2 Q9H2P0 Q9UKI8 True False False False False False None 0 NaN ARACNe-GTEx_DoRothEA;ReMap_DoRothEA None None 2 0
3 Q9H2P0 Q5VZL5 True False False False False False None 0 NaN ARACNe-GTEx_DoRothEA;ReMap_DoRothEA None None 2 0
4 P35869 Q53QZ3 True False False False False False None 0 NaN ARACNe-GTEx_DoRothEA;ReMap_DoRothEA None None 2 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
90135 COMPLEX:P15336 P12004 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1
90136 COMPLEX:P01106_Q92993_Q9Y4A5 P30279 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1
90137 COMPLEX:O00268_Q8TEY5 Q9UBK2 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1
90138 Q14934 P35354 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1
90139 COMPLEX:P03372_Q15596 P08727 True True False True True False None 0 NaN KEGG-MEDICUS None None 1 1

90140 rows × 16 columns
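
When the confidence levels come from user input, it can help to validate them before building a query. The helper below is a hypothetical sketch, not part of the omnipath client: it accepts only the levels A to D named in the text, defaults to A and B as described above, and returns the comma-separated form used in the dorothea_levels URL parameter.

```python
VALID_LEVELS = ("A", "B", "C", "D")  # highest to lowest confidence


def dorothea_levels_param(levels=("A", "B")):
    """Validate confidence levels and format them for a query URL."""
    requested = {level.upper() for level in levels}
    unknown = requested - set(VALID_LEVELS)
    if unknown:
        raise ValueError(f"Unknown DoRothEA level(s): {sorted(unknown)}")
    # Keep the canonical A-to-D order regardless of the input order
    return ",".join(level for level in VALID_LEVELS if level in requested)


print(dorothea_levels_param())
print(dorothea_levels_param(["C", "a", "B"]))
```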

In [39]:
op.interactions.TFmiRNA.get()

Out[39]:
source target is_directed is_stimulation is_inhibition consensus_direction consensus_stimulation consensus_inhibition dip_url curation_effort references sources references_stripped n_references n_sources n_primary_sources
0 Q9UKV8 MIMAT0000646 True False True True False True None 1 TransmiR:24263100 TransmiR 24263100 1 1 1
1 Q9UKV8 MIMAT0004658 True False True True False True None 1 TransmiR:24263100 TransmiR 24263100 1 1 1
2 P35869 MIMAT0004672 True True False True True False None 1 TransmiR:24798859 TransmiR 24798859 1 1 1
3 P35869 MIMAT0000680 True True False True True False None 1 TransmiR:24798859 TransmiR 24798859 1 1 1
4 P35869 MIMAT0004594 True True False True True False None 2 TransmiR:25617893;TransmiR:26377202 TransmiR 25617893;26377202 2 1 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4974 Q15545 MIMAT0000075 True False False False False False None 0 NaN ENCODE_tf-mirna None None 1 0
4975 O60216 MIMAT0005867 True False False False False False None 0 NaN ENCODE_tf-mirna None None 1 0
4976 P23511 MIMAT0003308 True False False False False False None 0 NaN ENCODE_tf-mirna None None 1 0
4977 P15408 MIMAT0005867 True False False False False False None 0 NaN ENCODE_tf-mirna None None 1 0
4978 Q12824 MIMAT0004491 True False False False False False None 0 NaN ENCODE_tf-mirna None None 1 0

4979 rows × 16 columns

In [40]:
op.interactions.miRNA.get()

Out[40]:
source target is_directed is_stimulation is_inhibition consensus_direction consensus_stimulation consensus_inhibition dip_url curation_effort references sources references_stripped n_references n_sources n_primary_sources
0 MIMAT0000062 P01116 True False False False False False None 5 miRTarBase:15766527;miRTarBase:16651716;miRTar... miR2Disease;miRTarBase;miRecords;ncRDeathDB 15766527;16651716;20033209;24092860 4 4 4
1 MIMAT0000062 P52926 True False False False False False None 9 miRTarBase:17322030;miRTarBase:17600087;miRTar... miR2Disease;miRTarBase;miRecords;ncRDeathDB 17322030;17554199;17600087;18083101;18413822;1... 8 4 4
2 MIMAT0000062 P10415 True False False False False False None 0 NaN miR2Disease None None 1 1
3 MIMAT0000062 P01106 True False False False False False None 2 miRTarBase:16651716;miRTarBase:20033209 miR2Disease;miRTarBase 16651716;20033209 2 2 2
4 MIMAT0000062 P30304 True False False False False False None 0 NaN miR2Disease None None 1 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
8273 MIMAT0000617 Q13526 True False False False False False None 1 miRTarBase:24786790 miRTarBase 24786790 1 1 1
8274 MIMAT0000451 Q9NSE2 True False False False False False None 1 miRTarBase:23723424 miRTarBase 23723424 1 1 1
8275 MIMAT0000416 Q12797 True False False False False False None 1 miRTarBase:23723006 miRTarBase 23723006 1 1 1
8276 MIMAT0000449 Q96LI5 True False False False False False None 1 miRTarBase:23591815 miRTarBase 23591815 1 1 1
8277 MIMAT0000256 O14746 True False False False False False None 1 miRTarBase:25444904 miRTarBase 25444904 1 1 1

8278 rows × 16 columns

In this tutorial we learned:

  • The various interaction types in OmniPath
  • The differences between the encoded interaction types
  • How to access and query these interaction types