The Art of Connectivity Mapping

1 The Art of Connectivity Mapping. Avi Ma'ayan, PhD; Professor, Department of Pharmacological Sciences; Director, Mount Sinai Center for Bioinformatics; Icahn School of Medicine at Mount Sinai, New York, NY

2 $$$$$ + Data + Statistics + Wiz Kid $$$$$ + Expert Knowledge 2011

5 The Variety of Sources for Mammalian Molecular Data Collected from Cells, Tissues, Model Organisms and Patients

6 Cells, Tissues; Diseases, Phenotypes, Adverse Events; Genes, Proteins, Targets; Drugs, Small Molecules; Gene Sets, Modules, Pathways

7 Ma'ayan et al. Trends Pharmacol Sci Sep;35(9):

8 The Harmonizome Project. Rouillard et al. Database (Oxford) Jul 3;

9 Harmonizome Google Analytics

10 Enrichr: a search engine for gene sets
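
A gene set search of this kind is commonly scored with an over-representation test. The sketch below ranks the terms of a gene set library against a query list using a one-sided Fisher exact (hypergeometric) test. It is only an illustration: the function names, the toy library, and the default background size of 20,000 genes are assumptions made here, not Enrichr's actual API or settings.

```python
from math import comb


def overlap_p_value(query, gene_set, background_size):
    """One-sided Fisher exact (hypergeometric) p-value for the overlap
    between a query gene list and one library gene set."""
    query, gene_set = set(query), set(gene_set)
    k = len(query & gene_set)   # observed overlap
    n = len(query)              # query size
    K = len(gene_set)           # library gene set size
    N = background_size         # assumed number of genes in the background
    # Probability of seeing an overlap of k or more by chance.
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)


def rank_library(query, library, background_size=20000):
    """Return (term, p-value) pairs sorted from most to least significant."""
    scored = [(term, overlap_p_value(query, genes, background_size))
              for term, genes in library.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy library with made-up gene symbols and term names (hypothetical data).
toy_library = {
    "term_1": {"A", "B", "D", "E"},
    "term_2": {"X", "Y"},
}
ranking = rank_library({"A", "B", "C"}, toy_library, background_size=10)
```

In practice the p-values would also be corrected for multiple testing across all terms in the library, which this sketch leaves out.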

12 Enrichr Google Analytics

13 Machine Learning is the Obvious Next Step [Figure: matrices with rows labeled Genes, grouped as Transcriptomics, Proteomics, and Pathways, with column groups labeled Tissues and Perturbations; column labels include GO biological process terms (such as Viral Transcription and Translational Termination) and sample codes (such as DTP_B_3 and SUN.URS_B_3). These are combined as f (T, P, I)? to fill a Knockout Phenotypes matrix of [ n phenotypes ] whose entries are Observed, Not Observed, or Unknown. Inset axes: Value, Density. Model: Deep Learning (ANN with many layers).]

14 Predicting Side Effects with LINCS L1000 Data Wang Z, Clark NR, Ma'ayan A. Bioinformatics Aug 1;32(15):

19 Fireworks Visualization of 17,041 Drug-Induced Signatures: 3,713 drugs/compounds; 63 cell lines; 3 time points; 51 dosages; 17,041 significant drug-induced signatures

33 Predicting Patient Age from Vital Signs and Lab Tests with Deep Learning J Biomed Inform Dec;76:59-68.

34 Network Analysis in Systems Biology Course on Coursera

35 GEO2Enrichr: browser extension and server app to extract gene sets from GEO and analyze them for biological functions Bioinformatics Sep 15;31(18):

38 Microtask Signature Extraction Project: Using GEO2Enrichr to Extract and Annotate Signatures from GEO
- Single Gene Perturbations (n=2225)
- Drug/Toxin Perturbations (n=1319)
- Disease vs. Normal Signatures (n=1105)
Nature Communications 7, (2016)

40 ARCHS4: >300K Processed RNA-seq Samples from GEO for Human and Mouse Nat Commun Apr 10;9(1):1366.

41 ARCHS4: >177K Processed RNA-seq Samples from GEO for Human and Mouse Nat Commun Apr 10;9(1):1366.

42 ARCHS4 Chrome Extension

43 BioJupies Chrome Extension

44 BioJupies

48 Summary
- Data abstraction for data integration
- The Harmonizome and Enrichr resources
- Next step is Machine Learning
- L1000 data processing and visualization
- Predicting MOA for L1000 small molecules
- Predicting the predictability of novel small molecules
- Integration with EMR based on drugs and vital signs
- Predicting gene function with ARCHS4
- ARCHS4 and BioJupies to facilitate data reuse, data integration, and signature extraction

49 NIH Support U54-HL (LINCS-DCIC), U24-CA (IDG-KMC), and OT3-OD (NIH Data Commons)