April 2015 tranSMART v1.2 Case Study for PredicTox

1 April 2015 tranSMART v1.2 Case Study for PredicTox

2 Agenda
- What is PredicTox?
- Brief tranSMART overview
- Answering scientific questions with tranSMART's help: a case study maximizing data value
- Questions?

3 PredicTox
- A public-private partnership, led by the Reagan-Udall Foundation for the FDA, with the goals of:
  - Applying systems-based approaches to better understand adverse events (AEs)
  - Developing predictive models
- Pilot project: one drug class and one AE
- Uses tranSMART as the platform for integrating clinical, preclinical, and molecular data

5 tranSMART Data Warehouse Structure
[Figure: warehouse structure diagram. Labels: analytical and visualization tools; security access (enterprise vs. project level); patient privacy; diverse data]
Source: Szalma, S.; Koka, VC.; Khasanova, T.; Perakslis, E. Effective knowledge management in translational medicine. Journal of Translational Medicine 2010, 8:68

6 tranSMART Data for PredicTox (so far)
- 18 gene expression data sets from GEO
- Human white blood cell data related to left ventricular dysfunction and the drugs imatinib, sunitinib, and trastuzumab
- Preclinical studies with gene expression data from heart tissue of rats dosed with imatinib
- These data sets may provide confirmatory gene expression profiles that differentiate left ventricular dysfunction from other cardiac disease
- Information gleaned from these data may provide mechanistic insight into the cardiotoxicity of tyrosine kinase inhibitors

8 Data
- GSE21125: Blood Signature of Pre-heart Failure: A Microarray Study (Smih et al.)
  - Human white blood cells from healthy subjects and from patients at risk of heart failure, with asymptomatic left ventricular dysfunction (ALVD), with chronic heart failure, or with acute heart failure
  - Platform: RNG-MRC_HU25k_NICE
  - PLoS ONE 6(6): e doi: /journal.pone
- GSE2535: In chronic myeloid leukemia white cells from cytogenetic responders and non-responders to imatinib have very similar gene expression signatures (Crossman et al.)
  - Analysis of peripheral blood and bone marrow of chronic myelogenous leukemia (CML) patients prior to imatinib (Gleevec) treatment. The study attempts to determine the transcriptional signature of imatinib non-responders.
  - Platform: Affymetrix Human Genome U95 Version 2 Array
  - Haematologica 2005; 90:

9 Analytical Rationale
- Imatinib is the first targeted therapy used to treat Philadelphia chromosome-positive CML
  - It targets and inhibits the catalytic activity of the constitutively active tyrosine kinase Bcr-Abl
  - It is also associated with reduced left ventricular ejection volume, indicative of left ventricular dysfunction
- With the data loaded into tranSMART, we investigated the correlation between gene expression signatures of patients with ALVD and those associated with imatinib response
  - Do these profiles have overlapping genes?
  - What are the functions of the overlapping genes?
  - Do these profiles show effects on similar pathways?
  - Can we apply the signature of one data set to cluster gene expression profiles from the other?

10 Analysis Workflow
[Workflow diagram. Labels: Marker Selection Analyses; Gene Lists; Pathway Enrichment Analysis; Clustering Analysis; Investigate the Role of Shared Genes; Mechanistic Similarities?; Hierarchical clustering using swapped gene list; Biomarkers?]

11 GSE2535 Marker Selection
- Calculates the most differentiating genes between two datasets
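A minimal sketch of what ranking the "most differentiating genes" can look like, assuming a per-gene Welch t-statistic between two sample groups. The deck does not say which statistic the tranSMART Marker Selection workflow uses, so the statistic, the function names (`welch_t`, `rank_markers`), the gene labels, and the toy values are all illustrative assumptions.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two lists of expression values."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_markers(expression, group_a, group_b, top_n=2):
    """Rank genes by absolute t-statistic between two sample groups.

    expression maps gene -> {sample_id: value}.
    Returns the top_n (gene, t) pairs, most differentiating first.
    """
    scores = {}
    for gene, values in expression.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        scores[gene] = welch_t(a, b)
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return [(g, scores[g]) for g in ranked[:top_n]]


# Toy values invented purely for illustration; not taken from GSE2535.
expression = {
    "GENE_A": {"c1": 8.0, "c2": 8.2, "c3": 7.9, "p1": 5.1, "p2": 5.0, "p3": 5.3},
    "GENE_B": {"c1": 6.0, "c2": 6.4, "c3": 5.8, "p1": 6.1, "p2": 5.9, "p3": 6.3},
    "GENE_C": {"c1": 3.0, "c2": 3.1, "c3": 2.9, "p1": 4.0, "p2": 4.2, "p3": 3.9},
}
group_one = ["c1", "c2", "c3"]
group_two = ["p1", "p2", "p3"]
top = rank_markers(expression, group_one, group_two)
```

Here `GENE_A` and `GENE_C` rank above `GENE_B`, whose group means barely differ. A real analysis would also need multiple-testing correction, which this sketch omits.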

12 Gene List Creation
- Gene lists gathered using the Marker Selection workflow in tranSMART were edited to remove control genes, repeats, unrecognized loci, and ORFs
- The GSE21125 Marker Selection list, comparing healthy controls with ALVD patients, yielded 54 recognizable gene symbols when loaded into tranSMART
- The GSE2535 Marker Selection list, comparing responders with non-responders, yielded 92 recognizable gene symbols when loaded into tranSMART
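The clean-up step described above can be sketched as a simple filter. The deck does not list the exact rules that were applied, so the three patterns below (Affymetrix `AFFX` control probe sets, `LOC`-style uncharacterized loci, and `CNorfM`-style open reading frames) and the function name `clean_gene_list` are assumptions chosen for illustration.

```python
import re

# Hypothetical filters; the actual editing rules are not given in the slides.
CONTROL_PROBE = re.compile(r"^AFFX", re.IGNORECASE)  # Affymetrix control probe sets
UNNAMED_LOCUS = re.compile(r"^LOC\d+$")              # uncharacterized loci
ORF_SYMBOL = re.compile(r"^C(\d+|X|Y)orf\d+$")       # open reading frames


def clean_gene_list(symbols):
    """Drop controls, repeats, unnamed loci and ORFs, keeping first-seen order."""
    seen = set()
    cleaned = []
    for raw in symbols:
        symbol = raw.strip()
        if not symbol or symbol in seen:
            continue  # skip blanks and repeated symbols
        if (CONTROL_PROBE.match(symbol)
                or UNNAMED_LOCUS.match(symbol)
                or ORF_SYMBOL.match(symbol)):
            continue  # skip entries that are not recognizable gene symbols
        seen.add(symbol)
        cleaned.append(symbol)
    return cleaned


# Illustrative input; only CACNA2D2 is a symbol mentioned in the deck.
raw_list = ["CACNA2D2", "AFFX-BioB-5_at", "LOC100128", "C1orf123",
            "CACNA2D2", " GENE_A "]
cleaned = clean_gene_list(raw_list)
```

Keeping first-seen order makes the cleaned list easy to compare against the original Marker Selection ranking.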

14 Venn Diagram: GSE21125 and GSE2535
- Compared gene lists from GSE21125 and GSE2535
- Shared gene: CACNA2D2
- CACNA2D2: voltage-dependent calcium channel
  - Homozygous mutation is associated with epileptic encephalopathy
  - Null mutants in mice display seizures, cardiac abnormalities, and premature death
  - Known to promote tumorigenesis; overexpression is associated with increased cell proliferation. Oncogene Jan 26. doi: /onc
- A PubMed search for CACNA2D2 and left ventricular dysfunction yielded no results
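The Venn comparison above amounts to set operations on the two gene lists. The sketch below assumes each list is a plain list of gene symbols; the function name `compare_gene_lists` is hypothetical, and every symbol other than CACNA2D2 is a placeholder rather than a gene from either study.

```python
def compare_gene_lists(list_a, list_b):
    """Return the three Venn regions for two gene lists as sorted lists."""
    a, b = set(list_a), set(list_b)
    return {
        "only_a": sorted(a - b),
        "shared": sorted(a & b),
        "only_b": sorted(b - a),
    }


# Placeholder symbols (GENE_*) stand in for the real 54- and 92-gene lists.
gse21125_genes = ["CACNA2D2", "GENE_A", "GENE_B"]
gse2535_genes = ["GENE_C", "CACNA2D2", "GENE_D"]
regions = compare_gene_lists(gse21125_genes, gse2535_genes)
```

With the real lists, `regions["shared"]` would be expected to contain only CACNA2D2, matching the slide.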

15 CACNA2D2 Expression
- GSE21125: CACNA2D2 down-regulated in patients with ALVD, p = 6.41 x 10^-6
- GSE2535: CACNA2D2 slightly down-regulated in patients unresponsive to imatinib treatment, p=
- Gene expression data for CACNA2D2 show a statistically significant difference in these comparisons

16 CACNA2D2 in GSE21125
- CACNA2D2 appears to strongly differentiate the investigated pathologies
- It is not part of the blood gene expression signature reported by Smih et al.
[Figure label: Pairwise t-test]

18 Pathway Enrichment Analysis
[Figure: pathway enrichment charts for GSE21125 (Smih et al.) and GSE2535 (Crossman et al.); axis label begins "-log"]
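For readers unfamiliar with enrichment scores on a "-log" axis, here is a minimal sketch assuming a one-sided hypergeometric test and a -log10(p) score. The deck does not state which enrichment method or tool produced the chart, so the test, the function name `enrichment_p_value`, and the toy numbers are assumptions.

```python
from math import comb, log10


def enrichment_p_value(universe, pathway_size, list_size, overlap):
    """P(X >= overlap) when drawing list_size genes from a universe
    that contains pathway_size pathway genes (hypergeometric tail)."""
    upper = min(pathway_size, list_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers for illustration only: 10 genes in total, 4 in the pathway,
# a 3-gene list, 2 of which fall in the pathway.
p = enrichment_p_value(universe=10, pathway_size=4, list_size=3, overlap=2)
score = -log10(p)  # larger score means stronger enrichment
```

Plotting `score` rather than `p` makes the most enriched pathways the longest bars.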

20 GSE21125 Hierarchical Clustering with GSE2535 Marker Selection List
- Clustering based on the GSE2535 gene list separates the Control profiles but does not effectively differentiate ALVD patients

21 GSE2535 Hierarchical Clustering with GSE21125 Marker Selection List
- Applying the GSE21125 Marker Selection list does not distinguish imatinib responders from non-responders
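The swapped-gene-list idea in the two clustering slides can be sketched as: restrict one data set to the genes selected from the other, then cluster the samples. The sketch below assumes average-linkage agglomerative clustering on Euclidean distance; the deck does not specify the linkage or distance used in tranSMART, and the function names, sample ids, and values are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def distance(u, v):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cluster_samples(expression, gene_list, n_clusters=2):
    """Average-linkage clustering of samples using only genes in gene_list.

    expression maps gene -> {sample_id: value}; genes absent from the
    data set are skipped.
    """
    genes = [g for g in gene_list if g in expression]
    samples = sorted(next(iter(expression.values())))
    profile = {s: [expression[g][s] for g in genes] for s in samples}

    def linkage(pair):
        # Mean pairwise distance between the members of two clusters.
        a, b = pair
        total = sum(distance(profile[s], profile[t]) for s in a for t in b)
        return total / (len(a) * len(b))

    clusters = [[s] for s in samples]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Toy data: GENE_A tracks the grouping of interest (s1, s2 vs. s3, s4);
# GENE_B, standing in for a list derived from another study, does not.
expression = {
    "GENE_A": {"s1": 1.0, "s2": 1.2, "s3": 5.0, "s4": 5.1},
    "GENE_B": {"s1": 3.0, "s2": 7.0, "s3": 3.1, "s4": 7.2},
}
own_list_clusters = cluster_samples(expression, ["GENE_A"])
swapped_list_clusters = cluster_samples(expression, ["GENE_B"])
```

With the data set's own marker, samples split into `s1, s2` versus `s3, s4`; with the swapped list they split along a different axis, which is the kind of poor separation the slides report.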

22 Further inspection of GSE
[Figure labels: Column13: Non-responder, Responder; Column15: Leipzig, Mannheim; annotation: Batch effect]

23 Summary
- Marker Selection analysis of GEO data sets GSE21125 and GSE2535 in tranSMART yielded gene lists of 54 and 92 genes, respectively
- These lists had one gene in common: CACNA2D2, voltage-dependent calcium channel
  - Down-regulated in GSE21125 in patients with ALVD, and distinguishes ALVD from the other pathologies studied
  - Down-regulated in non-responders to imatinib
  - Null mutants in mice display seizures, cardiac abnormalities, and premature death

24 Summary
- Pathway enrichment analysis
  - GSE21125: pathways involved in cell adhesion and cytoskeleton remodeling
  - GSE2535: pathways involved in VEGF signaling and ESR1 activation
  - The data sets share one pathway, "Cytoskeleton remodeling Role of PKA in cytoskeleton reorganization", the significance of which remains to be investigated
- Hierarchical clustering analysis shows poor performance with swapped gene lists

26 How to Get Involved
- Looking for partners: data, expertise, funding, and other resources
- Steering Committee and work groups forming soon; stay tuned for updates
- No membership fee; funding is raised from a variety of public and private sources
- For more info: contact nbeck@reaganudall.org or see her after the talk

27 Extra Slides

28 Lessons learned
- The more attributes for the samples, the better
- The more data, the better!
- Need same tissue, same species, similar treatments, and similar measurements!

29 Current Project Activities
- Securing data sharing agreements with pharma companies
- Gathering publicly available data
- Building the ontology of Cardiac Adverse Events
- Establishing the project governance structure
- Building out the tranSMART instance
- Rancho developed a use case using GEO data sets to demonstrate utility

30 PredicTox
- Pilot project: develop a centralized knowledge base that includes publicly available clinical and molecular data related to tyrosine kinase inhibitors (TKIs), mAbs, and cardiac AEs, specifically left ventricular dysfunction
- Data goes into the tranSMART infrastructure
- Integrated knowledge base
- Mine information on biomarkers and on non-clinical and clinical screens
- Assist in hypothesis generation and mechanistic-level understanding

31 Concept of tranSMART on data level

32 Examples of Data Stored in tranSMART
- Data from clinical trials
  - Demographics, medical history
  - Treatment information
  - Clinical outcomes, including AEs
  - OMICs-type data (gene expression, proteomics, RBM, SNPs)
- Preclinical studies
  - PK/PD data
  - OMICs-type data for animal models and cell lines
  - Toxicology data

33 Concept of tranSMART
[Diagram labels: Discovery Group; Clinical Development; Preclinical Group]

34 Cytoskeleton remodeling Role of PKA in cytoskeleton reorganization

35 Conclusions
- We went over tranSMART and explored the main functionality of the platform
- We used the platform to answer a scientific question
- Great data sets from GEO were curated and loaded into this tranSMART instance; they serve as a starting point and will provide useful comparisons for new, exciting data that is yet to come
- We need more data!

36 Marker Selection vs. Signature from Smih et al.
- Tested the performance of the blood signature from the supporting reference and of the Marker Selection gene list
- Performed hierarchical clustering analysis using:
  - The Marker Selection list generated by comparing gene expression from healthy controls and ALVD patients
  - The 7-gene list from the paper

37 GSE21125 Clustering with Marker Selection List
- The Marker Selection list was able to differentiate patient samples based on pathology

38 Clustering with Smih et al. Gene List
- The list from Smih et al. performs well at clustering samples from patients with ALVD