IPA Advanced Training Course
Academia Sinica, October 2015
Gene (陳冠文), Supervisor and IPA certified analyst
Review of the Introductory Training Course
- Searching
- Building a Pathway
- Editing a Pathway for Publication
Searching
- Searching basics
- Gene/chemical search and results
- Function/disease search and results
- Drug target search and results
- Advanced search: limiting results to a molecule type, family, or subcellular location
Using the Build Tools
Grow, Path Explorer, and Connect add molecules and relationships to a pathway. Trim, Keep, and Highlight remove or highlight objects that already appear in a pathway.
Build Tools:
- Grow: adds new molecules and their relationships according to criteria the user specifies
- Path Explorer: calculates the shortest path between two molecules or two sets of molecules
- Connect: connects molecules according to criteria the user specifies
- Trim: removes molecules/relationships that meet criteria the user specifies
- Keep: keeps molecules/relationships that meet criteria the user specifies
- Add Molecule/Relationship: adds a custom molecule or relationship to the current pathway, either one that does not exist in Ingenuity's Knowledge Base or one that already does
Build and Grow Networks of Molecules
Example: Grow upstream from AKT1 to kinases and phosphatases
Agenda
A. Data Upload and How to Run a Core Analysis
B. Functional Interpretation in IPA
   Hands-on Exercises
C. Comparison Analyses
D. Using IPA to Explore microRNA Impacts on Molecular Mechanisms of Disease
E. Q&A
Introduction to Data Upload and Analysis
Why do we run an analysis? Ingenuity's analyses return:
- The relevant functions associated with the uploaded data
- Affected signaling and metabolic pathways associated with the uploaded data
- Networks of interactions among the uploaded molecules as well as related molecules
What types of analyses does IPA have?
- Core Analysis
- Tox Analysis
- Metabolomics Analysis
All IPA analyses use the same algorithms and return the same information; only the order in which the results are presented differs.
Workflow for Dataset Analysis
- Input: genomic, exon, miRNA, SNP, and protein arrays; any molecule lists; other proteomic and metabolomic assays
- IPA: identify functions, diseases, and canonical pathways associated with your data
General Analysis Workflow in IPA
[Workflow diagram. Labels shown: Upload Data; Run Core Analysis; Pathways (overlay); Functional Effects; Transcription Regulators; Research Genes of Interest; Save; Export; Experiment approval; IPA User platform]
Key Terminology
- Observation: an experimental condition such as a time point, disease subtype, or compound concentration
- Expression Value: a numerical value indicating the level of expression, significance, or other assay result for a specific identifier (gene, RNA, protein, or chemical)
- Reference Set: the set of molecules used as the universe when calculating the statistical relevance of biological functions and pathways with respect to a dataset file. It can be the user's dataset or the molecules in Ingenuity's Knowledge Base (genes, endogenous chemicals, or both).
- Focus Molecule: a molecule from the uploaded list that passes the applied filters and is available for generating networks
Setting Up a Dataset
[Spreadsheet screenshot. Column groups shown: ID; Replicates; Average; Other observations (Comparison)]
Best Practices for Dataset Analysis
- Calculate averages and p-values for replicate samples outside of IPA
- Create an Excel spreadsheet
  - One column must have identifiers
  - Up to 20 observations
  - Up to 3 expression value types per observation
  - Only 1 header row
- Set a cutoff value for each expression value type used
  - For large datasets, use a p-value and another expression value type
  - Check the number of Molecules Eligible for Network generation
  - Cutoffs depend on the confidence in the values, but many users apply a fold change of 1.5 and -1.5 and a p-value of 0.01
  - Use the Recalculate button to refresh the screen
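To make the cutoff rule concrete, here is a minimal sketch of the same filter applied outside IPA. The record layout, the names `Measurement` and `apply_cutoffs`, and the gene identifiers are hypothetical, and treating both cutoffs as inclusive is an assumption; IPA applies its own cutoffs in the analysis settings screen.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    identifier: str     # value from the identifier column (hypothetical IDs below)
    fold_change: float  # signed fold change, e.g. 2.3 (up) or -1.7 (down)
    p_value: float      # p-value computed from the replicates outside IPA


def passes_cutoffs(m, fc_cutoff=1.5, p_cutoff=0.01):
    # Keep a row if it changes by at least the fold-change cutoff in either
    # direction AND is at or below the p-value cutoff (inclusive: an assumption).
    return abs(m.fold_change) >= fc_cutoff and m.p_value <= p_cutoff


def apply_cutoffs(rows, fc_cutoff=1.5, p_cutoff=0.01):
    return [m for m in rows if passes_cutoffs(m, fc_cutoff, p_cutoff)]


rows = [
    Measurement("GENE_A", 2.3, 0.001),   # passes both cutoffs
    Measurement("GENE_B", -1.7, 0.004),  # passes both cutoffs (down-regulated)
    Measurement("GENE_C", 1.2, 0.0005),  # fails the fold-change cutoff
    Measurement("GENE_D", -3.0, 0.2),    # fails the p-value cutoff
]

kept = apply_cutoffs(rows)
print([m.identifier for m in kept])
```

Loosening or tightening `fc_cutoff` and `p_cutoff` changes how many rows survive, which is the same trade-off as checking the number of molecules eligible for network generation after each recalculation.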
Example Dataset Format for Analysis
Chronic obstructive pulmonary disease (COPD)
- Observation 1: Smokers vs. NonSmokers
- Observation 2: Early COPD vs. NonSmokers
- Observation 3: COPD vs. NonSmokers
Live Demo
Functional Analysis
- Functions analysis: identifies which biological processes, diseases, or toxicological functions are affected in the experiment.
- Canonical Pathways: lists the canonical pathways that your experimental dataset may be involved in.
- Upstream Analysis: identifies the upstream regulators that may be responsible for the gene expression changes observed in your experimental dataset.
- Networks: collections of interconnected molecules assembled by a network algorithm.
Network Types in IPA
[Diagram of network types. Panels shown: Upstream Analysis; Mechanistic Network of Upstream Regulators; Function Analysis; Regulator Effect Network; Any Interaction Network. Node labels shown: Upstream Regulator; Other upstream regulators; Dataset Molecules; Diseases / functions]
Interpret Downstream Biological Functions
Identify over-represented biological functions and predict whether those functions are increased or decreased in the experiment.
Downstream Effect on Biological Function
[Screenshot callouts:]
- Size of the square
- Color by and scale
- Toggle to the bar chart
- Click on a square to drill down within that function
Downstream Effect on Biological Function
[Screenshot callouts:]
- Ontology levels
- Click to see the specific genes and findings
Disease and Molecule Relationships
This functionality helps you understand causal connections between molecules and diseases.
- Interactively explore causality between molecules and diseases, functions, or phenotypes from a network or My Pathway
- Visualize the impact of genes on diseases or biological functions in Downstream Effects Analysis
The Disease or Function View provides details about a disease or biological function, such as the molecules associated with it, known drug targets, drugs known to target those molecules, and more.
Canonical Pathways
Understanding the biology of your data in an established signaling and metabolic context
How to Use the Molecule Activity Predictor (MAP)
- Turn the prediction on or off
- Set whether predictions should flow downstream, upstream, or both
- Use your dataset or analysis to set the activation state of the molecules
Upstream Regulator Analysis: How does it work?
- Uses experimentally observed relationships (as opposed to predicted ones) between upstream regulators and genes to identify potential regulators and their activation state
- Predicts activation or inhibition of a regulator to explain the gene expression changes in your dataset
- Calculates two complementary statistical measures:
  - Activation z-score
  - Overlap p-value
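The two measures can be illustrated with a small sketch. It assumes the simplest unweighted form of an activation z-score, (N+ - N-) / sqrt(N), and a right-tailed Fisher's exact test for the overlap p-value; IPA's own calculation may add weighting or bias corrections that are not shown here. The function names and all counts are hypothetical.

```python
from math import comb, sqrt


def activation_z_score(signs):
    """Unweighted z-score for one regulator.

    signs holds one entry per dataset gene regulated by the regulator:
    +1 if the observed change is consistent with the regulator being
    activated, -1 if it is consistent with the regulator being inhibited.
    """
    n = len(signs)
    if n == 0:
        raise ValueError("at least one target gene is required")
    return sum(signs) / sqrt(n)


def overlap_p_value(n_overlap, n_targets, n_dataset, n_reference):
    """Right-tailed Fisher's exact test (hypergeometric upper tail).

    Probability of seeing at least n_overlap of the regulator's n_targets
    among n_dataset genes drawn from a reference set of n_reference genes.
    """
    upper = min(n_targets, n_dataset)
    total = comb(n_reference, n_dataset)
    tail = sum(
        comb(n_targets, k) * comb(n_reference - n_targets, n_dataset - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical regulator: 8 targets consistent with activation, 2 with inhibition.
z = activation_z_score([1] * 8 + [-1] * 2)
# Hypothetical overlap: 10 of the regulator's 50 targets appear among
# 200 dataset genes, with a reference set of 20,000 genes.
p = overlap_p_value(10, 50, 200, 20000)
print(f"z-score = {z:.2f}, overlap p-value = {p:.3g}")
```

The two measures answer different questions: the p-value asks whether the regulator's targets are over-represented in the dataset regardless of direction, while the z-score asks whether the directions of change agree with one activation state.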
Mechanistic Network Algorithm
The algorithm seeks large overlaps between an upstream regulator's targets and a more downstream regulator's targets.
- Shares 6 of 7 of the more downstream regulator's targets: the upstream molecule is likely to operate through this more downstream regulator
- Shares 1 of 7 of the more downstream regulator's targets: the upstream molecule is less likely to operate through this more downstream regulator
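The overlap idea can be sketched by simply counting shared targets. This is an illustration only: the target names are hypothetical placeholders, and the actual algorithm presumably assesses the overlap statistically rather than by a raw fraction.

```python
def shared_fraction(upstream_targets, downstream_targets):
    """Fraction of the downstream regulator's targets shared by the upstream one."""
    downstream = set(downstream_targets)
    if not downstream:
        raise ValueError("downstream regulator has no targets")
    return len(set(upstream_targets) & downstream) / len(downstream)


# Hypothetical downstream regulator with 7 targets.
downstream = {"t1", "t2", "t3", "t4", "t5", "t6", "t7"}

# Upstream regulator sharing 6 of the 7 targets (plus one unrelated target).
high = shared_fraction({"t1", "t2", "t3", "t4", "t5", "t6", "u1"}, downstream)
# Upstream regulator sharing only 1 of the 7 targets.
low = shared_fraction({"t1", "u1", "u2", "u3"}, downstream)

print(f"6 of 7 shared: {high:.2f}; 1 of 7 shared: {low:.2f}")
```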
Concept of Regulator Effects (Spring 2014)
Hypotheses for how activated or inhibited upstream regulators cause downstream effects on biology.
- Inputs: Upstream Regulators and Downstream Effects Analysis
- Simplest Regulator Effects result: upstream regulator, molecules in the dataset, disease or function
- First iteration of the algorithm: displays a relationship between the regulator and the disease/function if it exists
- Causally consistent networks score higher
- The algorithm runs iteratively to merge additional regulators with diseases and functions
Networks in IPA
Purpose: to show as many interactions as possible between user-specified molecules in a given dataset, and how they might work together at the molecular level.
Why are Ingenuity networks biologically interesting?
- Highly interconnected networks are likely to represent significant biological function
- Networks involve molecules you don't see in your dataset. This allows genes you have assayed to be linked to metabolites and chemicals that you couldn't have assayed for, implying a meaningful regulatory network.
How Networks Are Generated
1. Focus molecules are seeds
2. Focus molecules with the most interactions to other focus molecules are then connected together to form a network
3. Non-focus molecules from the dataset are then added
4. Molecules from Ingenuity's Knowledge Base are added
5. Resulting networks are scored and then sorted based on the score
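Steps 1 and 2 can be pictured with a small sketch that ranks focus molecules by how many other focus molecules they interact with. This is only a loose illustration of the seeding idea, not IPA's network algorithm; the function name, molecule names, and the interaction map are hypothetical.

```python
def rank_focus_molecules(focus, interactions):
    """Order focus molecules by number of interactions with other focus molecules.

    interactions maps each molecule to the set of molecules it interacts with.
    Ties are broken alphabetically so the result is deterministic.
    """
    focus = set(focus)
    degree = {
        m: len(interactions.get(m, set()) & (focus - {m}))
        for m in focus
    }
    return sorted(focus, key=lambda m: (-degree[m], m))


# Hypothetical interaction map; "X" is not a focus molecule.
interactions = {
    "A": {"B", "C", "X"},
    "B": {"A"},
    "C": {"A"},
    "D": set(),
}

order = rank_focus_molecules(["A", "B", "C", "D"], interactions)
print(order)  # most interconnected focus molecules come first
```

In this toy example the interaction with "X" does not count toward the ranking of "A", because only interactions among focus molecules are considered at this stage.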
Live Demo
Hands-on Exercises I
1. Upload a dataset into IPA. You may use your own, or we can provide you with an example.
2. What is the top function associated with your dataset?
3. How can you find out what main functions a Canonical Pathway (or group of Canonical Pathways) is involved in?
4. What are the functions of the top network in this analysis?
Bringing Together Multiple Types of Genomic Data
Research aim: to attain a systems biology understanding of your research by bringing multiple types of genomic data together (SNP, CNA, mRNA, microRNA, proteomics, etc.).
Challenge:
- Different data types measure different molecular states in the experiment
- Too much data; some data types may have extra noise (e.g. arrays)
- Venn diagram-type comparisons exclude "A affects B" information
Solution:
- Identify phenotypes, disease associations, and pathways that are common themes across multiple data types using Comparison Analysis
- Interactive pathways overlay multiple data types and find upstream or downstream genes that change in the various data types
- Pathway tools find regulatory connections between molecules of interest and the various data types
- The microRNA Target Filter can link microRNAs and targets from miRNA and target datasets
How do you integrate multiple data types now?
- Single Experiment: time course, dose response
- Multi Experiment: systems biology, combining SNP, CNA, mRNA, microRNA, proteomics, etc.
- Set Analysis: exploring common molecules across one or more experiments
Core Comparison Analysis
IPA: A Point of Data Integration
Data types brought into IPA for biological interpretation: mutations, CNA/CNV, mRNA expression, methylation, ChIP-Seq, miRNA expression, phosphorylation, and protein expression.
Example of Core Analysis with 3 Data Types
Mutations
  File name: GBM paper mutation data
  ID: Gene Symbol
  Observation 1: frequency of nonsilent mutation across samples [Pct. Sample/Other]
  Core Analysis cutoff: frequency of mutation 2%
CNAs
  File name: GBM paper CNA
  ID: Gene Symbol
  Observation 1: frequency of CNA across samples [Pct/Other], increase or decrease in copy number [Amp/Other], and [q-value/p-value]
  Core Analysis cutoff: p-value < 0.05
mRNAs
  File name: GBM vs Norm Expression
  ID: Gene Symbol
  Observation 1: log2 ratio change, p-value
  Core Analysis cutoff: log ratio 1.5
Keep in mind:
- Set the same Reference Set across the 3 core analyses
- Check the expression value type used for coloring the nodes
What do you want out of this comparison?
Review your workflow. What are your goals?
- mRNA data: Core Analysis
- CNA data: Core Analysis
- Mutation data: Core Analysis
- Next step: Comparison Analysis? Pathways? Export? References? Lists?
Comparison of Functions for 3 Datasets
1. Sorted by the first data type (mRNA); re-order or review the whole table for Functions significant for other data types
2. Look for Functions common to mRNA, CNA, and mutations from glioblastoma
3. The table may be customized or exported
Genes Overlap
[Venn diagram of three gene sets: Gene Exp, CNA, Mutated. Counts shown in the figure: 2282, 47, 29, 1, 18, 16]
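A Venn-type overlap like this one is easy to reproduce on exported gene lists with set operations. The sketch below uses hypothetical placeholder gene names, not the glioblastoma data, and the function name `venn_regions` is an assumption for illustration.

```python
def venn_regions(expr, cna, mutated):
    """Split three gene lists into the seven regions of a three-set Venn diagram."""
    expr, cna, mutated = set(expr), set(cna), set(mutated)
    return {
        "expr_only": expr - cna - mutated,
        "cna_only": cna - expr - mutated,
        "mutated_only": mutated - expr - cna,
        "expr_and_cna": (expr & cna) - mutated,
        "expr_and_mutated": (expr & mutated) - cna,
        "cna_and_mutated": (cna & mutated) - expr,
        "all_three": expr & cna & mutated,
    }


# Hypothetical gene lists exported from three analyses.
expr = ["G1", "G2", "G3", "G4"]
cna = ["G3", "G4", "G5"]
mutated = ["G4", "G6"]

regions = venn_regions(expr, cna, mutated)
for name, genes in regions.items():
    print(name, sorted(genes))
```

As noted earlier in the deck, this kind of comparison only shows shared membership; it says nothing about whether one molecule affects another, which is where the pathway tools come in.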
Compare Tool
Live Demo
Filter Datasets for Biomarkers or miRNA Targets
[Filtering funnel. Stages shown: miRNA data; miRNA Target Filter; Molecule Type; Pathways (Cancer/Growth); mRNA? Counts shown: 88 data points, then 13,690, 1,090, 333, 39, and 32 targets]
Use Pathway tools to build hypotheses for microRNA-to-mRNA target associations.
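The funnel can be thought of as a chain of successive filters, each narrowing the list of predicted targets. The sketch below is a hypothetical illustration with made-up records and field names; it does not reproduce the microRNA Target Filter's actual criteria or the counts on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class PredictedTarget:
    mirna: str
    gene: str
    molecule_type: str
    pathways: set = field(default_factory=set)


def filter_funnel(targets, molecule_types, pathway_terms, measured_mrnas):
    """Apply three filters in order and record how many targets survive each."""
    counts = [("all predicted targets", len(targets))]

    kept = [t for t in targets if t.molecule_type in molecule_types]
    counts.append(("molecule type", len(kept)))

    kept = [t for t in kept if t.pathways & pathway_terms]
    counts.append(("pathways", len(kept)))

    kept = [t for t in kept if t.gene in measured_mrnas]
    counts.append(("present in mRNA data", len(kept)))

    return kept, counts


# Hypothetical predicted targets for two microRNAs.
targets = [
    PredictedTarget("miR-X", "GENE_A", "kinase", {"cancer"}),
    PredictedTarget("miR-X", "GENE_B", "enzyme", {"growth"}),
    PredictedTarget("miR-Y", "GENE_C", "kinase", {"metabolism"}),
    PredictedTarget("miR-Y", "GENE_D", "kinase", {"growth"}),
]

kept, counts = filter_funnel(
    targets,
    molecule_types={"kinase"},
    pathway_terms={"cancer", "growth"},
    measured_mrnas={"GENE_A"},
)
for stage, n in counts:
    print(f"{stage}: {n}")
```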
Live Demo
Hands-on Exercises II
Overall exercises: use the COPD analysis results from Exercises I.
1. What is the observed effect on the Xenobiotic Metabolism Signaling Canonical Pathway in the Early COPD group?
2. In the COPD group, focus on the function Cellular Movement. Select these genes and add them to a new My Pathway in your IPA account. How many of the proteins in this pathway are enzymes?
3. In the Upstream Regulators section of the Early COPD vs. NonSmokers observation, filter the Molecule Type to transcription factors only. Which molecule is predicted to be inhibited with the lowest z-score?
Hands-on Exercises II (cont.)
Overall exercises:
5. In studies of nicotine metabolism in smokers, it has been estimated that 70% of a nicotine dose is metabolized to cotinine. Which group shows the strongest effect on the Nicotine Degradation pathway?
6. In the Upstream Regulators section of the observations, which molecule is predicted to be activated in both the early and late COPD groups?
Q&A
Contact Us
Office: +886-2-2795-1777#3014
Fax: +886-2-2793-8009 EXT 1022
My E-mail: Genechen@gga.asia
MSC Support: msc-support@gga.asia