Inference of predictive gene interaction networks Benjamin Haibe-Kains DFCI/HSPH December 15, 2011 The Computational Biology and Functional Genomics Laboratory at the Dana-Farber Cancer Institute and Harvard School of Public Health Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 1 / 12
Background Phenotypes result from biological networks, not individual genes New biotechnologies allow us to analyze multiple genes in parallel: next generation sequencing gene expression profiling Chip-seq... Understand the complex interactions between genes and the behavior of a network is fundamental to bring new biological insights to make useful predictions Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 2 / 12
Relevance Aim: Infer reliable predictive gene interaction networks from gene expression data Beyond biological understanding, such networks would be efficient tools for: predicting the response of an organism (cancer patient) to perturbations (targeted therapies) identifying the key genes to target for significantly decreasing a pathway activity optimizing combination of drugs and therapy regiments Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 3 / 12
Gene Interaction Network Genes are represented as nodes gene h gene b gene f Interactions are represented by edges gene a gene c Edges can be directed to show causal interactions gene d Edges are not necessarily direct interactions gene g gene e Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 4 / 12
Gene Interaction Network and Perturbation High expression gene h gene b gene f gene h gene b gene e Low expression gene a gene c Perturbation gene a gene c gene d gene d gene g gene e gene g gene e Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 5 / 12
Challenges in network inference... and how we address them 1 Problem complexity: seed the search for the best network by using prior biological knowledge about gene interactions Predictive Networks web application 2 Curse of dimensionality: development of a local regression-based network inference to enable analysis of hundreds of genes in parallel 3 Lack of validation: development of performance criteria to assess the quality of network models 4 Lack of software: implementation of network inference methods and related tools in R predictionet package Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Challenges in network inference... and how we address them 1 Problem complexity: seed the search for the best network by using prior biological knowledge about gene interactions Predictive Networks web application 2 Curse of dimensionality: development of a local regression-based network inference to enable analysis of hundreds of genes in parallel 3 Lack of validation: development of performance criteria to assess the quality of network models 4 Lack of software: implementation of network inference methods and related tools in R predictionet package Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Challenges in network inference... and how we address them 1 Problem complexity: seed the search for the best network by using prior biological knowledge about gene interactions Predictive Networks web application 2 Curse of dimensionality: development of a local regression-based network inference to enable analysis of hundreds of genes in parallel 3 Lack of validation: development of performance criteria to assess the quality of network models 4 Lack of software: implementation of network inference methods and related tools in R predictionet package Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Challenges in network inference... and how we address them 1 Problem complexity: seed the search for the best network by using prior biological knowledge about gene interactions Predictive Networks web application 2 Curse of dimensionality: development of a local regression-based network inference to enable analysis of hundreds of genes in parallel 3 Lack of validation: development of performance criteria to assess the quality of network models 4 Lack of software: implementation of network inference methods and related tools in R predictionet package Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 6 / 12
Predictive Networks web application https://compbio.dfci.harvard.edu/predictivenetworks/ Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 7 / 12
Regression-based network inference Priors Gene expression data MRMR Predictive Networks web application Undirected graph causality Inference Directed graph Directed graph Network topology (linear) regression Predictive models Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 8 / 12
Performance criteria for network inference We implement a cross-validation framework to assess: edge-specific stability: what are the interactions inferred in most of the cross-validation folds? gene-specific prediction score: what are the genes whose expression can be well predicted by their parent/source genes? Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 9 / 12
Predictionet R package https://github.com/bhaibeka/predictionet Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 10 / 12
Acknowledgements Computational Biology and Functional Genomics Laboratory Dana-Farber Cancer Institute, USA Renee Rubio Kathleen Fleming Niall Prendergast Raktim Sinha Daniel Schlauch Alejandro Quiroz Megha Padi Stefan Bentink John Quackenbush Machine Learning Group Université Libre de Bruxelles, Belgium Catharina Olsen Gianluca Bontempi Ontario Cancer Institute Universty of Toronto, Canada Entagen Entagen, USA Christopher Bouton Erik Bakke James Hardwick Amira Djebbari Benjamin Haibe-Kains (DFCI/HSPH) R/Bioconductor Course December 15, 2011 11 / 12