Every object that biology studies is a system of systems (1974) Francois Jacob (1965 Nobel Prize in Medicine)


2 Every object that biology studies is a system of systems (1974) Francois Jacob (1965 Nobel Prize in Medicine)

3 Reductionism vs Holism
"The whole is something over and above its parts and not just the sum of them all" (Aristotle, 1946)

4 Life's Complexity Pyramid
Information storage, processing, and execution lie at different levels of organization

5 High-throughput Data

6 Systems Medicine
- A systems approach to health and disease (Hood)
- A disease is rarely the outcome of a single abnormal gene or gene product; it usually involves a cohort of genes/proteins that interact in a complex network
- Paul O'Shea: Future medicine shaped by an interdisciplinary new biology

7 A Simple Example
A mutation leads to a malfunctioning gene regulatory network (Current Opinion in Biotechnology, 2010, 21: )

8 Systems Medicine
Proactive P4 (predictive/preventive/personalized/participatory) medicine (L. Hood and M. Flores, 2012):
- Provides deep insights into disease mechanisms
- Makes it possible to view health and disease at the level of the individual
- Stratifies complex diseases into distinct subtypes that can be matched against proper drugs
- Provides new approaches to drug discovery
- Generates metrics for assessing wellness

9 Outline
- Network Constructions
- Networks and Human Diseases
  - Network properties & diseases
  - Prediction of disease genes
  - Other applications
- Opportunities and Challenges

10 Network Constructions

11 Simplified Biological Systems
Abstract representations of biological systems
- Nodes: RNA, genes, proteins, metabolites, etc.
- Edges: physical, biochemical, and functional interactions
Alon, Science 301,

12 Network Types
- Gene Regulatory Networks
- Protein-protein Interaction Networks
- Metabolic Networks
- Transcriptional Profiling Networks
- Phenotypic Profiling Networks
- Others

13 Network Constructions
- High-throughput experiments
  - Large scale versus accuracy
- Manual curation
  - Existing information from the literature
  - Depends on the quality and quantity of the published data
- Computational methods
  - Large scale; take advantage of existing knowledge
  - Capable of dealing with noise

14 Examples
Networks in Cellular Systems
Vidal, Cusick, and Barabasi, Cell 144,

15 Gene Regulatory Networks
- Nodes: TFs or DNA regulatory elements
- Edges: physical (functional) binding (directional)

16 Modeling GRN
Biological Knowledge

17 Modeling GRNs
- Hypothesis-driven approaches: assume the network structure is known (e.g., biochemically driven models, neural network models)
- Clustering: assumes co-expression is caused by co-regulation; hierarchical, k-means, biclustering, etc.
- Ordinary differential equations
- Boolean networks
- Bayesian networks
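To make the Boolean network formalism listed above concrete, here is a minimal sketch. The three genes A, B, C and their update rules are hypothetical and chosen only for illustration; they do not come from the slides.

```python
# Hypothetical 3-gene Boolean network with synchronous updates.
# The regulatory rules below are illustrative assumptions.

def step(state):
    """Apply one synchronous update to the (A, B, C) state."""
    a, b, c = state
    return (
        not c,      # A is repressed by C
        a,          # B is activated by A
        a and b,    # C requires both A and B
    )

def trajectory(state, n_steps):
    """Return the list of states visited, starting from `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states

traj = trajectory((True, False, False), 6)
for t, s in enumerate(traj):
    print(t, s)
```

The same initial state always yields the same trajectory, so a model of this kind cannot represent stochastic single-cell behavior or measurement noise.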

18 Bayesian Networks
- Data are noisy
  - The behavior of GRNs at the single-cell level is fundamentally stochastic (e.g., Computational Modeling of GRNs by Hamid Bolouri)
  - Optical readout noise
- ODEs and Boolean networks are inherently deterministic
- The sound probabilistic semantics allow BNs to deal with noise
- BNs can handle missing data and tolerate incomplete knowledge about the biological system
- BNs can integrate prior biological knowledge into the system

19 Challenges
- High dimensionality
  - The genome-wide structure learning task is NP-hard
  - Dimensionality reduction is needed
- Small sample size
  - Several tens to hundreds (typically < 200)
  - Model overfitting
- Our solution
  - HMM-based model for dimensionality reduction
  - Integration of constraint-based and scoring-based structure learning methods for network reconstruction

20 Computational System
Dimensionality reduction followed by structure learning

21 Constructing an Undirected Network
Integration of mutual information and graph theory for complete connectivity (e.g., path matrix based approach)
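The mutual information part of this step can be sketched as follows. This is only an illustration under stated assumptions: expression values are already discretized, the threshold and the toy profiles are made up, and the graph-theoretic connectivity step (path matrix) mentioned on the slide is not covered.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )

def undirected_edges(profiles, threshold):
    """Connect every gene pair whose mutual information exceeds the threshold."""
    return {
        (g, h)
        for g, h in combinations(sorted(profiles), 2)
        if mutual_information(profiles[g], profiles[h]) > threshold
    }

# Toy discretized expression profiles (0 = low, 1 = high); values are made up.
profiles = {
    "A": [0, 0, 1, 1, 0, 0, 1, 1],
    "B": [0, 0, 1, 1, 0, 0, 1, 1],  # identical to A
    "C": [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of A and B
}
edges = undirected_edges(profiles, threshold=0.5)
print(edges)
```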

22 Refine the Network
Use d-separation and the Markov independence criterion to evaluate the edges (remove and add edges). For example, for a triangle loop over A, B, and C:
- $P(A \mid C) = P(A \mid B, C)$: test after deleting A-B
- $P(A \mid B) = P(A \mid C, B)$: test after deleting A-C
- $P(B \mid A) = P(B \mid C, A)$: test after deleting B-C
If the conditional independence test fails for all the deletions, either the triangle loop is authentic or other paths exist from one node to another. For example, assume that a triangle loop A-B-C is identified in the UDS, while in the true sub-network the edge between C and B (dashed line in the figure) does not exist and another node D is involved.
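One common way to carry out a conditional independence test of this kind on discrete data is to estimate the conditional mutual information and compare it with a small cutoff. The sketch below assumes that approach; the estimator, the cutoff `eps`, and the toy counts are illustrative choices, not the procedure used by the authors.

```python
import math
from collections import Counter

def conditional_mi(x, y, z):
    """Estimate I(X; Y | Z) in bits from discrete sequences."""
    n = len(x)
    pz = Counter(z)
    pxz = Counter(zip(x, z))
    pyz = Counter(zip(y, z))
    pxyz = Counter(zip(x, y, z))
    return sum(
        (c / n) * math.log2(c * pz[k] / (pxz[(a, k)] * pyz[(b, k)]))
        for (a, b, k), c in pxyz.items()
    )

def edge_is_redundant(x, y, z, eps=1e-3):
    """True when X and Y look independent given Z, so the X-Y edge can go."""
    return conditional_mi(x, y, z) < eps

def expand(counts):
    """Turn {(a, b, c): count} into a flat list of samples."""
    return [key for key, c in counts.items() for _ in range(c)]

# Made-up samples of (A, B, C) in which C drives both A and B,
# and A and B are independent once C is known.
samples = expand({
    (0, 0, 0): 4, (0, 1, 0): 2, (1, 0, 0): 2, (1, 1, 0): 1,
    (1, 1, 1): 4, (1, 0, 1): 2, (0, 1, 1): 2, (0, 0, 1): 1,
})
A, B, C = zip(*samples)

print(edge_is_redundant(A, B, C))  # test after deleting A-B
print(edge_is_redundant(A, C, B))  # test after deleting A-C
```

In this toy case only the A-B edge passes the deletion test, which matches a structure where C is a common cause of A and B.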

23 Direction Assignment
Uses graph theory, d-separation, and BIC score maximization (exhaustive methods on sub-networks, e.g., finding loops with four or five nodes), coming edges, etc.
Computational complexity: $O(n^4) + O(mn^2)$, where m is the number of samples and n is the number of nodes
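As a rough sketch of how BIC scoring can choose between candidate orientations of a small sub-network, the code below scores two hypothetical structures over three binary variables. It uses the standard BIC for discrete Bayesian networks (log-likelihood minus half the parameter count times log m); the data, the two candidate structures, and the function names are assumptions for illustration and not the authors' implementation.

```python
import math
from collections import Counter

def bic_family(rows, child, parents):
    """BIC contribution of one binary node given its parent set."""
    m = len(rows)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    loglik = sum(
        c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items()
    )
    n_params = 2 ** len(parents)  # one free parameter per parent configuration
    return loglik - 0.5 * math.log(m) * n_params

def bic(rows, structure):
    """Total BIC of a structure given as {child: [parents]}."""
    return sum(bic_family(rows, c, ps) for c, ps in structure.items())

# Made-up data: C is fully determined by A and B together (C = A xor B).
rows = [{"A": a, "B": b, "C": a ^ b} for a in (0, 1) for b in (0, 1)] * 10

collider = {"A": [], "B": [], "C": ["A", "B"]}   # A -> C <- B
chain = {"A": [], "C": ["A"], "B": ["C"]}        # A -> C -> B

best = max((collider, chain), key=lambda s: bic(rows, s))
print(bic(rows, collider), bic(rows, chain))
```

With only a handful of nodes the candidate orientations can be enumerated exhaustively, which is why this kind of scoring is applied to sub-networks rather than to the whole graph.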

24 Structure Learning
- X. Chen, et al., Improving Bayesian Network Structure Learning with Mutual Information-based Node Ordering in the K2 Algorithm, IEEE Transactions on Knowledge and Data Engineering, vol. 20(5): ,
- X. Chen, et al., An effective structure learning method for constructing gene networks, Bioinformatics, 22(11): ,

25 Experimental Results
BN built with our algorithm for yeast cell cycle-related genes. A total of 20 nodes (genes) and 34 edges (interactions) were incorporated. This network captured 65% of all currently reported direct and indirect interactions among these genes.
[Figure: network over the genes POL30, CDC45, RFA3, MSH6, MSH2, HPR5, PRI1, POL2, POL1, PDS1, RAD53, ASF1, MCD1, PRI2, RAD54, POL12, DPB2, CLB5, PMS1, CLB6]

26 Protein-protein Interaction Networks
- Nodes: proteins
- Edges: physical interactions, complexes

27 Model Organisms
- Jeong, Mason, Barabasi, and Oltvai, Nature, 411, 3 May,
- Rual et al., Nature, 437(20), Oct. 2005

28 In Silico Methods
- Sequence-based methods
  - Rosetta Stone Method
  - Co-evolution Method
  - Physicochemical Properties Method
- Domain-based methods
  - Association Method
  - MLE Method
  - Random Forest Method
- Integrative methods
  - Decision Tree
  - Logistic Regression
  - Bayesian Network

29 Domain-based Approaches
[Figure: proteins p1 to p4, each composed of domains drawn from d1 to d5, showing protein-protein interactions and the underlying domain-domain interactions]
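The idea behind domain-based prediction can be sketched with a simple association-style score: for each domain pair, the fraction of protein pairs containing it that are known to interact. The domain compositions and interactions below are hypothetical (they are not the ones in the slide's figure), and the scoring is a simplified illustration rather than any specific published method.

```python
from itertools import combinations, product

def association_scores(domains, interactions):
    """Score each domain pair by the share of protein pairs containing it that interact."""
    interacting = {frozenset(p) for p in interactions}
    seen, hits = {}, {}
    for p, q in combinations(sorted(domains), 2):
        # All domain pairs that this protein pair could contribute.
        pairs = {frozenset((a, b)) for a, b in product(domains[p], domains[q])}
        for dp in pairs:
            seen[dp] = seen.get(dp, 0) + 1
            if frozenset((p, q)) in interacting:
                hits[dp] = hits.get(dp, 0) + 1
    return {dp: hits.get(dp, 0) / n for dp, n in seen.items()}

# Hypothetical domain composition and known interactions.
domains = {
    "p1": {"d1", "d2"},
    "p2": {"d3"},
    "p3": {"d2"},
    "p4": {"d4"},
}
interactions = [("p1", "p2"), ("p3", "p2")]

scores = association_scores(domains, interactions)
for pair, score in sorted(scores.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), score)
```

A high score for a domain pair such as (d2, d3) can then be used to predict that other proteins carrying those domains are likely to interact.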

30 RFD Methods
X. Chen and M. Liu, Prediction of Protein-protein Interactions Using Random Decision Forest Framework, Bioinformatics, 21(24): ,

31 Extended Method
- X. Chen, M. Liu, and R. Ward, Protein Function Assignment through Mining Cross-Species Protein-protein Interactions, PLoS ONE, 3(2): e1562, 2008
- X. Chen and J. Jeong, Sequence-based Prediction of Protein Interaction Sites with an Integrative Method, Bioinformatics, 25(5): , 2009
- M. Liu, X. Chen and R. Jothi, Knowledge-Guided Inference of Domain-Domain Interactions from Incomplete Protein-Protein Interaction Networks, Bioinformatics, 25(19): ,

32 KUPS
[Figures: structure of KUPS; workflow diagram]
URL:

33 KUPS
X. Chen, J. Jeong, and P. Dermyer, KUPS: Constructing datasets of interacting and non-interacting protein pairs with associated attributes, Nucleic Acids Research, 2011, Jan; 39:D750-4
- Purpose
  - Providing high-quality interacting protein pairs (IPPs) and non-interacting pairs (NIPs) for researchers
- Services
  - IPPs from three manually curated PPI databases (i.e., IntAct, MINT and HPRD)
  - NIPs calculated with four different methods
  - Test set collections
  - Benchmark datasets

34 Metabolic Networks
- Nodes: metabolites
- Edges: biochemical reactions (or the enzymes that catalyze them)
- Construction: manual curation + computational assistance
- KEGG

35 Genotypic Profiling Networks
- Indirect interactions (functional links): proteins (genes) that function together share similar expression profiles
- Genes versus microarray, DNA chip, de novo RNA sequencing, etc.
- Correlation + threshold, e.g., Pearson correlation coefficient
- Stuart et al., Science 302, 2003; Gunsalus et al., Nature 436,
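The "correlation + threshold" construction can be sketched in a few lines. The expression values, gene names, and the 0.8 cutoff are made up for illustration; the choice to keep only positively correlated pairs is also an assumption.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_edges(expr, threshold):
    """Link gene pairs whose expression profiles correlate at or above the threshold."""
    return {
        (g, h)
        for g, h in combinations(sorted(expr), 2)
        if pearson(expr[g], expr[h]) >= threshold
    }

# Hypothetical expression profiles across five conditions.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 3.9, 6.2, 8.0, 9.8],   # tracks g1 closely
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],   # no clear relation to g1
}
edges = coexpression_edges(expr, threshold=0.8)
print(edges)
```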

36 Some Other Methods
- Clustering algorithms
- Signature Algorithm (Ihmels, Bergman, et al. 2002, Nature Genet; 2003, Bioinformatics)
- Supervised Neural Networks (Alvaro et al., PNAS, 2002)
- Gene Recommender (Owen et al., Genome Research, 2003)
- Bayesian Network Biclustering (Dhollander et al., Bioinformatics, 2007)
- Other bi-clustering algorithms
- Not considering time sequence

37 Dimensionality Reduction
[Flowchart with the labels: seed HMM, rand HMM, Σ, FFR, p-val, Iterations?]

38 HMM-based Approach
A. Senf and X. Chen, Identification of Genes Involved in the Same Pathway Using a Hidden Markov Model-based Approach, Bioinformatics, 25(22): ,

39 Synthetic Data
- According to the Signature Algorithm literature
  - Initial matrix contains all zeros
  - Add a module by adding ones to the matrix
  - Randomly scale all genes and conditions
  - Add noise
- Using a trained Hidden Markov Model
  - Initial matrix contains all zeros
  - Add a module by generating observation sequences using a previously trained HMM
  - Add random values at remaining matrix positions
  - Add noise
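The first recipe (the Signature Algorithm style) can be sketched step by step. The scaling range, the Gaussian noise model, the seed, and the function name are assumptions made for this illustration; the slides do not specify them. The HMM-based recipe is not covered because it needs a previously trained model.

```python
import random

def make_signature_dataset(n_genes, n_conds, module_genes, module_conds,
                           noise_sd, seed=0):
    """Build a genes x conditions matrix with one embedded module."""
    rng = random.Random(seed)
    # 1. Initial matrix contains all zeros.
    matrix = [[0.0] * n_conds for _ in range(n_genes)]
    # 2. Add a module by adding ones to the matrix.
    for g in module_genes:
        for c in module_conds:
            matrix[g][c] += 1.0
    # 3. Randomly scale all genes (rows) and conditions (columns).
    gene_scale = [rng.uniform(0.5, 2.0) for _ in range(n_genes)]
    cond_scale = [rng.uniform(0.5, 2.0) for _ in range(n_conds)]
    # 4. Add noise (zero-mean Gaussian, assumed here).
    return [
        [matrix[g][c] * gene_scale[g] * cond_scale[c] + rng.gauss(0.0, noise_sd)
         for c in range(n_conds)]
        for g in range(n_genes)
    ]

# Small noise-free example: a 5-gene, 4-condition module in a 20 x 10 matrix.
clean = make_signature_dataset(20, 10, range(5), range(4), noise_sd=0.0)
noisy = make_signature_dataset(20, 10, range(5), range(4), noise_sd=0.5)
print(len(clean), len(clean[0]))
```

Raising `noise_sd` gives the "high amount of noise" variant of the same data set.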

40 Signature Algorithm Data Set
2600 genes, 100 experimental conditions
Embedded modules: 250 genes, 40 conditions
[Figure panels: low amount of noise; high amount of noise]

41 HMM Data Sets
500 genes, 100 experimental conditions
Embedded module: 125 genes, 40 conditions
[Figure panels: low amount of noise; high amount of noise]

42 Cell Cycle Pathway
- Found new genes functionally related to input genes
- Same functional annotations overrepresented as in input gene group

43 Phenotypic Profiling Networks
- Nodes: genes
- Edges: correlated phenotypic profiles (e.g., RNA interference (RNAi), gene knock-out)
- Giaever et al., 2002, yeast; Mohr et al., 2010, C. elegans, human, etc.
- PPNs confer PPIs (evidence in C. elegans DNA damage response, etc.)

44 Network Integration
Overlap all the networks constructed (Gunsalus et al., Nature 2005; Vidal et al., Cell 2011)

45 Network Properties & Diseases

46 Network Properties
The structure and evolution of networks (social, food, biological, etc.) share some common principles:
- Scale free
- Hubs
  - Essential genes (Jeong et al. 2001)
  - Evolved more slowly (Fraser et al., 2002)
  - Cancer proteins tend to have more interacting partners than non-cancer proteins
- Motifs
  - Subgraphs (patterns) that occur more frequently than expected
  - Tied to some biological functions
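Hubs and the degree distribution are straightforward to compute from an edge list. The sketch below uses a made-up toy network and an arbitrary degree cutoff for calling a node a hub; both are assumptions for illustration.

```python
from collections import Counter

def degrees(edges):
    """Number of interaction partners per node in an undirected network."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def hubs(edges, min_degree):
    """Nodes with at least `min_degree` partners, sorted by name."""
    return sorted(n for n, d in degrees(edges).items() if d >= min_degree)

# Toy network: one highly connected node plus a single extra edge.
edges = [("hub", x) for x in "abcde"] + [("a", "b")]
deg = degrees(edges)
degree_distribution = Counter(deg.values())  # degree -> number of nodes

print(hubs(edges, min_degree=4))
print(sorted(degree_distribution.items()))
```

In a scale-free network the degree distribution is heavy-tailed: most nodes have few partners and a small number of hubs have many.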

47 Disease Networks
- Diseases are not independent of each other
- They may be triggered by different perturbations of some highly connected biological networks
- Nodes: diseases; edges: how two diseases are linked
- Constructing a bipartite network
  - Goh et al., 2007: OMIM genes; two diseases are linked if they share disease genes whose mutations are the cause
  - Rzhetsky et al., 2007; Hidalgo et al., 2009: data mining; individuals diagnosed with disease A are often diagnosed with disease B as well (comorbidity, e.g., diabetes and obesity)
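The gene-sharing construction can be sketched as a projection of a bipartite disease-gene map onto diseases. The disease and gene names below are placeholders, not real OMIM entries, and the function name is hypothetical.

```python
from itertools import combinations

def disease_network(disease_genes):
    """Link two diseases when they share at least one associated gene."""
    links = {}
    for d1, d2 in combinations(sorted(disease_genes), 2):
        shared = disease_genes[d1] & disease_genes[d2]
        if shared:
            links[(d1, d2)] = shared
    return links

# Placeholder bipartite map: disease -> set of associated genes.
disease_genes = {
    "disease_A": {"GENE1", "GENE2"},
    "disease_B": {"GENE2", "GENE3"},
    "disease_C": {"GENE4"},
}
links = disease_network(disease_genes)
print(links)
```

The same projection applied in the other direction (genes linked when they share a disease) yields a disease-gene network.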

48 Disease Networks
Integrative analysis of cellular networks and disease networks (e.g., given the molecular defects for diseases A and B in the disease network, how will they affect other genes or other diseases?)
Barabasi et al., Nature 2011 (12)

49 Prediction of disease genes

50 Network-based Applications
The human interactome network is predictive (Lim et al., Lee et al., 2010)
- Linkage methods: direct interaction partners
- Disease module-based methods: genes in the same topological, functional, or disease modules (integrative analysis of both cellular and disease networks)
- Diffusion-based methods: random walkers start from the seed genes and move along identified pathways (closely related to the seed genes)
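A common form of the diffusion-based idea is a random walk with restart from the seed genes. The sketch below assumes that formulation; the restart probability, the toy path network, and the node names are illustrative, and the network is assumed to be undirected with no isolated nodes.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds."""
    nodes = sorted(adj)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed gene.
        nxt = {v: restart * p0[v] for v in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        if max(abs(nxt[v] - p[v]) for v in nodes) < tol:
            return nxt
        p = nxt
    return p

# Toy path network: seed - a - b - c (adjacency lists, undirected).
adj = {
    "seed": ["a"],
    "a": ["seed", "b"],
    "b": ["a", "c"],
    "c": ["b"],
}
scores = random_walk_with_restart(adj, seeds={"seed"})
ranking = sorted((v for v in adj if v != "seed"), key=scores.get, reverse=True)
print(ranking)
```

Candidate genes are then ranked by score, so genes that are closer to the seed genes in the network come out on top.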

51 Disease Modules
Nature Reviews, vol. 12,

52 Our Recent Work
Based on an integrative model
[Figure: OMIM gene network. Red circle: mitochondrial deficiency related diseases; yellow circle: protocadherin beta related diseases]

53 Our Recent Work
Newly discovered disease genes

54 SNPs-disease Associations
- Epistasis: interactive effect between two or more genetic variants
  - The primary reason for variation in common (or complex) human diseases
  - Detection of epistatic interactions can help to improve the understanding of pathogenesis and the prevention, diagnosis and treatment of complex diseases
- Challenges of epistatic interaction detection
  - Large size of genotyped data (10 million SNPs)
  - Enormous number of possible combinations of genetic factors
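To put a number on the combinatorial challenge, the short calculation below counts the candidate SNP pairs and triples for the 10 million SNPs mentioned above. It is plain arithmetic with Python's `math.comb`; only the variable names are invented here.

```python
import math

n_snps = 10_000_000

pairs = math.comb(n_snps, 2)    # candidate two-SNP interactions
triples = math.comb(n_snps, 3)  # candidate three-SNP interactions

print(f"pairs:   {pairs:.3e}")
print(f"triples: {triples:.3e}")
```

Roughly 5 x 10^13 pairs and more than 10^20 triples would have to be tested exhaustively, which is why search strategies and dimensionality reduction matter.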

55 General Methods
- Statistical methods
  - Can only be applied to small-scale analysis due to their computational complexity
- Machine learning-based methods
  - May identify the SNP set that produces the highest classification accuracy, which is not necessarily the set with the strongest association with the disease; they lack the ability to detect the causal elements
  - Tend to introduce many false positives
  - How many SNPs???

56 BN-based Approaches
- A new scoring method and search algorithm
- Age-related Macular Degeneration (AMD) dataset
  - Contains 116,204 SNPs genotyped with 96 cases and 50 controls
  - Three associated SNPs were found: Rs
  - Found with a significant association with AMD
[Figure: four panels of power versus MAF comparing BN-BnB, DASSO-MB, BEAM, SVM, and MDR under Model1 (=0.3, r^2=1), Model2 (=0.3, r^2=1), Model3 (=0.6, r^2=1), and Model4 (=7, r^2=1)]

57 Other Applications
- Network pharmacology
  - Systems-level method for drug design: reduce the search for therapeutic agents; identify potential side effects; multi-target drug strategy
  - An essential part of drug development
- Network perturbation
- Disease classification
- Personalized medicine
- Reformed healthcare (predictive/preventive)

58 Opportunities and Challenges
- Big picture: personalized medicine/healthcare, preventative medicine
- Whole genome information available
- Data sources: large scale + heterogeneous (BIG Data)
  - Still need creative algorithms and systematic approaches for data analysis (genome wide, etc.)
- Perturbation and time series analyses
- Integration: a (weighted) complete, reliable human interactome
Current Opinion Pharmacology, 2012, 12:1-6

59 Thank You


More information

From Proteomics to Systems Biology. Integration of omics - information

From Proteomics to Systems Biology. Integration of omics - information From Proteomics to Systems Biology Integration of omics - information Outline and learning objectives Omics science provides global analysis tools to study entire systems How to obtain omics - data What

More information

ECS 234: Introduction to Computational Functional Genomics ECS 234

ECS 234: Introduction to Computational Functional Genomics ECS 234 : Introduction to Computational Functional Genomics Administrativia Prof. Vladimir Filkov 3023 Kemper filkov@cs.ucdavis.edu Appts: Office Hours: Wednesday, 1:30-3p Ask me or email me any time for appt

More information

Bayesian Networks as framework for data integration

Bayesian Networks as framework for data integration Bayesian Networks as framework for data integration Jun Zhu, Ph. D. Department of Genomics and Genetic Sciences Icahn Institute of Genomics and Multiscale Biology Icahn Medical School at Mount Sinai New

More information

Reconstructing Gene Regulatory Networks from Homozygous and Heterozygous Deletion Data Using Gaussian Noise Model

Reconstructing Gene Regulatory Networks from Homozygous and Heterozygous Deletion Data Using Gaussian Noise Model 2013 First International Conference on Artificial Intelligence, Modelling & Simulation Reconstructing Gene Regulatory Networks from Homozygous and Heterozygous Deletion Data Using Gaussian Noise Model

More information

Predictive and Causal Modeling in the Health Sciences. Sisi Ma MS, MS, PhD. New York University, Center for Health Informatics and Bioinformatics

Predictive and Causal Modeling in the Health Sciences. Sisi Ma MS, MS, PhD. New York University, Center for Health Informatics and Bioinformatics Predictive and Causal Modeling in the Health Sciences Sisi Ma MS, MS, PhD. New York University, Center for Health Informatics and Bioinformatics 1 Exponentially Rapid Data Accumulation Protein Sequencing

More information

NETWORK BASED PRIORITIZATION OF DISEASE GENES

NETWORK BASED PRIORITIZATION OF DISEASE GENES NETWORK BASED PRIORITIZATION OF DISEASE GENES by MEHMET SİNAN ERTEN Submitted in partial fulfillment of the requirements for the degree of Master of Science Thesis Advisor: Mehmet Koyutürk Department of

More information

11/22/13. Proteomics, functional genomics, and systems biology. Biosciences 741: Genomics Fall, 2013 Week 11

11/22/13. Proteomics, functional genomics, and systems biology. Biosciences 741: Genomics Fall, 2013 Week 11 Proteomics, functional genomics, and systems biology Biosciences 741: Genomics Fall, 2013 Week 11 1 Figure 6.1 The future of genomics Functional Genomics The field of functional genomics represents the

More information

and Promoter Sequence Data

and Promoter Sequence Data : Combining Gene Expression and Promoter Sequence Data Outline 1. Motivation Functionally related genes cluster together genes sharing cis-elements cluster together transcriptional regulation is modular

More information

27041, Week 02. Review of Week 01

27041, Week 02. Review of Week 01 27041, Week 02 Review of Week 01 The human genome sequencing project (HGP) 2 CBS, Department of Systems Biology Systems Biology and emergent properties 3 CBS, Department of Systems Biology Different model

More information

A MACHINE LEARNING APPROACH TO QUERY TIME- SERIES MICROARRAY DATA SETS FOR FUNCTIONALLY RELATED GENES USING HIDDEN MARKOV MODELS

A MACHINE LEARNING APPROACH TO QUERY TIME- SERIES MICROARRAY DATA SETS FOR FUNCTIONALLY RELATED GENES USING HIDDEN MARKOV MODELS A MACHINE LEARNING APPROACH TO QUERY TIME- SERIES MICROARRAY DATA SETS FOR FUNCTIONALLY RELATED GENES USING HIDDEN MARKOV MODELS BY Alexander Senf Submitted to the graduate degree program in Computer Science

More information

The interactions that occur between two proteins are essential parts of biological systems.

The interactions that occur between two proteins are essential parts of biological systems. Gabor 1 Evaluation of Different Biological Data and Computational Classification Methods for Use in Protein Interaction Prediction in Signaling Pathways in Humans Yanjun Qi 1 and Judith Klein-Seetharaman

More information

System Identification methods for Reverse Engineering Gene Regulatory Networks

System Identification methods for Reverse Engineering Gene Regulatory Networks System Identification methods for Reverse Engineering Gene Regulatory Networks by Zhen Wang A thesis submitted to the School of Computing in conformity with the requirements for the degree of Master of

More information

In silico prediction of novel therapeutic targets using gene disease association data

In silico prediction of novel therapeutic targets using gene disease association data In silico prediction of novel therapeutic targets using gene disease association data, PhD, Associate GSK Fellow Scientific Leader, Computational Biology and Stats, Target Sciences GSK Big Data in Medicine

More information

ECS 234: Introduction to Computational Functional Genomics ECS 234

ECS 234: Introduction to Computational Functional Genomics ECS 234 : Introduction to Computational Functional Genomics Administrativia Prof. Vladimir Filkov 3023 Kemper filkov@cs.ucdavis.edu Appts: Office Hours: M,W, 3-4pm, and by appt. , 4 credits, CRN: 54135 http://www.cs.ucdavis.edu~/filkov/234/

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS COURSE OF BIOINFORMATICS a.a. 2016-2017 Introduction to BIOINFORMATICS What is Bioinformatics? (I) The sinergy between biology and informatics What is Bioinformatics? (II) From: http://www.bioteach.ubc.ca/bioinfo2010/

More information

Metabolic Networks. Ulf Leser and Michael Weidlich

Metabolic Networks. Ulf Leser and Michael Weidlich Metabolic Networks Ulf Leser and Michael Weidlich This Lecture Introduction Systems biology & modelling Metabolism & metabolic networks Network reconstruction Strategy & workflow Mathematical representation

More information

Smart India Hackathon

Smart India Hackathon TM Persistent and Hackathons Smart India Hackathon 2017 i4c www.i4c.co.in Digital Transformation 25% of India between age of 16-25 Our country needs audacious digital transformation to reach its potential

More information

Epigenetics and DNase-Seq

Epigenetics and DNase-Seq Epigenetics and DNase-Seq BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Anthony

More information

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008 MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

BTRY 7210: Topics in Quantitative Genomics and Genetics

BTRY 7210: Topics in Quantitative Genomics and Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics Jason Mezey Biological Statistics and Computational Biology (BSCB) Department of Genetic Medicine jgm45@cornell.edu Spring 2015, Thurs.,12:20-1:10

More information

Function Prediction of Proteins from their Sequences with BAR 3.0

Function Prediction of Proteins from their Sequences with BAR 3.0 Open Access Annals of Proteomics and Bioinformatics Short Communication Function Prediction of Proteins from their Sequences with BAR 3.0 Giuseppe Profiti 1,2, Pier Luigi Martelli 2 and Rita Casadio 2

More information

Capabilities & Services

Capabilities & Services Capabilities & Services Accelerating Research & Development Table of Contents Introduction to DHMRI 3 Services and Capabilites: Genomics 4 Proteomics & Protein Characterization 5 Metabolomics 6 In Vitro

More information

BIOINFORMATICS THE MACHINE LEARNING APPROACH

BIOINFORMATICS THE MACHINE LEARNING APPROACH 88 Proceedings of the 4 th International Conference on Informatics and Information Technology BIOINFORMATICS THE MACHINE LEARNING APPROACH A. Madevska-Bogdanova Inst, Informatics, Fac. Natural Sc. and

More information

Statistical Methods for Network Analysis of Biological Data

Statistical Methods for Network Analysis of Biological Data The Protein Interaction Workshop, 8 12 June 2015, IMS Statistical Methods for Network Analysis of Biological Data Minghua Deng, dengmh@pku.edu.cn School of Mathematical Sciences Center for Quantitative

More information

Detecting gene-gene interactions in high-throughput genotype data through a Bayesian clustering procedure

Detecting gene-gene interactions in high-throughput genotype data through a Bayesian clustering procedure Detecting gene-gene interactions in high-throughput genotype data through a Bayesian clustering procedure Sui-Pi Chen and Guan-Hua Huang Institute of Statistics National Chiao Tung University Hsinchu,

More information

Exploring the Genetic Basis of Congenital Heart Defects

Exploring the Genetic Basis of Congenital Heart Defects Exploring the Genetic Basis of Congenital Heart Defects Sanjay Siddhanti Jordan Hannel Vineeth Gangaram szsiddh@stanford.edu jfhannel@stanford.edu vineethg@stanford.edu 1 Introduction The Human Genome

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics Dr. Taysir Hassan Abdel Hamid Lecturer, Information Systems Department Faculty of Computer and Information Assiut University taysirhs@aun.edu.eg taysir_soliman@hotmail.com

More information

ChIP-seq and RNA-seq. Farhat Habib

ChIP-seq and RNA-seq. Farhat Habib ChIP-seq and RNA-seq Farhat Habib fhabib@iiserpune.ac.in Biological Goals Learn how genomes encode the diverse patterns of gene expression that define each cell type and state. Protein-DNA interactions

More information

A Propagation-based Algorithm for Inferring Gene-Disease Associations

A Propagation-based Algorithm for Inferring Gene-Disease Associations A Propagation-based Algorithm for Inferring Gene-Disease Associations Oron Vanunu Roded Sharan Abstract: A fundamental challenge in human health is the identification of diseasecausing genes. Recently,

More information

Study on the Application of Data Mining in Bioinformatics. Mingyang Yuan

Study on the Application of Data Mining in Bioinformatics. Mingyang Yuan International Conference on Mechatronics Engineering and Information Technology (ICMEIT 2016) Study on the Application of Mining in Bioinformatics Mingyang Yuan School of Science and Liberal Arts, New

More information

Functional genomics + Data mining

Functional genomics + Data mining Functional genomics + Data mining BIO337 Systems Biology / Bioinformatics Spring 2014 Edward Marcotte, Univ of Texas at Austin Edward Marcotte/Univ of Texas/BIO337/Spring 2014 Functional genomics + Data

More information

CS262 Lecture 12 Notes Single Cell Sequencing Jan. 11, 2016

CS262 Lecture 12 Notes Single Cell Sequencing Jan. 11, 2016 CS262 Lecture 12 Notes Single Cell Sequencing Jan. 11, 2016 Background A typical human cell consists of ~6 billion base pairs of DNA and ~600 million bases of mrna. It is time-consuming and expensive to

More information

LARGE-SCALE PROTEIN INTERACTOMICS. Karl Frontzek Institute of Neuropathology

LARGE-SCALE PROTEIN INTERACTOMICS. Karl Frontzek Institute of Neuropathology LARGE-SCALE PROTEIN INTERACTOMICS Karl Frontzek Institute of Neuropathology STUDYING THE INTERACTOME A yeast 2 hybrid (DB: DNA binding domain, AD: activation domain) B tandem affinity purification (dashed

More information

Role of Centrality in Network Based Prioritization of Disease Genes

Role of Centrality in Network Based Prioritization of Disease Genes Role of Centrality in Network Based Prioritization of Disease Genes Sinan Erten 1 and Mehmet Koyutürk 1,2 Case Western Reserve University (1)Electrical Engineering & Computer Science (2)Center for Proteomics

More information

Statistical Inference and Reconstruction of Gene Regulatory Network from Observational Expression Profile

Statistical Inference and Reconstruction of Gene Regulatory Network from Observational Expression Profile Statistical Inference and Reconstruction of Gene Regulatory Network from Observational Expression Profile Prof. Shanthi Mahesh 1, Kavya Sabu 2, Dr. Neha Mangla 3, Jyothi G V 4, Suhas A Bhyratae 5, Keerthana

More information

Knowledge-Guided Analysis with KnowEnG Lab

Knowledge-Guided Analysis with KnowEnG Lab Han Sinha Song Weinshilboum Knowledge-Guided Analysis with KnowEnG Lab KnowEnG Center Powerpoint by Charles Blatti Knowledge-Guided Analysis KnowEnG Center 2017 1 Exercise In this exercise we will be doing

More information

Bioinformatics for Biologists

Bioinformatics for Biologists Bioinformatics for Biologists Functional Genomics: Microarray Data Analysis Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute Outline Introduction Working with microarray data Normalization Analysis

More information

Ingenuity Pathway Analysis (IPA )

Ingenuity Pathway Analysis (IPA ) Ingenuity Pathway Analysis (IPA ) For the analysis and interpretation of omics data IPA is a web-based software application for the analysis, integration, and interpretation of data derived from omics

More information

ChIP-seq and RNA-seq

ChIP-seq and RNA-seq ChIP-seq and RNA-seq Biological Goals Learn how genomes encode the diverse patterns of gene expression that define each cell type and state. Protein-DNA interactions (ChIPchromatin immunoprecipitation)

More information

Information Driven Biomedicine. Prof. Santosh K. Mishra Executive Director, BII CIAPR IV Shanghai, May

Information Driven Biomedicine. Prof. Santosh K. Mishra Executive Director, BII CIAPR IV Shanghai, May Information Driven Biomedicine Prof. Santosh K. Mishra Executive Director, BII CIAPR IV Shanghai, May 21 2004 What/How RNA Complexity of Data Information The Genetic Code DNA RNA Proteins Pathways Complexity

More information

CS 6824: New Directions in Computational Systems Biology

CS 6824: New Directions in Computational Systems Biology CS 6824: New Directions in Computational Systems Biology T. M. Murali January 19, 2011 Course Structure Discuss state-of-the-art research papers. Course Structure Lectures Discuss state-of-the-art research

More information

Editorial. Current Computational Models for Prediction of the Varied Interactions Related to Non-Coding RNAs

Editorial. Current Computational Models for Prediction of the Varied Interactions Related to Non-Coding RNAs Editorial Current Computational Models for Prediction of the Varied Interactions Related to Non-Coding RNAs Xing Chen 1,*, Huiming Peng 2, Zheng Yin 3 1 School of Information and Electrical Engineering,

More information

Using Genomics to Guide Immunosuppression Therapy David A. Baran, MD, FACC, FSCAI System Director, Advanced HF, Transplant and MCS, Sentara Heart

Using Genomics to Guide Immunosuppression Therapy David A. Baran, MD, FACC, FSCAI System Director, Advanced HF, Transplant and MCS, Sentara Heart Using Genomics to Guide Immunosuppression Therapy David A. Baran, MD, FACC, FSCAI System Director, Advanced HF, Transplant and MCS, Sentara Heart Hospital, Norfolk, VA Disclosure Consulting: Livanova,

More information