Predictive and Causal Modeling in the Health Sciences. Sisi Ma MS, MS, PhD. New York University, Center for Health Informatics and Bioinformatics
2 Exponentially Rapid Data Accumulation (timeline figure). Milestones shown: protein sequencing via MS (1986); rapid DNA sequencing; PDB initiated; GenBank formed (1982); Human Genome Project initiated (1990); completion of human genome sequencing (2003); first GWAS study published and NGS (2005); single-cell sequencing; TCGA initiated (2006); 1,000 Genomes initiated; Human Connectome Project (2010); TCGA completed, >10,000 tumors (2016).
3 From Data to Discoveries. Advanced data preparation, analysis, and modeling methods are needed for knowledge discovery in high-volume, high-variety data. Two key types: predictive modeling and computational causal discovery. A predictive model yields predictive knowledge (screening, diagnostics, prognostics); a causal model yields causal knowledge (intervention, therapeutics).
4 Talk Outline
Predictive Modeling
 o Brief Introduction to Predictive Modeling
 o Indicative Case Studies
Causal Modeling
 o Causal Modeling Using Observational Data
 o Indicative Case Studies
 o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection
6 Predictive Models: the Goal
7 Example of Predictive Modeling: Support Vector Machines (SVMs). Key characteristics of SVMs: a maximum-margin ("maximum gap") separator to limit overfitting; a QP problem that can be solved with standard methods; soft margins to tolerate noise; the kernel trick for data that are not linearly separable. Boser et al., 1992; Statnikov et al., 2011
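The soft-margin idea on this slide can be made concrete with the standard soft-margin objective, 0.5*||w||^2 + C * sum(max(0, 1 - y_i*(w.x_i + b))). The sketch below only evaluates that objective for a given linear classifier; it does not solve the QP. The function name, the toy data, and the value of C are illustrative assumptions, not taken from the talk.

```python
def soft_margin_objective(w, b, X, y, C):
    """Value of the soft-margin SVM objective for a linear classifier (w, b).

    The first term rewards a wide margin; the second sums the hinge-loss
    slack of every point that falls inside the margin or on the wrong side.
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slack = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        slack += max(0.0, 1.0 - yi * score)
    return margin_term + C * slack


# Toy data (hypothetical): two well-separated points and one noisy point.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, -1, -1]          # the third point lies on the "wrong" side
objective = soft_margin_objective(w=[1.0, 0.0], b=0.0, X=X, y=y, C=1.0)
```

A larger C penalizes the noisy point more heavily, pushing the solver toward a narrower margin; a smaller C tolerates it.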
8 Predictive Models: the Goal
9 Predictive Modeling: a Simplified General Framework
10 Predictive Modeling: Cross-validation for performance estimation and model selection. Ma et al., 2015 (in preparation)
12 Predictive Modeling for Post-traumatic Stress. Post-traumatic stress response: Almost everyone experiences at least one traumatic event in their life. Most people display acute stress responses. Acute stress responses diminish over time in most individuals, but about 10%-20% of people experience non-remitting stress responses long after the trauma. Persistent stress is detrimental to the physiological and psychological well-being of individuals. Galatzer-Levy et al., 2015; Ma et al., 2015; Galatzer-Levy et al., 2015 (submitted)
13 Predictive Modeling for Post-traumatic Stress. Discovery goals/questions: Can we identify the people who will suffer from non-remitting stress responses? If so, can they be identified early enough? What types of data need to be collected to identify people who will suffer from non-remitting stress responses?
14 Predictive Modeling for Post-traumatic Stress. Data: Trauma survivors admitted to the ER were followed for up to 4 months after the trauma. Patient history, clinical data, stress hormones, and psychiatric measurements were collected in the ER and at 1 week, 1 month, and 4 months after the trauma. A total of 135 variables were collected.
15 Predictive Modeling for Post-traumatic Stress. Remitting and non-remitting post-traumatic stress responses (identified via latent growth mixture modeling)
16 Predictive Modeling for Post-traumatic Stress. Discovery goals/questions: Can we identify the people who will suffer from non-remitting stress responses? If so, can they be identified early enough? What types of data need to be collected to identify people who will suffer from non-remitting stress responses?
17 Predictive Model for Post-traumatic Stress. Study design: Five predictive models were built using data incorporating increasing amounts of information: (1) background data; (2) data collected through the ER; (3) data collected through 1 week; (4) data collected through 1 month; (5) data collected through 4 months. SVM with feature selection was employed, with 10-split, 5-fold cross-validation.
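The "10-split, 5-fold cross-validation" in the study design can be read as 5-fold cross-validation repeated over 10 random partitions of the data. The sketch below generates such train/test index splits in plain Python. Stratifying by class label, the function name, and the toy label vector are assumptions for illustration; the slides do not say how the splits were drawn.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_folds=5, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for repeated stratified k-fold CV.

    Stratification (an assumption here) keeps the class ratio roughly equal
    across folds, which matters when one outcome is rare.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for _ in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal the shuffled members of each class round-robin into folds.
            for pos, idx in enumerate(shuffled):
                folds[pos % n_folds].append(idx)
        for held_out in range(n_folds):
            test = sorted(folds[held_out])
            train = sorted(
                i for f, fold in enumerate(folds) if f != held_out for i in fold
            )
            yield train, test


# Hypothetical labels: 15 remitting (0) and 5 non-remitting (1) cases.
toy_labels = [0] * 15 + [1] * 5
splits = list(repeated_stratified_kfold(toy_labels))
```

When feature selection is part of the model, as on the slide, a usual precaution is to run it inside each training fold only, so the held-out fold never influences which variables are kept.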
18 Predictive Modeling for Post-traumatic Stress. Prediction accuracy increases progressively as data collected at later time points are added to the predictive models. The predictivity of the model built with patient background information is statistically significant. The model built with patient background information and data collected in the ER has strong enough predictive performance to be clinically useful.
19 Predictive Modeling for Post-traumatic Stress. Discovery goals/questions: Can we identify the people who will suffer from non-remitting stress responses? If so, can they be identified early enough? What types of data need to be collected to identify people who will suffer from non-remitting stress responses? Specifically, can neuroendocrine levels predict non-remitting post-traumatic stress?
20 Predictive Modeling for Post-traumatic Stress. The neuroendocrine data studied contain limited information about non-remitting stress responses. Except at the time of the ER visit, combining neuroendocrine and other data (clinical information, psychiatric surveys) does not significantly increase the predictivity of the models.
21 Other Case Studies for Predictive Modeling: predicting cancer patient outcome; predicting neural activity in the dorsolateral striatum; predicting transposon insertion.
22 Other Case Studies for Predictive Modeling: Predicting Cancer Patient Outcome. Problem: Determine the most informative data modality for predicting cancer patient outcome. Data: 47 datasets/predictive tasks that together span 9 data modalities: copy number, gene expression, protein expression, microRNA expression, imaging, GWAS, somatic mutation, methylation, and clinical information. Conclusion: Gene expression is in general the most informative data modality. Combining different data modalities does not increase predictive performance. Ray MS, Henaff MS, Aliferis PhD, Statnikov. Ray et al.,
23 Other Case Studies for Predictive Modeling: Predicting Neural Activity in the Dorsolateral Striatum (DLS). Problem: Predict neural activity from movement data. Data: Single-neuron activity in the DLS; head movement tracking data. Model: Linear-Nonlinear-Poisson model to predict neural activity from the head movement profile of the animal and the spike history of the neuron. Result: Neural activity was reconstructed in a subpopulation of the neurons. David Barker (NIDA). Ma and Barker,
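A Linear-Nonlinear-Poisson model of the kind named on this slide can be sketched as a linear filter stage, a pointwise nonlinearity, and a Poisson likelihood on spike counts. This is a minimal illustration, not the authors' implementation: the exponential nonlinearity, the function names, and all filter weights and inputs below are assumptions.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def lnp_rate(movement, spike_history, k, h, bias):
    """Expected spike count in one time bin.

    Linear stage: filter k on the movement features, filter h on the
    neuron's own recent spikes. Nonlinear stage: exp (assumed here).
    """
    return math.exp(bias + dot(k, movement) + dot(h, spike_history))


def poisson_log_likelihood(counts, rates):
    """Log-likelihood of observed spike counts under Poisson rates."""
    return sum(
        c * math.log(r) - r - math.lgamma(c + 1) for c, r in zip(counts, rates)
    )


# Hypothetical inputs: two head-movement features and two spike-history bins.
rate = lnp_rate(
    movement=[0.5, -0.2],
    spike_history=[1, 0],
    k=[0.8, 0.1],
    h=[-0.5, 0.0],   # negative weight mimics a refractory effect
    bias=0.0,
)
```

Fitting would mean choosing k, h, and bias to maximize `poisson_log_likelihood` over the recorded bins.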
24 Other Case Studies for Predictive Modeling: Predicting Transposon Insertion. Problem: Identify transposon insertion locations in the genome. Data: Targeted sequencing data. Model: Train a logistic regression model on a set of annotated transposon insertion sites and apply the model to de novo insertion identification. More than 95% of the de novo insertions identified by the model were validated by experiments. Zuojian Tang MS, David Fenyo PhD, Jeff Boeke Langone, Kathleen JHU
26 Causal Modeling: the Goal
27 Causal Modeling: the Goal
28 Causal Modeling: Causal Graphs Capture Direct and Indirect Relationships
29 Causal Modeling: V-structures, a Common Technique for Orienting Causal Relationships
30 Causal Modeling: PC Algorithm, a prototypical causal discovery algorithm. PC algorithm: Skeleton Discovery. Spirtes et al.,
31 Causal Modeling: PC Algorithm. PC algorithm: Skeleton Discovery, Trace
32 Causal Modeling: PC Algorithm. PC algorithm: Orientation
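The v-structure rule used in the orientation step can be sketched as follows: for every unshielded triple X - Z - Y (X and Y both adjacent to Z but not to each other), orient X -> Z <- Y when Z is not in the set that separated X from Y during skeleton discovery. The function name, the data structures, and the toy skeleton are assumptions for illustration; the further orientation rules that PC applies afterwards are omitted.

```python
from itertools import combinations


def orient_v_structures(adjacency, sepsets):
    """Return the set of directed edges (cause, effect) implied by v-structures.

    adjacency: dict mapping each node to the set of its skeleton neighbours.
    sepsets:   dict mapping frozenset({x, y}) to the conditioning set that
               made the non-adjacent pair x, y independent.
    """
    directed = set()
    for z, neighbours in adjacency.items():
        for x, y in combinations(sorted(neighbours), 2):
            if y in adjacency[x]:
                continue  # shielded triple: x and y are adjacent, no v-structure
            if z not in sepsets.get(frozenset((x, y)), set()):
                directed.add((x, z))
                directed.add((y, z))
    return directed


# Toy skeleton (hypothetical): A - C - B and C - D.
adjacency = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}
sepsets = {
    frozenset(("A", "B")): set(),   # A and B are marginally independent
    frozenset(("A", "D")): {"C"},   # C separates A from D
    frozenset(("B", "D")): {"C"},   # C separates B from D
}
arrows = orient_v_structures(adjacency, sepsets)
```

In this toy case only A -> C <- B is oriented; the edge C - D stays undirected at this stage.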
33 Causal Modeling: HITON-PC Algorithm (example graph with nodes A, B, C, D, E, T). A local causal discovery method, easily extended to global causal discovery with the LGL framework. Aliferis et al.,
34 Causal Modeling: HITON-PC Algorithm. Trace of HITON-PC (example graph with nodes A, B, C, D, E, T)
35 Causal Modeling: Semi-Interleaved HITON-PC, a more efficient implementation. Efficient and robust. Scalable to very big data. Easily extended to global causal discovery with the LGL framework. An instantiation of the GLL framework.
37 Causal Modeling for Post-traumatic Stress Study. Data: Trauma survivors admitted to the ER were followed for up to 4 months after the trauma. Patient history, clinical data, stress hormones, and psychiatric measurements were collected in the ER and at 1 week, 1 month, and 4 months after the trauma. A total of 135 variables were collected. Galatzer-Levy et al., 2015; Ma et al., 2015; Galatzer-Levy et al., 2015 (submitted)
38 Causal Model for Post-traumatic Stress. Causal discovery question: What are the factors determining non-remitting stress responses? Analysis design: A local causal discovery algorithm (HITON-PC) was applied to find the parents-and-children set of every measured variable. A global causal graph depicting the relationships among all measured variables was constructed using the local-to-global framework LGL. Edges were oriented according to the time at which individual variables were measured.
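Orienting edges by measurement time relies on the fact that a later measurement cannot cause an earlier one. A minimal sketch, assuming each variable carries an integer time index (background < ER < 1 week < 1 month < 4 months); the function name and the variable names are hypothetical, not the study's actual variables.

```python
def orient_by_time(undirected_edges, measured_at):
    """Orient each edge from the earlier-measured variable to the later one.

    Edges between variables measured at the same time point cannot be
    oriented by this rule and are returned separately.
    """
    oriented, unresolved = [], []
    for a, b in undirected_edges:
        if measured_at[a] < measured_at[b]:
            oriented.append((a, b))
        elif measured_at[b] < measured_at[a]:
            oriented.append((b, a))
        else:
            unresolved.append((a, b))
    return oriented, unresolved


# Time index: 0 = background, 1 = ER, 2 = 1 week, 3 = 1 month, 4 = 4 months.
measured_at = {
    "prior_trauma": 0,
    "er_heart_rate": 1,
    "er_pain": 1,
    "week1_symptoms": 2,
    "month1_symptoms": 3,
}
edges = [
    ("week1_symptoms", "er_heart_rate"),
    ("prior_trauma", "month1_symptoms"),
    ("er_heart_rate", "er_pain"),
]
oriented, unresolved = orient_by_time(edges, measured_at)
```

Edges left in `unresolved` would need another source of orientation, such as v-structures or experiments.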
39 Causal Modeling for Post-traumatic Stress. The global causal graph: a very complicated model!
40 Causal Modeling for Post-traumatic Stress. Example causal path leading to non-remitting stress responses
41 Causal Modeling for Post-traumatic Stress. Potential intervention for non-remitting stress responses
42 Causal Modeling for Post-traumatic Stress. Potential intervention for non-remitting stress responses
44 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. Goals: Reduce the number of experiments that experimentalists need to perform in order to fully resolve a biological pathway (or other complex set of causal interactions among variables of interest). Reduce time to discovery. Reduce costs.
45 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. Special importance in the health sciences, with both omics data and clinical data: One variable can be univariately associated with hundreds to thousands of variables: drivers (direct and indirect), passengers, and effects. High degree of multiplicity. Classical statistical techniques exhibit increases in both false positives and false negatives.
46 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. Simplified view of the framework (figure)
47 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. The ODLP algorithm. Output: Local causal pathway (parents and children) of the variable of interest. Two phases: (I) Identify the local causal pathway consistent with the data, together with the information-equivalent clusters. (II) Adaptively recommend experiments to perform, and integrate the experimental results to refine and orient the local causal pathway. Statnikov et al., 2015 (accepted)
48 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP: Pseudocode (figure)
49 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. The ODLP algorithm, Phase I: Identify the local causal pathway consistent with the data, together with the information-equivalent clusters (TIE*, iTIE* algorithms).
50 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. The ODLP algorithm, Phase I: iTIE*
51 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. The ODLP algorithm, Phase II: Adaptively recommend experiments to perform, and integrate the experimental results to refine and orient the local causal pathway (i.e., identify causes, effects, and passengers).
52 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP: Identifying effects. Manipulate T and obtain experimental data D_E. Mark all variables in V that change in D_E due to the manipulation of T as effects.
53 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP: Direct and indirect effects. Select an effect variable X that has been marked neither as an indirect effect nor as a direct effect. Manipulate X and obtain experimental data D_E. Mark all effect variables that change in D_E due to the manipulation of X and belong to the same equivalence cluster as indirect effects. The last effect variable in an equivalence cluster that is not marked as an indirect effect is a direct effect.
54 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP: Identifying passengers. Select an unmarked variable X from an equivalence cluster. Manipulate X and obtain experimental data D_E. If T does not change in D_E due to the manipulation of X, mark X as a passenger and mark all other non-effect variables that change in D_E due to the manipulation of X as passengers; otherwise mark X as a cause.
55 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP: Identifying causes. For every cause X, mark X as a direct cause if there exists no other cause in the same equivalence cluster that changes due to the manipulation of X; otherwise mark X as an indirect cause. If there is an equivalence cluster that contains a single unmarked variable X and all marked variables in this cluster (if any) are only passengers and/or effects, then mark X as a direct cause.
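The Phase II marking steps above can be sketched on a toy system. This is a simplified illustration, not the published ODLP algorithm: manipulation is simulated by an idealized oracle that reports every descendant of the manipulated variable in a known DAG, each non-indirect effect is manipulated rather than stopping at the last unmarked one, and the single-unmarked-variable shortcut for direct causes is omitted. The function names, the toy DAG, the hidden variable H, and the cluster assignment are all assumptions.

```python
def descendants(dag, node):
    """Stand-in for an experiment: variables that change when `node` is manipulated."""
    seen, stack = set(), [node]
    while stack:
        for child in dag.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def phase2_sketch(dag, target, cluster_of):
    """Label each candidate variable following the slide's marking steps."""
    candidates = sorted(cluster_of)
    labels = {}

    # Effects: everything that changes when the target T is manipulated.
    effects = descendants(dag, target) & set(candidates)

    # Indirect effects: effects that change when another effect in the same
    # equivalence cluster is manipulated. The remaining effects are direct.
    indirect = set()
    for x in sorted(effects):
        if x in indirect:
            continue
        changed = descendants(dag, x)
        indirect |= {
            v for v in effects if v in changed and cluster_of[v] == cluster_of[x]
        }
    for v in effects:
        labels[v] = "indirect effect" if v in indirect else "direct effect"

    # Passengers vs. causes: does manipulating X change T?
    changed_by = {}
    for x in candidates:
        if x in labels:
            continue
        changed_by[x] = descendants(dag, x)
        if target in changed_by[x]:
            labels[x] = "cause"
        else:
            labels[x] = "passenger"
            for v in changed_by[x]:
                if v in cluster_of and v not in labels:
                    labels[v] = "passenger"

    # Direct vs. indirect causes, reusing the experiments already run.
    causes = [x for x in candidates if labels[x] == "cause"]
    for x in causes:
        upstream = any(
            c != x and c in changed_by[x] and cluster_of[c] == cluster_of[x]
            for c in causes
        )
        labels[x] = "indirect cause" if upstream else "direct cause"
    return labels


# Toy system (hypothetical): A -> B -> T -> C -> D, with hidden H -> P and H -> T.
dag = {"H": ["P", "T"], "A": ["B"], "B": ["T"], "T": ["C"], "C": ["D"]}
cluster_of = {"A": 0, "B": 0, "C": 1, "D": 1, "P": 2}
labels = phase2_sketch(dag, "T", cluster_of)
```

Here P is associated with T only through the hidden common cause H, so manipulating P leaves T unchanged and P is marked as a passenger.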
56 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP vs. other algorithms: performance on simulated data. Benchmark study: 58 algorithms/variants from 4 algorithm families; 11 networks of different sizes. Statnikov et al., 2015 (accepted)
57 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP vs. other algorithms: network reconstruction quality
58 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP vs. other algorithms: reconstruction quality and efficiency
59 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP vs. other algorithms: scalability
60 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP vs. other algorithms: performance on real biological data. Ma et al., 2015 (submitted)
61 Causal Model-Guided Experimental Minimization and Adaptive Data Collection. ODLP vs. other algorithms: performance on real biological data
62 Summary
Predictive Modeling
 o Brief Introduction to Predictive Modeling
 o Indicative Case Studies
Causal Modeling
 o Causal Modeling Using Observational Data
 o Indicative Case Studies
 o Causal Modeling-Guided Experimental Minimization and Adaptive Data Collection
63 Future Directions. Improve existing algorithms (e.g., relax some application assumptions). Design and implement analysis pipelines that can be used by non-experts. Disseminate software and analytics packages. Apply these techniques broadly in different domains. Educate researchers about the capabilities (and limitations) as well as the proper use of these and related methods.
More informationProteogenomics. Kelly Ruggles, Ph.D. Proteomics Informatics Week 9
Proteogenomics Kelly Ruggles, Ph.D. Proteomics Informatics Week 9 Proteogenomics: Intersection of proteomics and genomics As the cost of high-throughput genome sequencing goes down whole genome, exome
More informationRNA-SEQUENCING ANALYSIS
RNA-SEQUENCING ANALYSIS Joseph Powell SISG- 2018 CONTENTS Introduction to RNA sequencing Data structure Analyses Transcript counting Alternative splicing Allele specific expression Discovery APPLICATIONS
More informationThe application of hidden markov model in building genetic regulatory network
J. Biomedical Science and Engineering, 2010, 3, 633-637 doi:10.4236/bise.2010.36086 Published Online June 2010 (http://www.scirp.org/ournal/bise/). The application of hidden markov model in building genetic
More informationMultivariate Methods to detecting co-related trends in data
Multivariate Methods to detecting co-related trends in data Canonical correlation analysis Partial least squares Co-inertia analysis Classical CCA and PLS require n>p. Can apply Penalized CCA and sparse
More informationGenetics and Bioinformatics
Genetics and Bioinformatics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Lecture 1: Setting the pace 1 Bioinformatics what s
More informationInformation Driven Biomedicine. Prof. Santosh K. Mishra Executive Director, BII CIAPR IV Shanghai, May
Information Driven Biomedicine Prof. Santosh K. Mishra Executive Director, BII CIAPR IV Shanghai, May 21 2004 What/How RNA Complexity of Data Information The Genetic Code DNA RNA Proteins Pathways Complexity
More informationAnalysis of RNA-seq Data. Feb 8, 2017 Peikai CHEN (PHD)
Analysis of RNA-seq Data Feb 8, 2017 Peikai CHEN (PHD) Outline What is RNA-seq? What can RNA-seq do? How is RNA-seq measured? How to process RNA-seq data: the basics How to visualize and diagnose your
More informationWhat is Evolutionary Computation? Genetic Algorithms. Components of Evolutionary Computing. The Argument. When changes occur...
What is Evolutionary Computation? Genetic Algorithms Russell & Norvig, Cha. 4.3 An abstraction from the theory of biological evolution that is used to create optimization procedures or methodologies, usually
More informationBIOSTATISTICS AND MEDICAL INFORMATICS (B M I)
Biostatistics and Medical Informatics (B M I) 1 BIOSTATISTICS AND MEDICAL INFORMATICS (B M I) B M I/POP HLTH 451 INTRODUCTION TO SAS PROGRAMMING FOR 2 credits. Use of the SAS programming language for the
More informationIn silico prediction of novel therapeutic targets using gene disease association data
In silico prediction of novel therapeutic targets using gene disease association data, PhD, Associate GSK Fellow Scientific Leader, Computational Biology and Stats, Target Sciences GSK Big Data in Medicine
More informationDescription of expands
Description of expands Noemi Andor June 29, 2018 Contents 1 Introduction 1 2 Data 2 3 Parameter Settings 3 4 Predicting coexisting subpopulations with ExPANdS 3 4.1 Cell frequency estimation...........................
More informationTextbook Reading Guidelines
Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum Last updated: May 1, 2009 Textbook Reading Guidelines Preface: Read the whole preface, and especially: For the students with Life Science
More informationSoftware Engineering. Engineering & Technology. Applied Sciences. Domain Knowledge. Robust Processes
Software Engineering Applied Sciences Engineering & Technology Domain Knowledge Robust Processes End User Applications Application Integration Lab Informatics Instrument Integration Control-Monitor Firmware
More informationBiomedical Big Data and Precision Medicine
Biomedical Big Data and Precision Medicine Jie Yang Department of Mathematics, Statistics, and Computer Science University of Illinois at Chicago October 8, 2015 1 Explosion of Biomedical Data 2 Types
More informationSupport Vector Machines (SVMs) for the classification of microarray data. Basel Computational Biology Conference, March 2004 Guido Steiner
Support Vector Machines (SVMs) for the classification of microarray data Basel Computational Biology Conference, March 2004 Guido Steiner Overview Classification problems in machine learning context Complications
More informationG E N OM I C S S E RV I C ES
GENOMICS SERVICES ABOUT T H E N E W YOR K G E NOM E C E N T E R NYGC is an independent non-profit implementing advanced genomic research to improve diagnosis and treatment of serious diseases. Through
More informationInformation Technology for Genetic and Genomic Based Personalized Medicine. Submitted. April 23, 2008
Information Technology for Genetic and Genomic Based Personalized Medicine Submitted April 23, 2008 By The Harvard Medical School Partners HealthCare Center for Genetics and Genomics in Partnership with
More informationBUSINESS DATA MINING (IDS 572) Please include the names of all team-members in your write up and in the name of the file.
BUSINESS DATA MINING (IDS 572) HOMEWORK 4 DUE DATE: TUESDAY, APRIL 10 AT 3:20 PM Please provide succinct answers to the questions below. You should submit an electronic pdf or word file in blackboard.
More informationCourse Presentation. Ignacio Medina Presentation
Course Index Introduction Agenda Analysis pipeline Some considerations Introduction Who we are Teachers: Marta Bleda: Computational Biologist and Data Analyst at Department of Medicine, Addenbrooke's Hospital
More informationClinician s Guide to Actionable Genes and Genome Interpretation
Clinician s Guide to Actionable Genes and Genome Interpretation Brandy Bernard PhD Senior Research Scientist Institute for Systems Biology Seattle, WA Dr. Bernard s research interests are in cancer drug
More informationM a x i m i z in g Value from NGS Analytics in t h e E n terprise
Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.935.4445 F.508.988.7881 www.idc-hi.com M a x i m i z in g Value from NGS Analytics in t h e E n terprise C U S T O M I N D U S T R Y B
More informationClinical and Translational Bioinformatics
Clinical and Translational Bioinformatics An Overview Jussi Paananen Institute of Biomedicine September 4 th, 2015 Bioinformatics Bioinformatics combines statistics, computer science and information technology
More informationPredicting prokaryotic incubation times from genomic features Maeva Fincker - Final report
Predicting prokaryotic incubation times from genomic features Maeva Fincker - mfincker@stanford.edu Final report Introduction We have barely scratched the surface when it comes to microbial diversity.
More informationOutline. Analysis of Microarray Data. Most important design question. General experimental issues
Outline Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization Introduction to microarrays Experimental design Data normalization Other data transformation Exercises George Bell,
More informationThe flow diagram below shows part of a process to produce a protein, using genetically modified plants.
1 Some organisms have been genetically modified to produce proteins including hormones and vaccines. The flow diagram below shows part of a process to produce a protein, using genetically modified plants.
More informationBasics of RNA-Seq. (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly, PhD Team Lead, NCI Single Cell Analysis Facility
2018 ABRF Meeting Satellite Workshop 4 Bridging the Gap: Isolation to Translation (Single Cell RNA-Seq) Sunday, April 22 Basics of RNA-Seq (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly,
More informationNLM Funded Research Projects Involving Text Mining/NLP
NLM Funded Research Projects Involving Text Mining/NLP Jane Ye, PhD Program Officer Division of Extramural Programs 2017 BioCreative VI Workshop Funding Stakeholders Panel 1 NLM Grant Programs in Biomedical
More informationOntologies - Useful tools in Life Sciences and Forensics
Ontologies - Useful tools in Life Sciences and Forensics How today's Life Science Technologies can shape the Crime Sciences of tomorrow 04.07.2015 Dirk Labudde Mittweida Mittweida 2 Watson vs Watson Dr.
More informationOverview of Health Informatics. ITI BMI-Dept
Overview of Health Informatics ITI BMI-Dept Fellowship Week 5 Overview of Health Informatics ITI, BMI-Dept Day 10 7/5/2010 2 Agenda 1-Bioinformatics Definitions 2-System Biology 3-Bioinformatics vs Computational
More informationGENOMICS for DUMMIES
ØGC seminar 31. oktober 2013 GENOMICS for DUMMIES Torben A. Kruse Klinisk Genetisk Afdeling, Odense Universitetshospital Klinisk Institut, Syddansk Universitet Human MicroArray Center, OUH / SDU Årsag:
More information296 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 10, NO. 3, JUNE 2006
296 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 10, NO. 3, JUNE 2006 An Evolutionary Clustering Algorithm for Gene Expression Microarray Data Analysis Patrick C. H. Ma, Keith C. C. Chan, Xin Yao,
More informationProteomics And Cancer Biomarker Discovery. Dr. Zahid Khan Institute of chemical Sciences (ICS) University of Peshawar. Overview. Cancer.
Proteomics And Cancer Biomarker Discovery Dr. Zahid Khan Institute of chemical Sciences (ICS) University of Peshawar Overview Proteomics Cancer Aims Tools Data Base search Challenges Summary 1 Overview
More informationGenScale Scalable, Optimized and Parallel Algorithms for Genomics. Dominique LAVENIER
GenScale Scalable, Optimized and Parallel Algorithms for Genomics Dominique LAVENIER Context New Sequencing Technologies - NGS Exponential growth of genomic data Drastic decreasing of costs Emergence of
More informationIntelliSpace Genomics
IntelliSpace Genomics Challenges in Genomic Aberration Detection and Interpretation in Solid Tumors and Evidence based Approaches for Targeted Therapy Nevenka Dimitrova, PhD, CTO Genomics, Healthcare Informatics,
More informationThe Integrated Biomedical Sciences Graduate Program
The Integrated Biomedical Sciences Graduate Program at the university of notre dame Cutting-edge biomedical research and training that transcends traditional departmental and disciplinary boundaries to
More informationThe NHS approach to personalised medicine in respiratory disease. Professor Sue Chief Scientific Officer for England
The NHS approach to personalised medicine in respiratory disease Professor Sue Hill @CSOSue Chief Scientific Officer for England Jul 2017 Genomics is probably the biggest breakthrough in the last 50 years.
More informationWhole Genome Sequencing in Cancer Diagnostics (research) Nederlandse Pathologiedagen 19 & 20 November 2015
Whole Genome Sequencing in Cancer Diagnostics (research) Nederlandse Pathologiedagen 19 & 20 November 2015 Dr. I.J. Nijman Disclosure slide (Potential) conflict of interest None For this meeting relevant
More informationMedical Devices; Immunology and Microbiology Devices; Classification of the Next Generation
This document is scheduled to be published in the Federal Register on 06/22/2018 and available online at https://federalregister.gov/d/2018-13406, and on FDsys.gov 4164-01-P DEPARTMENT OF HEALTH AND HUMAN
More information