Unravelling `omics' data with the mixomics R package
|
|
- Nigel Watson
- 5 years ago
- Views:
Transcription
1 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Unravelling `omics' data with the R package Illustration on several studies Queensland Facility for Advanced Bioinformatics The University of Queensland
2 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Challenges The issue with integrative systems biology Unlimited quantity of data (n << p problem) Data from multiple sources Ecient and biologically relevant statistical methodologies are needed to combine the information in these heterogeneous data sets.
3 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio The biological questions Single Omics analysis Do we observe a `natural' separation between the dierent groups of patients? Can we identify potential biomarker candidates predicting the status of the patients? Integrative Omics analysis Can we identify a subset of correlated genes and proteins from matching data sets? Can we predict the abundance of a protein given the expression of a small subset of genes? Do two matching omics data set contain the same information?
4 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio The data The data n = number of patients p, q = number of biological features (genes, proteins..) Single Omics analysis one omic data set X(n x p) for a supervised analysis, Y vector indicating the class of the patients Integrative Omics analysis two matching omics data sets (measured on the same patients) X (n x p) and Z (n x q)
5 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Multivariate analysis Linear multivariate approaches enable: Dimension reduction project the data in a smaller subspace To handle multicollinear, irrelevant, missing variables To capture experimental annd biological variation In particular, in, focus is on: Data integration Variable selection Computationally ecient methodologies for large biological data sets Interpretable graphical outputs
6 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio is an R package dedicated to the exploration and the integrative analyses of high dimensional biological data sets. Website - R tutorials - Newsletter Web Interface - User friendly interface - Comprehensive results page -Lê Cao et al. (2009) integromics/: an R package to unravel relationships between two omics data sets, Bioinformatics
7 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Data Multivariate approach Internal variable selection Graphical outputs PCA IPCA spca sipca Sample plots One data set (n x p) PLS-DA spls-da Two matching data sets (n x p) & (n x q) PLS rcca PLS spls canonical spls regression Variable plots Missing values (n x p) imputation NIPALS
8 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Introduction with PCA Principal Component Analysis: PCA Seek the best directions in the data that account for most of the variability principal components: articial variables that are linear combinations of the original variables: c = X v (n) (nxp) (p) c is a linear function of the elements of X having maximal variance v is called the associated loading vector x2 x1
9 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Introduction with PCA Principal components cont. The new PCs form a vectorial subspace of dimension < p Project the data on these new axes. z z1 approximate representation of the data points in a lower dimensional space Problem: Interpretation dicult with very large number of (possibly) irrelevant variables
10 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Introduction with PCA sparse Principal Component Analysis: spca loading vectors sparse loading vectors Principal components (PCA) (spca) x x1 loading 1 loading loading 1 loading The principal components are linear combinations of the original variables, variables weights are dened in the associated loading vectors. sparse PCA computes the sparse loading vectors to remove irrelevant variables using lasso penalizations (Shen & Huang 2008, J. Multivariate Analysis).
11 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Independent Principal Component Analysis IPCA is based on Independent Component Analysis (ICA): assumes non Gaussian data distribution ( PCA). `blind source' signal separation. seeks for a set of independent components ( PCA). Combines the advantages of both PCA and ICA. The PCA loadings are transformed via ICA to obtain independent loading vectors and independent principal components. sparse IPCA also developed to select the variable contributing to the independent loading vectors Yao, F. Coquery, J. and Lê Cao, K-A Independent Principal Component Analysis for biologically meaningful dimension reduction of large biological data sets, BMC Bioinformatics.
12 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Independent Principal Component Analysis Illustration of PCA and IPCA Sample representation for the kidney transplant study (3 groups of rejection status patients)
13 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Discriminant Analysis PLS - Discriminant Analysis Similarly to Linear Discriminant Analysis, classical PLS-DA looks for the best components to separate the sample groups. As opposed to PCA/IPCA methods, it is a supervised approach. In addition to this, spls-da searches for discriminative variables that can help separating the sample groups. Evaluation of the discriminative power of the selected variables using external data sets or cross-validation. Lê Cao K-A., Boitard S. and Besse P. (2011) Sparse PLS Discriminant Analysis: biologically relevant feature selection and graphical displays for multiclass problems, BMC Bioinformatics, 12:253.
14 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Discriminant Analysis Illustration of spls-da Genomics training data Genomics error rate comp 1 comp 1:2 comp 1:3 Comp Train AR NR Test AR NR number of selected genes Comp 1 Figure: Tuning the number of var. to select with cross-validation Figure: Predicting the class of the test set samples
15 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Motivation Aims: unravel the correlation stucture between two data sets select co-regulated biological entities across samples select and integrate in a one step procedure the dierent types of data Partial Least Squares regression maximises the covariance between each linear combination (components) associated to each data set sparse PLS has been developed to include variable selection from both data sets Two modes are proposed to model the relationship between the two data sets ('regression' and 'canonical') Lê Cao et al. (2008), SAGMB, Lê Cao et al. (2009), BMC Bioinformatics
16 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Integrating genomics and proteomics Illustration of sparse PLS: sample plot spls aims at selecting correlated variables (genes, proteins) across the same samples by performing a multivariate regression. Regression: explain the protein abundance w.r.t the gene expression relationship. The latent variables (components) are determined based on the selected genes and proteins give more insight into the samples similarities. Unsupervised approach
17 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Probe set Protein group Dimension 1 Integrating genomics and proteomics Illustration of sparsepls: variable plot Relevance networks are bipartite graphs directly inferred from the spls components. Some other insightful graphical outputs: correlation circle plots clustered image maps Dimension González I., Lê Cao K.-A., Davis, M.D. and Déjean S. Visualising association between paired `omics' data sets. In revision.
18 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio H460 Comparing two proteomics platforms Illustration of spls canonical mode Selects correlated variables across the same samples and highlights the correlation structure between the two data sets. Canonical mode: relationship LEU RE BR COL LEU COL NS BR LEU LEU CNS COL OVNSRE COL COL BR OV BR RE REOV NS NS CNS OVPR RE BR NS MEL OV COL LEU RE COL RE CNS PR NS NS CNS RE NS OV CNS LEU MEL MEL MEL MEL MEL BR MEL BR MEL BR CNS COL LE MEL NS OV PR RE Figure: Arrow plot to highlight the similarties between 2 data sets
19 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Multilevel analysis Cross-over design Data Multivariate approach Internal variable selection Graphical outputs Sample plots One data set repeated measurements (n x p) supervised multilevel PLS-DA multilevel spls-da Two matching data sets repetated measurements (n x p) & (n x q) canonical, 1 level multilevel PLS multilevel spls Variable plots A novel approach for cross-over designs (repeated measurements up to 2 cross factors)
20 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Multilevel analysis Illustration of multilevel spls-da X variate 1 X variate 2 GAG+ GAG LIPO5 NS t1 t2 gene517 gene541 gene589 gene550 gene567 gene551 gene579 gene506 gene540 gene557 gene527 gene534 gene509 gene564 gene528 gene518 gene553 gene531 gene502 gene537 gene546 gene504 gene524 gene591 gene519 gene583 gene585 gene503 gene521 gene520 gene582 gene552 gene568 gene549 gene576 gene563 gene539 gene555 gene588 gene571 gene529 gene525 gene511 gene587 gene515 gene584 gene526 gene570 gene594 gene505 gene574 gene535 gene566 gene510 gene580 gene577 gene538 gene554 gene569 gene572 gene530 gene599 gene507 gene575 gene560 gene542 gene501 gene543 gene597 gene513 gene547 gene562 gene595 gene598 gene516 gene561 gene578 gene508 gene558 gene593 gene536 gene556 gene532 gene559 gene573 gene512 gene592 gene533 gene586 gene565 gene522 gene514 gene600 gene523 gene544 gene548 gene581 gene590 gene545 gene596 gene19 gene82 gene85 gene27 gene92 gene13 gene99 gene38 gene81 gene88 gene4 gene9 gene28 gene98 gene37 gene75 gene76 gene45 gene83 gene32 gene55 gene78 gene39 gene51 gene20 gene71 gene30 gene96 gene63 gene14 gene18 gene53 gene73 gene87 gene2 gene91 gene25 gene69 gene23 gene48 gene43 gene56 gene22 gene47 gene59 gene84 gene15 gene65 gene1 gene3 gene34 gene58 gene80 gene86 gene24 gene36 gene49 gene90 gene7 gene44 gene67 gene12 gene29 gene95 gene17 gene50 gene66 gene26 gene64 gene31 gene21 gene61 gene5 gene11 gene40 gene89 gene97 gene74 gene77 gene94 gene100 gene54 gene35 gene33 gene62 gene46 gene93 gene10 gene52 gene79 gene8 gene57 gene16 gene72 gene6 gene60 gene68 gene70 gene41 gene42 gene613 gene674 gene673 gene694 gene603 gene692 gene671 gene655 gene606 gene647 gene621 gene622 gene654 gene637 gene662 gene699 gene635 gene663 gene665 gene653 gene664 gene628 gene641 gene678 gene632 gene634 gene642 gene630 gene616 gene681 gene614 gene685 gene652 gene688 gene644 gene689 gene672 gene700 gene617 gene677 gene650 gene633 gene640 gene625 gene631 gene626 gene660 gene612 gene636 gene675 gene618 gene624 gene697 gene698 gene646 gene676 gene607 gene683 gene696 gene619 gene656 gene645 gene651 gene684 gene680 gene623 gene668 gene610 gene679 gene643 gene670 gene611 gene686 gene608 gene602 gene609 gene620 gene639 gene695 gene629 gene666 gene627 gene605 gene649 gene669 gene657 gene604 gene661 gene658 gene691 gene667 gene601 gene659 gene638 gene648 gene682 gene687 gene693 gene615 gene690 gene160 gene104 gene200 gene149 gene190 gene138 gene151 gene165 gene132 gene199 gene120 gene121 gene179 gene194 gene177 gene129 gene123 gene131 gene148 gene158 gene171 gene146 gene192 gene106 gene124 gene168 gene170 gene188 gene162 gene108 gene118 gene155 gene173 gene134 gene139 gene143 gene150 gene175 gene145 gene195 gene122 gene166 gene140 gene182 gene187 gene102 gene128 gene110 gene147 gene159 gene164 gene191 gene105 gene136 gene111 gene107 gene135 gene176 gene153 gene193 gene183 gene184 gene125 gene197 gene185 gene117 gene161 gene172 gene101 gene180 gene144 gene126 gene156 gene154 gene109 gene167 gene137 gene169 gene133 gene186 gene103 gene174 gene115 gene141 gene130 gene127 gene198 gene119 gene142 gene112 gene181 gene152 gene157 gene163 gene196 gene114 gene116 gene189 gene113 gene178 gene837 gene865 gene812 gene870 gene817 gene883 gene805 gene856 gene839 gene841 gene871 gene885 gene802 gene873 gene808 gene843 gene851 gene823 gene828 gene831 gene818 gene878 gene844 gene891 gene824 gene860 gene880 gene807 gene884 gene835 gene834 gene832 gene866 gene874 gene894 gene801 gene819 gene869 gene889 gene857 gene863 gene809 gene872 gene877 gene836 gene826 gene876 gene850 gene868 gene806 gene820 gene867 gene887 gene811 gene845 gene859 gene846 gene848 gene886 gene840 gene861 gene879 gene897 gene888 gene892 gene825 gene890 gene862 gene875 gene833 gene881 gene816 gene827 gene895 gene830 gene858 gene810 gene852 gene900 gene803 gene849 gene340 gene381 gene392 gene309 gene308 gene350 gene346 gene366 gene321 gene373 gene303 gene329 gene304 gene400 gene313 gene367 gene377 gene398 gene368 gene355 gene384 gene347 gene333 gene361 gene375 gene383 gene317 gene343 gene386 gene306 gene319 gene348 gene318 gene349 gene387 gene332 gene327 gene363 gene372 gene396 gene335 gene388 gene359 gene378 gene399 gene311 gene324 gene315 gene334 gene330 gene364 gene357 gene314 gene316 gene345 gene322 gene337 gene344 gene360 gene353 gene312 gene385 gene325 gene382 gene389 gene354 gene379 gene352 gene393 gene358 gene331 gene391 gene380 gene390 gene336 gene310 gene365 gene339 gene376 gene301 gene371 gene320 gene362 gene394 gene338 gene307 gene374 gene326 gene305 gene328 gene323 gene351 gene302 gene370 gene397 gene342 gene395 gene341 gene356 Sample Stimulation GAG+ GAG LIPO5 NS
21 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Conclusions Conclusions The exploratory and integrative approaches are: exible and can answer various types of questions. can highlight the potential of the data. enable to generate new hypotheses to be further investigated. Future work includes: Cross-platform comparison Integration of multiple data sets (unsupervised and supervised) Time-course experiments
22 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Conclusions Acknowledgements Sébastien Déjean Ignacio González Xin Yi Chua team Univ. Tlse Univ. Tlse QFAB Je Coquery Benoit Liquet Eric Fangzhou Yao Pierre Monget Other contributors to QFAB, Sup'Biotech Univ. Bordeaux QFAB, Shanghai Univ of Finance and Economics QFAB, CESI
23 Introduction Concept of Single Omics Analysis Integrative Omics Analysis Recent developments Conclusio Questions? Questions?
Multivariate Methods to detecting co-related trends in data
Multivariate Methods to detecting co-related trends in data Canonical correlation analysis Partial least squares Co-inertia analysis Classical CCA and PLS require n>p. Can apply Penalized CCA and sparse
More informationIntegration of heterogeneous omics data
Integration of heterogeneous omics data Andrea Rau March 11, 2016 Formation doctorale: Biologie expe rimentale animale et mode lisation pre dictive andrea.rau@jouy.inra.fr Integration of heterogeneous
More informationPattern Recognition of Glycyrrhiza uralensis Metabonomics on Rats with MixOmics Package of R Software
Available online at www.sciencedirect.com Procedia Engineering 24 (2011) 510 514 2011 International Conference on Advances in Engineering (ICAE 2011) Pattern Recognition of Glycyrrhiza uralensis Metabonomics
More informationThe Future of IntegrOmics
The Future of IntegrOmics Kristel Van Steen, PhD 2 kristel.vansteen@ulg.ac.be Systems and Modeling Unit, Montefiore Institute, University of Liège, Belgium Bioinformatics and Modeling, GIGA-R, University
More informationadvanced analysis of gene expression microarray data aidong zhang World Scientific State University of New York at Buffalo, USA
advanced analysis of gene expression microarray data aidong zhang State University of New York at Buffalo, USA World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI Contents
More informationGene Expression Data Analysis
Gene Expression Data Analysis Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu BMIF 310, Fall 2009 Gene expression technologies (summary) Hybridization-based
More informationAGILENT S BIOINFORMATICS ANALYSIS SOFTWARE
ACCELERATING PROGRESS IS IN OUR GENES AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE GENESPRING GENE EXPRESSION (GX) MASS PROFILER PROFESSIONAL (MPP) PATHWAY ARCHITECT (PA) See Deeper. Reach Further. BIOINFORMATICS
More informationThis place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.
G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY Methods or systems for genetic
More informationStatistically Integrated Metabonomic-Proteomic Studies on a Human Prostate Cancer Xenograft Model in Mice
Statistically Integrated Metabonomic-Proteomic Studies on a Human Prostate Cancer Xenograft Model in Mice Mattias Rantalainen, Olivier Cloarec, Olaf Beckonert, I. D. Wilson, David Jackson, Robert Tonge,
More informationSupport Vector Machines (SVMs) for the classification of microarray data. Basel Computational Biology Conference, March 2004 Guido Steiner
Support Vector Machines (SVMs) for the classification of microarray data Basel Computational Biology Conference, March 2004 Guido Steiner Overview Classification problems in machine learning context Complications
More informationInferring Gene-Gene Interactions and Functional Modules Beyond Standard Models
Inferring Gene-Gene Interactions and Functional Modules Beyond Standard Models Haiyan Huang Department of Statistics, UC Berkeley Feb 7, 2018 Background Background High dimensionality (p >> n) often results
More informationBioinformatics : Gene Expression Data Analysis
05.12.03 Bioinformatics : Gene Expression Data Analysis Aidong Zhang Professor Computer Science and Engineering What is Bioinformatics Broad Definition The study of how information technologies are used
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 5, Issue 1 2006 Article 16 Reader s reaction to Dimension Reduction for Classification with Gene Expression Microarray Data by Dai et al
More informationSingle-cell sequencing
Single-cell sequencing Harri Lähdesmäki Department of Computer Science Aalto University December 5, 2017 Contents Background & Motivation Single cell sequencing technologies Single cell sequencing data
More informationFinding molecular signatures from gene expression data: review and a new proposal
Finding molecular signatures from gene expression data: review and a new proposal Ramón Díaz-Uriarte rdiaz@cnio.es http://bioinfo.cnio.es/ rdiaz Unidad de Bioinformática Centro Nacional de Investigaciones
More informationPathways from the Genome to Risk Factors and Diseases via a Metabolomics Causal Network. Azam M. Yazdani, PhD
Pathways from the Genome to Risk Factors and Diseases via a Metabolomics Causal Network Azam M. Yazdani, PhD 1 Causal Inference in Observational Studies Example Question What is the effect of Aspirin on
More informationDiscriminant models for high-throughput proteomics mass spectrometer data
Proteomics 2003, 3, 1699 1703 DOI 10.1002/pmic.200300518 1699 Short Communication Parul V. Purohit David M. Rocke Center for Image Processing and Integrated Computing, University of California, Davis,
More informationA computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers. Günther et al.
A computational pipeline for the development of multi-marker bio-signature panels and ensemble s Günther et al. Günther et al. BMC Bioinformatics 2012, 13:326 Günther et al. BMC Bioinformatics 2012, 13:326
More informationReal-time RT PCR. for identification of differentially expressed genes. (with Schizophrenia application) Rolf Sundberg, Stockholm Univ.
Real-time RT PCR for identification of differentially expressed genes. (with Schizophrenia application) Rolf Sundberg, Stockholm Univ. Göteborg, May 2006 Real-time PCR for mrna quantitation Review paper
More informationA Study on New Customer Satisfaction Index Model of Smart Grid
2016 International Conference on Material Science and Civil Engineering (MSCE 2016) ISBN: 978-1-60595-378-6 A Study on New Customer Satisfaction Index Model of Smart Grid *Ze-san LIU 1, Zhuo YU 2 and Ai-qiang
More informationStatistical Methods for Network Analysis of Biological Data
The Protein Interaction Workshop, 8 12 June 2015, IMS Statistical Methods for Network Analysis of Biological Data Minghua Deng, dengmh@pku.edu.cn School of Mathematical Sciences Center for Quantitative
More informationAdditional file 2. Figure 1: Receiver operating characteristic (ROC) curve using the top
Additional file 2 Figure Legends: Figure 1: Receiver operating characteristic (ROC) curve using the top discriminatory features between HIV-infected (n=32) and HIV-uninfected (n=15) individuals. The top
More informationHigh Resolution GC-MS Application: Metabolomics Vladimir Tolstikov, PhD
High Resolution GC-MS Application: Metabolomics Vladimir Tolstikov, PhD Eli Lilly and Company A Sample Harvest and Storage Biological Metadata Standard Operational Procedure Sample Extraction Sample Preparation
More informationA Brief Overview of Multivariate Data Analysis in Biological Sciences
A Brief Overview of Multivariate Data Analysis in Biological Sciences Mohammad OHID ULLAH (Corresponding author) Associate Professor, Department of Statistics Shahjalal University of Science and Technology,
More informationBiomarker Discovery Directly from Tissue Xenograph Using High Definition Imaging MALDI Combined with Multivariate Analysis
Biomarker Discovery Directly from Tissue Xenograph Using High Definition Imaging MALDI Combined with Multivariate Analysis Emmanuelle Claude and Mark Towers Waters Corporation, Manchester, UK APPLICATION
More informationA Decision Support System for Market Segmentation - A Neural Networks Approach
Association for Information Systems AIS Electronic Library (AISeL) AMCIS 1995 Proceedings Americas Conference on Information Systems (AMCIS) 8-25-1995 A Decision Support System for Market Segmentation
More informationDesigning a Complex-Omics Experiments. Xiangqin Cui. Section on Statistical Genetics Department of Biostatistics University of Alabama at Birmingham
Designing a Complex-Omics Experiments Xiangqin Cui Section on Statistical Genetics Department of Biostatistics University of Alabama at Birmingham 1/7/2015 Some slides are from previous lectures of Grier
More informationSparse PLS: Variable Selection when Integrating Omics data
Sparse PLS: Variable Selection when Integrating Omics data Kim-Anh Lê Cao, Debra Rossow, Christèle Robert-Granié, Philippe Besse To cite this version: Kim-Anh Lê Cao, Debra Rossow, Christèle Robert-Granié,
More informationDisentangling Prognostic and Predictive Biomarkers Through Mutual Information
Informatics for Health: Connected Citizen-Led Wellness and Population Health R. Randell et al. (Eds.) 2017 European Federation for Medical Informatics (EFMI) and IOS Press. This article is published online
More informationHow to use ANOVA in Qlucore Omics Explorer
How to use ANOVA in Qlucore Omics Explorer Content 1 Introduction... 3 2 Introduction to ANOVA... 3 2.1 Model A: Independent measures, single factor design... 3 2.2 Model B: Independent measures, factorial
More informationUsing Genomics to Guide Immunosuppression Therapy David A. Baran, MD, FACC, FSCAI System Director, Advanced HF, Transplant and MCS, Sentara Heart
Using Genomics to Guide Immunosuppression Therapy David A. Baran, MD, FACC, FSCAI System Director, Advanced HF, Transplant and MCS, Sentara Heart Hospital, Norfolk, VA Disclosure Consulting: Livanova,
More informationAUTOMATED INTERPRETABLE COMPUTATIONAL BIOLOGY IN THE CLINIC: A FRAMEWORK TO PREDICT DISEASE SEVERITY AND STRATIFY PATIENTS FROM CLINICAL DATA
Interdisciplinary Description of Complex Systems 15(3), 199-208, 2017 AUTOMATED INTERPRETABLE COMPUTATIONAL BIOLOGY IN THE CLINIC: A FRAMEWORK TO PREDICT DISEASE SEVERITY AND STRATIFY PATIENTS FROM CLINICAL
More informationSurvival Outcome Prediction for Cancer Patients based on Gene Interaction Network Analysis and Expression Profile Classification
Survival Outcome Prediction for Cancer Patients based on Gene Interaction Network Analysis and Expression Profile Classification Final Project Report Alexander Herrmann Advised by Dr. Andrew Gentles December
More informationBiomarker discovery and high dimensional datasets
Biomarker discovery and high dimensional datasets Biomedical Data Science Marco Colombo Lecture 4, 2017/2018 High-dimensional medical data In recent years, the availability of high-dimensional biological
More informationSingle-cell RNA-sequencing
Single-cell RNA-sequencing Stefania Giacomello UC Davis, 22 June 2018 Applications - Heterogeneity analysis - Cell-type identification - Cellular states in differentiation and developmental processes -
More informationSmart India Hackathon
TM Persistent and Hackathons Smart India Hackathon 2017 i4c www.i4c.co.in Digital Transformation 25% of India between age of 16-25 Our country needs audacious digital transformation to reach its potential
More informationOur website:
Biomedical Informatics Summer Internship Program (BMI SIP) The Department of Biomedical Informatics hosts an annual internship program each summer which provides high school, undergraduate, and graduate
More informationModern Analysis of Customer Surveys
Modern Analysis of Customer Surveys with applications using R Edited by Ron S. Kenett KPA Ltd., Raanana, Israel, University of Turin, Italy, and NYU-Poly, Center for Risk Engineering, New York, USA Silvia
More informationIntegrative clustering methods for high-dimensional molecular data
Review Article Integrative clustering methods for high-dimensional molecular data Prabhakar Chalise, Devin C. Koestler, Milan Bimali, Qing Yu, Brooke L. Fridley Department of Biostatistics, University
More informationResearch Powered by Agilent s GeneSpring
Research Powered by Agilent s GeneSpring Agilent Technologies, Inc. Carolina Livi, Bioinformatics Segment Manager Research Powered by GeneSpring Topics GeneSpring (GS) platform New features in GS 13 What
More informationUnsupervised machine learning techniques for studying climate variability
Unsupervised machine learning techniques for studying climate variability Alexander Ilin, Harri Valpola, Erkki Oja Helsinki University of Technology Adaptive Informatics Research Center Adaptive Informatics
More informationA GENOTYPE CALLING ALGORITHM FOR AFFYMETRIX SNP ARRAYS
Bioinformatics Advance Access published November 2, 2005 The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
More informationMulCom: a Multiple Comparison statistical test for microarray data in Bioconductor.
MulCom: a Multiple Comparison statistical test for microarray data in Bioconductor. Claudio Isella, Tommaso Renzulli, Davide Corà and Enzo Medico May 3, 2016 Abstract Many microarray experiments compare
More informationBioinformatics for Biologists
Bioinformatics for Biologists Microarray Data Analysis. Lecture 1. Fran Lewitter, Ph.D. Director Bioinformatics and Research Computing Whitehead Institute Outline Introduction Working with microarray data
More informationSAS Microarray Solution for the Analysis of Microarray Data. Susanne Schwenke, Schering AG Dr. Richardus Vonk, Schering AG
for the Analysis of Microarray Data Susanne Schwenke, Schering AG Dr. Richardus Vonk, Schering AG Overview Challenges in Microarray Data Analysis Software for Microarray Data Analysis SAS Scientific Discovery
More informationComputational Biology
3.3.3.2 Computational Biology Today, the field of Computational Biology is a well-recognised and fast-emerging discipline in scientific research, with the potential of producing breakthroughs likely to
More informationQLUCORE: Novel platform for protein microarray
QLUCORE: Novel platform for protein microarray data analysis Manuel Fuentes Centro de Investigación del Cáncer Madrid, Spain mfuentes@usal.es QLUCORE: Novel Platform for protein microarray data analysis
More informationStudy on the Application of Data Mining in Bioinformatics. Mingyang Yuan
International Conference on Mechatronics Engineering and Information Technology (ICMEIT 2016) Study on the Application of Mining in Bioinformatics Mingyang Yuan School of Science and Liberal Arts, New
More informationIntegrated Predictive Maintenance Platform Reduces Unscheduled Downtime and Improves Asset Utilization
November 2017 Integrated Predictive Maintenance Platform Reduces Unscheduled Downtime and Improves Asset Utilization Abstract Applied Materials, the innovator of the SmartFactory Rx suite of software products,
More informationBioinformatics. Microarrays: designing chips, clustering methods. Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute
Bioinformatics Microarrays: designing chips, clustering methods Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute Course Syllabus Jan 7 Jan 14 Jan 21 Jan 28 Feb 4 Feb 11 Feb 18 Feb 25 Sequence
More informationFunctional genomics + Data mining
Functional genomics + Data mining BIO337 Systems Biology / Bioinformatics Spring 2014 Edward Marcotte, Univ of Texas at Austin Edward Marcotte/Univ of Texas/BIO337/Spring 2014 Functional genomics + Data
More informationPartial Least Squares Structural Equation Modeling PLS-SEM
Partial Least Squares Structural Equation Modeling PLS-SEM New Edition Joe Hair Cleverdon Chair of Business Director, DBA Program Statistical Analysis Historical Perspectives Early 1900 s 1970 s = Basic
More informationIntroduction to descriptive statistics
Introduction to descriptive statistics Illustrated with XLSTAT Jean Paul Maalouf jpmaalouf@xlstat.com linkedin.com/in/jean-paul-maalouf www.xlstat.com Oct. 12, 2016 1 PLAN XLSTAT: who are we? Statistics:
More informationLC/MS/MS Solutions for Biomarker Discovery QSTAR. Elite Hybrid LC/MS/MS System. More performance, more reliability, more answers
LC/MS/MS Solutions for Biomarker Discovery QSTAR Elite Hybrid LC/MS/MS System More performance, more reliability, more answers More is better and the QSTAR Elite LC/MS/MS system has more to offer. More
More informationIntroduction to Quantitative Genomics / Genetics
Introduction to Quantitative Genomics / Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics September 10, 2008 Jason G. Mezey Outline History and Intuition. Statistical Framework. Current
More informationSyllabus for BIOS 101, SPRING 2013
Page 1 Syllabus for BIOS 101, SPRING 2013 Name: BIOSTATISTICS 101 for Cancer Researchers Time: March 20 -- May 29 4-5pm in Wednesdays, [except 4/15 (Mon) and 5/7 (Tue)] Location: SRB Auditorium Background
More informationTo appear in: Moutinho and Hutcheson: Dictionary of Quantitative Methods in Management. Sage Publications
To appear in: Moutinho and Hutcheson: Dictionary of Quantitative Methods in Management. Sage Publications Factor Analysis Introduction Factor Analysis attempts to identify the underlying structure in a
More informationNima Hejazi. Division of Biostatistics University of California, Berkeley stat.berkeley.edu/~nhejazi. nimahejazi.org github/nhejazi
Data-Adaptive Estimation and Inference in the Analysis of Differential Methylation for the annual retreat of the Center for Computational Biology, given 18 November 2017 Nima Hejazi Division of Biostatistics
More informationExperimental Design for Gene Expression Microarray. Jing Yi 18 Nov, 2002
Experimental Design for Gene Expression Microarray Jing Yi 18 Nov, 2002 Human Genome Project The HGP continued emphasis is on obtaining by 2003 a complete and highly accurate reference sequence(1 error
More informationDetection of faults in Batch Processes: Application to an industrial fermentation and a steel making process Abstract Introduction
Detection of faults in Batch Processes: Application to an industrial fermentation and a steel maing process Barry Lennox *, Gary Montague + and Ognjen Marjanovic * * Control Technology Centre Ltd, School
More informationBioinformatics for Biologists
Bioinformatics for Biologists Functional Genomics: Microarray Data Analysis Fran Lewitter, Ph.D. Head, Biocomputing Whitehead Institute Outline Introduction Working with microarray data Normalization Analysis
More informationFFGWAS. Fast Functional Genome Wide Association AnalysiS of Surface-based Imaging Genetic Data
FFGWAS Fast Functional Genome Wide Association AnalysiS of Surface-based Imaging Genetic Data Chao Huang Department of Biostatistics Biomedical Research Imaging Center The University of North Carolina
More informationCS262 Lecture 12 Notes Single Cell Sequencing Jan. 11, 2016
CS262 Lecture 12 Notes Single Cell Sequencing Jan. 11, 2016 Background A typical human cell consists of ~6 billion base pairs of DNA and ~600 million bases of mrna. It is time-consuming and expensive to
More informationPrincipal components analysis II
Principal components analysis II Patrick Breheny November 29 Patrick Breheny BST 764: Applied Statistical Modeling 1/17 svd R SAS Singular value decompositions can be computed in R via the svd function:
More informationIncluding prior knowledge in shrinkage classifiers for genomic data
Including prior knowledge in shrinkage classifiers for genomic data Jean-Philippe Vert Jean-Philippe.Vert@mines-paristech.fr Mines ParisTech / Curie Institute / Inserm Statistical Genomics in Biomedical
More informationDesigning Complex Omics Experiments
Designing Complex Omics Experiments Xiangqin Cui Section on Statistical Genetics Department of Biostatistics University of Alabama at Birmingham 6/15/2015 Some slides are from previous lectures given by
More informationBusiness Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee
Business Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 02 Data Mining Process Welcome to the lecture 2 of
More informationMachine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University
Machine learning applications in genomics: practical issues & challenges Yuzhen Ye School of Informatics and Computing, Indiana University Reference Machine learning applications in genetics and genomics
More informationGenomics and Proteomics *
OpenStax-CNX module: m44558 1 Genomics and Proteomics * OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 By the end of this section, you will
More informationBIORISE D6.1 SYLLABUS OF THE CSMM BIOINFORMATICS COURSE
Ref. Ares(2016)1538926-31/03/2016 BIORISE D6.1 SYLLABUS OF THE CSMM BIOINFORMATICS COURSE The attached syllabus has been prepared by the CING Bioinformatics ERA Chair, Dr George Spyrou. The proposed syllabus
More information1/27 MLR & KNN M48, MLR and KNN, More Simple Generalities Handout, KJ Ch5&Sec. 6.2&7.4, JWHT Sec. 3.5&6.1, HTF Sec. 2.3
Stat 502X Details-2016 N=Stat Learning Notes, M=Module/Stat Learning Slides, KJ=Kuhn and Johnson, JWHT=James, Witten, Hastie, and Tibshirani, HTF=Hastie, Tibshirani, and Friedman 1 2 3 4 5 1/11 Intro M1,
More informationOur view on cdna chip analysis from engineering informatics standpoint
Our view on cdna chip analysis from engineering informatics standpoint Chonghun Han, Sungwoo Kwon Intelligent Process System Lab Department of Chemical Engineering Pohang University of Science and Technology
More informationMETABOLIC PROFILING AND SIGNATURES IN ALS. Rima Kaddurah-Daouk, Ph.D. Metabolic Profiling: Pathways in Discovery Dec 2-3, 2002 North Carolina
METABOLIC PROFILING AND SIGNATURES IN ALS Rima Kaddurah-Daouk, Ph.D. Metabolic Profiling: Pathways in Discovery Dec 2-3, 2002 North Carolina Metabolomics is the newest Omics science DNA Genomics RNA Transcriptomics
More informationMachine Learning in Computational Biology CSC 2431
Machine Learning in Computational Biology CSC 2431 Lecture 9: Combining biological datasets Instructor: Anna Goldenberg What kind of data integration is there? What kind of data integration is there? SNPs
More informationGenomic Selection with Linear Models and Rank Aggregation
Genomic Selection with Linear Models and Rank Aggregation m.scutari@ucl.ac.uk Genetics Institute March 5th, 2012 Genomic Selection Genomic Selection Genomic Selection: an Overview Genomic selection (GS)
More informationIsolating single cell types from co-culture flow cytometry experiments using automated n-dimensional gating for CAR T-based cancer immunotherapy
Isolating single cell types from co-culture flow cytometry experiments using automated n-dimensional gating for CAR T-based cancer iunotherapy Victor Tieu 1 (vtieu@stanford.edu) 1 Department of Bioengineering,
More informationLearning theory: SLT what is it? Parametric statistics small number of parameters appropriate to small amounts of data
Predictive Genomics, Biology, Medicine Learning theory: SLT what is it? Parametric statistics small number of parameters appropriate to small amounts of data Ex. Find mean m and standard deviation s for
More informationIterated Conditional Modes for Cross-Hybridization Compensation in DNA Microarray Data
http://www.psi.toronto.edu Iterated Conditional Modes for Cross-Hybridization Compensation in DNA Microarray Data Jim C. Huang, Quaid D. Morris, Brendan J. Frey October 06, 2004 PSI TR 2004 031 Iterated
More informationData analysis standards in metabolomics
Data analysis standards in metabolomics Chair: Roy Goodacre (School of Chemistry, University of Manchester). Working group (Group please insert affiliations in same style as above): andrew.craig@cambridgebluegnome.com,
More informationAnalysis of microarray data
BNF078 Fall 2006 Analysis of microarray data Markus Ringnér Computational Biology and Biological Physics Department of Theoretical Physics Lund University markus@thep.lu.se 046-2229337 1 Contents Preface
More informationIn silico prediction of novel therapeutic targets using gene disease association data
In silico prediction of novel therapeutic targets using gene disease association data, PhD, Associate GSK Fellow Scientific Leader, Computational Biology and Stats, Target Sciences GSK Big Data in Medicine
More informationCarry out rule-based statistical analysis
Overview This unit is about carrying out a range of different types of statistical analysis for internal and external clients under supervision. Applicable NOS Unit SSC/ N 2101 Unit Code SSC/ N 2101 Unit
More informationECS 234: Introduction to Computational Functional Genomics ECS 234
: Introduction to Computational Functional Genomics Administrativia Prof. Vladimir Filkov 3023 Kemper filkov@cs.ucdavis.edu Appts: Office Hours: M,W, 3-4pm, and by appt. , 4 credits, CRN: 54135 http://www.cs.ucdavis.edu~/filkov/234/
More informationPredicting the functional consequences of amino acid polymorphisms using hierarchical Bayesian models
Predicting the functional consequences of amino acid polymorphisms using hierarchical Bayesian models C. Verzilli 1,, J. Whittaker 1, N. Stallard 2, D. Chasman 3 1 Dept. of Epidemiology and Public Health,
More informationMulti-SNP Models for Fine-Mapping Studies: Application to an. Kallikrein Region and Prostate Cancer
Multi-SNP Models for Fine-Mapping Studies: Application to an association study of the Kallikrein Region and Prostate Cancer November 11, 2014 Contents Background 1 Background 2 3 4 5 6 Study Motivation
More informationExploring Similarities of Conserved Domains/Motifs
Exploring Similarities of Conserved Domains/Motifs Sotiria Palioura Abstract Traditionally, proteins are represented as amino acid sequences. There are, though, other (potentially more exciting) representations;
More informationHeterogeneity Random and fixed effects
Heterogeneity Random and fixed effects Georgia Salanti University of Ioannina School of Medicine Ioannina Greece gsalanti@cc.uoi.gr georgia.salanti@gmail.com Outline What is heterogeneity? Identifying
More informationPHD THESIS - summary
THE BUCHAREST UNIVERSITY OF ECONOMIC STUDIES Council for Doctoral Studies Management Doctoral School PHD THESIS - summary POSSIBILITIES TO INCREASE THE COMPETITIVENESS OF HEALTH ORGANIZATIONS BY MODERNIZING
More informationCancer Informatics. Comprehensive Evaluation of Composite Gene Features in Cancer Outcome Prediction
Open Access: Full open access to this and thousands of other papers at http://www.la-press.com. Cancer Informatics Supplementary Issue: Computational Advances in Cancer Informatics (B) Comprehensive Evaluation
More informationModel Building Process Part 2: Factor Assumptions
Model Building Process Part 2: Factor Assumptions Authored by: Sarah Burke, PhD 17 July 2018 Revised 6 September 2018 The goal of the STAT COE is to assist in developing rigorous, defensible test strategies
More informationBasic aspects of Microarray Data Analysis
Hospital Universitari Vall d Hebron Institut de Recerca - VHIR Institut d Investigació Sanitària de l Instituto de Salud Carlos III (ISCIII) Basic aspects of Microarray Data Analysis Expression Data Analysis
More informationPéter Antal Ádám Arany Bence Bolgár András Gézsi Gergely Hajós Gábor Hullám Péter Marx András Millinghoffer László Poppe Péter Sárközy BIOINFORMATICS
Péter Antal Ádám Arany Bence Bolgár András Gézsi Gergely Hajós Gábor Hullám Péter Marx András Millinghoffer László Poppe Péter Sárközy BIOINFORMATICS The Bioinformatics book covers new topics in the rapidly
More informationCross-organism prediction of drug hepatotoxicity by sparse group factor analysis
Cross-organism prediction of drug hepatotoxicity by sparse group factor analysis Tommi Suvitaival, Juuso A. Parkkinen, Seppo Virtanen Department of Information and Computer Science, Aalto University {tommi.suvitaival,juuso.parkkinen,seppo.j.virtanen}@aalto.fi
More informationNeed to identify cell subpopulations
Problem Hyperspectral microscopy imaging Our methods Key results Conclusion Acknowledgements Need to identify cell subpopulations Biological cells are very heterogeneous. Significant major subpopulations
More informationAgilent Software Tools for Mass Spectrometry Based Multi-omics Studies
Agilent Software Tools for Mass Spectrometry Based Multi-omics Studies Technical Overview Introduction The central dogma for biological information flow is expressed as a series of chemical conversions
More informationMeasuring gene expression
Measuring gene expression Grundlagen der Bioinformatik SS2018 https://www.youtube.com/watch?v=v8gh404a3gg Agenda Organization Gene expression Background Technologies FISH Nanostring Microarrays RNA-seq
More informationStatistical Tools for Predicting Ancestry from Genetic Data
Statistical Tools for Predicting Ancestry from Genetic Data Timothy Thornton Department of Biostatistics University of Washington March 1, 2015 1 / 33 Basic Genetic Terminology A gene is the most fundamental
More informationENDOGENOUS METABOLIC PROFILING AS A FUNDAMENT FOR PERSONALIZED THERANOSTICS. Torbjörn Lundstedt Professor in Chemometrics C.S.O.
A cure o m i c s ENDOGENOUS METABOLIC PROFILING AS A FUNDAMENT FOR PERSONALIZED THERANOSTICS Torbjörn Lundstedt Professor in Chemometrics C.S.O. AcureOmics AB Thessaloniki, Greece, 19 th of May, 2016 1
More informationIntroduction to Multivariate Procedures (Book Excerpt)
SAS/STAT 9.22 User s Guide Introduction to Multivariate Procedures (Book Excerpt) SAS Documentation This document is an individual chapter from SAS/STAT 9.22 User s Guide. The correct bibliographic citation
More information