Integration of heterogeneous omics data

1 Integration of heterogeneous omics data
Andrea Rau
March 11, 2016
Formation doctorale: Biologie expérimentale animale et modélisation prédictive (doctoral training course: experimental animal biology and predictive modeling)

2 Outline
1 Introduction
  - Integromics
  - Example data: TCGA multi-omics data
2 Descriptive integration with multiple factor analysis
3 Clustering integration with iCluster+
4 Discussion
Contact: andrea.rau@jouy.inra.fr

3 Integrative data analysis

4 Integrative omics data analysis ("integromics")
- Public genome databases like NCBI already house petabytes (10^6 GB) of data, and are growing exponentially each year
- It is increasingly difficult to extract full value from massive omics data in a unified and meaningful way:
  - Gene expression (RNA-seq, microarrays)
  - Protein expression
  - Methylation
  - Metabolome
  - Copy number variants
  - Genomic mutations
  - Functional annotations
  - Gene pathway membership
  - Protein-protein interactions
  - High-throughput phenotypic information
- Focusing on a single platform runs the risk of missing an obvious signal!

5 A relatively new phenomenon

6 The broad umbrella of integrative data analysis
Ultimate goal: understanding complex processes
Lots of different meanings:
- Exploration
- Description
- Classification (supervised, unsupervised, semi-supervised)
- Variable selection / biomarker identification
- Phenotype prediction
- Meta-analysis
- ...

7 Integrative multi-omics analysis: What? Why?
1 Exploration
  - Multiple Factor Analysis (MFA)
  - Regularized Canonical Correlation
  - Multiple co-inertia analysis
2 Classification
  - Clustering (iClusterPlus)
3 Prediction
  - Integrative Lasso with Penalty Factors (IPF-Lasso)
  - Multi-group partial least squares
  - Penalized linear discriminant analysis

8 ... with lots of statistical and practical difficulties!
- Missing or incomplete data
- Potentially heterogeneous quality across datasets
- Need for normalization / standardization / preprocessing (???)
- Many (!!) more variables than observations (ultra-high dimensionality)
- Multiple testing
- Datasets of differing sizes
- Potentially large requirements for data storage and computing power
- ... and of course, biological interpretation!

9 Introduction to the TCGA (The Cancer Genome Atlas) data
- Comprehensive and coordinated effort to improve the molecular understanding of major types and sub-types of cancer through high-throughput genomics
- Clinical information + genomic characterization data + high-level sequence analysis of tumor genomes
- 34 cancer types/sub-types
- Open-access tier (public data not unique to individuals) and controlled-access tier (primary sequence data, raw SNP data)

10 TCGA data (matched/unmatched tumor/normal samples)
- Clinical (demographic, treatment, survival information)
- miRNA sequencing
- Protein expression
- mRNA sequencing
- DNA methylation
- Copy number variants
- Somatic mutations
- Biospecimen data
- Diagnostic / tissue / radiological images
- Whole exome / genome sequencing
- Total RNA sequencing
- Array-based expression

11 TCGA breast cancer data
For illustration, we make use of tumor data from 104 patients with breast invasive carcinoma:
- Clinical information: cancer subtype (Basal, Luminal A, Luminal B, HER2-enriched), estrogen / progesterone status, survival time, pathologic stage, race, age, ...
Categories tabulated on the slide:
- Subtype: Basal-like, HER2-enriched, Luminal A, Luminal B
- ER status: Negative, Positive
- PR status: Negative, Positive

12 TCGA breast cancer data
For illustration, we make use of tumor data from 104 patients with breast invasive carcinoma:
- miRNA-seq (Illumina HiSeq): 725 miRNAs
- Normalized protein expression (reverse phase protein arrays): 156 proteins
- RNA-seq (Illumina HiSeq): genes
- Methylation (Infinium HumanMethylation27 BeadChip): genes
- Somatic mutations: 4398 genes
- Copy number alterations: genes

13 Descriptive integration with multiple factor analysis

14 Multi-table analyses
Individuals are described by a set of (possibly related) variables that are structured into several groups.
Several potential goals:
- Identify relationships between tables (inter-structure): canonical correlation
- Identify a consensus (common structure) among tables: multiple factor analysis (Escofier and Pagès, 1997)
Borrow from multivariate methods developed for ecological/survey/chemometrics data

15 Multiple factor analysis (MFA)

16 Multiple factor analysis (MFA)
We seek common structures present in some or all of the data tables:
- Simultaneously deal with tables containing information on the same individuals
- ... but first, groups of variables must be made comparable!
  - Balanced weighting of different groups of variables
  - Differing numbers of variables in each group
  - Type of variables (quantitative, categorical) may differ between groups

17 Multiple factor analysis
Four major steps:
1 Perform principal components analysis (PCA) on each dataset individually
2 Normalize each dataset by dividing its elements by the square root of the first eigenvalue obtained from step 1
3 Merge the normalized data, and perform a global PCA on the merged data
4 Project individual datasets onto the global analysis to analyze commonalities and discrepancies
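Steps 1 to 3 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the ade4 or FactoMineR implementation: the function names are hypothetical, the tables are simulated, variables are centered but not scaled, and the first eigenvalue is approximated by power iteration instead of a full PCA.

```python
import math
import random


def center_columns(table):
    """Subtract each column mean so that PCA operates on centered variables."""
    n = len(table)
    means = [sum(col) / n for col in zip(*table)]
    return [[x - m for x, m in zip(row, means)] for row in table]


def first_eigenvalue(table, iterations=500):
    """Approximate the largest eigenvalue of X'X / n by power iteration."""
    n, p = len(table), len(table[0])
    rng = random.Random(0)  # fixed seed: reproducible starting vector
    v = [rng.uniform(0.5, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in table]  # X v
        cv = [sum(table[i][j] * scores[i] for i in range(n)) / n
              for j in range(p)]                                       # X'X v / n
        eigenvalue = math.sqrt(sum(x * x for x in cv))
        v = [x / eigenvalue for x in cv]  # assumes a non-constant table
    return eigenvalue


def mfa_weight(tables):
    """Steps 1-2: center each table, then divide by sqrt(first eigenvalue)."""
    weighted = []
    for table in tables:
        centered = center_columns(table)
        scale = math.sqrt(first_eigenvalue(centered))
        weighted.append([[x / scale for x in row] for row in centered])
    return weighted


def merge_tables(tables):
    """Step 3 (first half): concatenate the weighted tables column-wise."""
    return [sum((t[i] for t in tables), []) for i in range(len(tables[0]))]


# Two simulated tables on the same 20 individuals, with very different scales.
rng = random.Random(1)
table_a = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(20)]
table_b = [[rng.gauss(0, 10) for _ in range(6)] for _ in range(20)]

weighted = mfa_weight([table_a, table_b])
merged = merge_tables(weighted)
global_lambda = first_eigenvalue(merged)  # first eigenvalue of the global PCA
```

After weighting, every table has a first eigenvalue of 1, so no single table can dominate the first global axis merely because it has more variables or a larger scale; the first global eigenvalue then lies between 1 and the number of tables.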

18 Multiple factor analysis
Superposed graphical representation of partial PCAs

19 Multiple factor analysis for TCGA data, via ade4
Measure of proximity between each data table and the consensus = projected inertia from each table on the first two MFA axes
Notes:
- All MFA graphics courtesy of Denis Laloë
- Pre-processing: log2(x + 1) for RNA-seq and miRNA-seq, arcsin( ) for methylation

20 Multiple factor analysis for TCGA data
Similarity between MFA and individual PCA results

22 Multiple factor analysis for TCGA data
Projection of data tables onto consensus

23 Clustering integration with iCluster+

24 Integrative clustering
- Goal: discover new phenotype subgroups (e.g., cancer subtypes) and their molecular drivers in a comprehensive genetic context
- Jointly model discrete and continuous variables arising from genomic/epigenomic/transcriptomic profiling
- Hypothesis: diverse molecular phenotypes can be predicted by a set of orthogonal latent variables (latent = not directly observable) representing distinct molecular drivers

25 iCluster+ integrative clustering

26 iCluster+ integrative clustering
- Integrates binary (mutation), categorical (copy number gain/normal/loss), continuous or count (gene expression) data
- Generalized linear regression for the joint model, with a common set of latent variables + penalization via lasso terms:
  $$f(X_t) = \beta_t Z + E_t$$
  where $X_t$ is the $p_t \times n$ data matrix for data type $t$, $\beta_t$ the loading matrix, $Z$ the shared $K \times n$ latent variables, and $E_t$ the uncorrelated Gaussian error terms
- Assume $Z_i \sim N(0, I_K)$
- Sparse model obtained via data-specific lasso penalties $\lambda_t$
NOTE: the latent variable model is similar to PCA but better suited to heteroscedastic data

27 A word on sparse methods
- High-dimensional data often contain many variables that are irrelevant for predicting a response / assigning observations to a group
- Including these irrelevant variables in a predictive model leads to a loss in predictive performance
- Sparse methods add an appropriate penalty term to the objective function of the method to suppress these irrelevant variables
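To see how a lasso penalty suppresses variables, consider the simplest textbook case (one coefficient at a time, orthonormal design), where the penalized estimate is the soft-thresholded unpenalized estimate. The sketch below is a generic illustration, not the iCluster+ estimation procedure; the gene names, coefficient values, and penalty are hypothetical.

```python
def soft_threshold(coefficient, penalty):
    """Shrink a coefficient toward zero by `penalty`.

    Coefficients whose absolute value is at most `penalty` become exactly 0,
    which is what makes the fitted model sparse.
    """
    if coefficient > penalty:
        return coefficient - penalty
    if coefficient < -penalty:
        return coefficient + penalty
    return 0.0


# Hypothetical unpenalized loadings for four features.
unpenalized = {"gene_a": 2.5, "gene_b": -0.02, "gene_c": 0.01, "gene_d": -1.2}
penalty = 0.5  # hypothetical penalty value

sparse = {name: soft_threshold(value, penalty)
          for name, value in unpenalized.items()}
selected = [name for name, value in sparse.items() if value != 0.0]
print(selected)
```

Only gene_a and gene_d keep a nonzero coefficient (2.0 and -0.7); the two near-zero loadings are removed from the model entirely. A larger penalty removes more features.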

30 iCluster+ integrative clustering
Let $x_{ijt}$ be the $j$th genomic feature in sample $i$ of data type $t$.
- If $x_{ijt}$ is binary (i.e., mutation status):
  $$\log \frac{P(x_{ijt} = 1 \mid Z_i)}{1 - P(x_{ijt} = 1 \mid Z_i)} = \alpha_{jt} + \beta_{jt} Z_i$$
- If $x_{ijt}$ is categorical (i.e., copy number status: loss/normal/gain):
  $$P(x_{ijt} = c \mid Z_i) = \frac{\exp(\alpha_{jct} + \beta_{jct} Z_i)}{\sum_{\ell=1}^{C} \exp(\alpha_{j\ell t} + \beta_{j\ell t} Z_i)}, \quad c = 1, \dots, C$$
- If $x_{ijt}$ is continuous (i.e., expression):
  $$x_{ijt} = \alpha_{jt} + \beta_{jt} Z_i + \varepsilon_{ijt}, \quad \varepsilon_{ijt} \sim N(0, \sigma^2_{jt})$$
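The three link functions share the same linear predictor α + βZ_i and differ only in how it is mapped to the observed data. A minimal sketch, assuming K = 2 latent variables and hypothetical parameter values (none of these function names belong to the iClusterPlus package):

```python
import math


def linear_predictor(alpha, beta, z):
    """alpha + beta . z for one feature and one sample."""
    return alpha + sum(b * zk for b, zk in zip(beta, z))


def prob_binary(alpha, beta, z):
    """P(x = 1 | z) under the logistic model (e.g., mutation status)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(alpha, beta, z)))


def prob_categorical(alphas, betas, z):
    """P(x = c | z), c = 1..C, under the multinomial model (e.g., copy number)."""
    scores = [linear_predictor(a, b, z) for a, b in zip(alphas, betas)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def mean_continuous(alpha, beta, z):
    """E[x | z] under the Gaussian model (e.g., expression)."""
    return linear_predictor(alpha, beta, z)


z_i = [0.5, -1.0]  # hypothetical latent variables for one sample

p_mutation = prob_binary(-1.0, [2.0, 0.0], z_i)
p_copy_number = prob_categorical(
    [0.0, 0.0, 0.0],                         # loss, normal, gain intercepts
    [[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]],   # loss, normal, gain loadings
    z_i,
)
expected_expression = mean_continuous(3.0, [1.0, 2.0], z_i)
```

Here the binary linear predictor is -1.0 + 2.0 × 0.5 = 0, so the mutation probability is 0.5; the copy number probabilities sum to 1 and increase from loss to gain because the first latent variable is positive; and the expected expression is 3.0 + 0.5 - 2.0 = 1.5.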

31 iClusterPlus Bioconductor package
- Estimation via a modified Monte Carlo Newton-Raphson algorithm
- Optimization of the number of latent variables K (deviance ratio) and lasso penalty terms λ_t (BIC) needed ...

33 Preparing data for integrative clustering
Note: for now, only 4 datasets may be integrated in iCluster+.
- Somatic mutation data: keep genes that have mutations in at least 2% of the samples
- RNA-seq data: keep the 1000 most variable genes (i.e., those with the largest coefficient of variance), and center data for each individual
- CNA data: keep the 1000 most variable genes (i.e., those with the largest coefficient of variance), set all values between and 0.25 equal to 0
- Protein data: keep all values
- Set K = 4 latent variables (equal to the number of cancer subtypes), use default values of lasso penalty parameters (λ_t = 0.03 for all t)
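The filtering rules above can be sketched on toy data as follows. Several choices here are assumptions rather than the original analysis: all function names are hypothetical, variability is measured by the plain variance, "center data for each individual" is read as subtracting each sample's mean over the retained genes, and the lower bound of the copy-number band is not given in the text above, so `lower` is a required argument and the value used in the example is purely illustrative.

```python
import statistics


def filter_mutations(mutations, min_fraction=0.02):
    """Keep genes mutated in at least `min_fraction` of samples (0/1 lists)."""
    return {gene: calls for gene, calls in mutations.items()
            if sum(calls) / len(calls) >= min_fraction}


def top_variable(features, n_keep, score=statistics.pvariance):
    """Keep the `n_keep` features with the largest variability score."""
    ranked = sorted(features, key=lambda g: score(features[g]), reverse=True)
    return {gene: features[gene] for gene in ranked[:n_keep]}


def center_per_sample(features):
    """Subtract each sample's mean over the retained genes (assumed reading)."""
    genes = list(features)
    n_samples = len(features[genes[0]])
    means = [statistics.fmean([features[g][i] for g in genes])
             for i in range(n_samples)]
    return {g: [x - m for x, m in zip(features[g], means)] for g in genes}


def zero_small_cna(values, lower, upper=0.25):
    """Set copy-number values inside [lower, upper] to 0."""
    return [0.0 if lower <= v <= upper else v for v in values]


# Toy data: 100 samples for mutations, 3 samples for expression.
mutations = {
    "TP53": [1] * 30 + [0] * 70,   # 30% of samples: kept
    "GATA3": [1] * 2 + [0] * 98,   # exactly 2%: kept
    "RARE1": [1] * 1 + [0] * 99,   # 1%: dropped
}
expression = {"g1": [1.0, 5.0, 9.0], "g2": [4.0, 4.0, 4.0], "g3": [2.0, 3.0, 4.0]}

kept_mutations = filter_mutations(mutations)
kept_expression = center_per_sample(top_variable(expression, n_keep=2))
cna = zero_small_cna([-0.8, -0.1, 0.0, 0.2, 0.25, 0.6], lower=-0.2)  # illustrative lower bound
```

On real data `n_keep` would be 1000, as on the slide; it is 2 here only because the toy expression table has three genes.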

35 iCluster+ results (K = 4 latent variables)

36 iCluster+ results (K = 4 latent variables)
Top features based on lasso penalized coefficients for each data type:
    $mutation
    "CDH1" "GATA3" "PCDH15" "PIK3CA" "RYR1" "TP53" ...
    $protein
    "4E-BP1-R-V" "Akt_pS473-R-V" "AR-R-V" "Bcl-2-M-V" "Bim-R-V" "c-kit-r-v" ...
    $rna
    "A2ML " "ABCA " "ABCC8 6833" "ADCY1 107" "ADH1B 125" "ADIPOQ 9370" ...
    $tumor
    "ASIC1" "ACVR1B" "ACVRL1" "APOF" "AQP2" "AQP5" ...

    > lapply(sigfeatures, length)
    $mutation
    [1] 46
    $protein
    [1] 39
    $rna
    [1] 250
    $tumor
    [1] 247

37 Discussion

38 Two major integrative strategies
Description
- Variable symmetry
- No matrix inversions
- Multi-table analysis through MFA (supervised analysis possible between groups)
Explanation / Prediction
- Asymmetry of variables: one group explains another group
- Matrix inversion
- Collinearity
- n < p and matrix ranks
- Regularization procedures needed
- Clustering via iCluster+, supervised (discriminant) analysis via predictive methods like IPF-Lasso

39 Discussion
Integrative predictive/explanatory methods like iClusterPlus seem very promising for integrative omics analysis, but data preprocessing/model tuning is often needed (and not straightforward to perform):
- Choice of number of latent variables K
- Choice of lasso penalty terms
- Influence of pre-processing steps on results
- ...

40 Discussion
Integrative approaches can (should?) account for the intrinsic structures of biological relationships from different high-throughput platforms

41 R/Bioconductor packages
- ade4: Multiple factor analysis, multiple co-inertia analysis, STATIS
- FactoMineR: Multiple factor analysis
- mixOmics: Correlation analysis, partial least squares
- iClusterPlus
Thank you!

42 Some references
- Meng, C., et al. (2014). A multivariate approach to the integration of multi-omics datasets. BMC Bioinformatics, 15:162.
- de Tayrac, M., et al. (2009). Simultaneous analysis of distinct Omics data sets with integration of biological knowledge: Multiple Factor Analysis approach. BMC Genomics, 10(1), 32.
- Culhane, A. C., et al. (2005). MADE4: an R package for multivariate analysis of gene expression data. Bioinformatics, 21(11).
- Dray, S., and Dufour, A.-B. (2007). The ade4 package: implementing the duality diagram for ecologists. Journal of Statistical Software, 22(4).
- Escofier, B., and Pagès, J. (1998). Analyses factorielles simples et multiples. Dunod.
- Lebart, L., Piron, M., and Morineau, A. (2006). Statistique exploratoire multidimensionnelle. Dunod.
- Lê Cao, K. A., et al. (2008). A sparse PLS for variable selection when integrating omics data. Statistical Applications in Genetics and Molecular Biology, 7(1).
- Salmi, B., et al. (2010). Multivariate analysis to compare pig meat quality traits according to breed and rearing system. Proceedings of the 9th WCGALP, Leipzig, August 1-6, 2010, 442.
- Tenenhaus, A., and Tenenhaus, M. (2014). Regularized generalized canonical correlation analysis for multiblock or multigroup data analysis. European Journal of Operational Research, 238(2).


More information

Statistical Methods for Network Analysis of Biological Data

Statistical Methods for Network Analysis of Biological Data The Protein Interaction Workshop, 8 12 June 2015, IMS Statistical Methods for Network Analysis of Biological Data Minghua Deng, dengmh@pku.edu.cn School of Mathematical Sciences Center for Quantitative

More information

Data-Adaptive Estimation and Inference in the Analysis of Differential Methylation

Data-Adaptive Estimation and Inference in the Analysis of Differential Methylation Data-Adaptive Estimation and Inference in the Analysis of Differential Methylation for the annual retreat of the Center for Computational Biology, given 18 November 2017 Nima Hejazi Division of Biostatistics

More information

Data Mining for Biological Data Analysis

Data Mining for Biological Data Analysis Data Mining for Biological Data Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Data Mining Course by Gregory-Platesky Shapiro available at www.kdnuggets.com Jiawei Han

More information

Ecological genomics and molecular adaptation: state of the Union and some research goals for the near future.

Ecological genomics and molecular adaptation: state of the Union and some research goals for the near future. Ecological genomics and molecular adaptation: state of the Union and some research goals for the near future. Louis Bernatchez Genomics and Conservation of Aquatic Resources Université LAVAL! Molecular

More information

First steps in signal-processing level models of genetic networks: identifying response pathways and clusters of coexpressed genes

First steps in signal-processing level models of genetic networks: identifying response pathways and clusters of coexpressed genes First steps in signal-processing level models of genetic networks: identifying response pathways and clusters of coexpressed genes Olga Troyanskaya lecture for cheme537/cs554 some slides borrowed from

More information

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology. G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY Methods or systems for genetic

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 5, Issue 1 2006 Article 16 Reader s reaction to Dimension Reduction for Classification with Gene Expression Microarray Data by Dai et al

More information

Characterization of Allele-Specific Copy Number in Tumor Genomes

Characterization of Allele-Specific Copy Number in Tumor Genomes Characterization of Allele-Specific Copy Number in Tumor Genomes Hao Chen 2 Haipeng Xing 1 Nancy R. Zhang 2 1 Department of Statistics Stonybrook University of New York 2 Department of Statistics Stanford

More information

Computational Challenges of Medical Genomics

Computational Challenges of Medical Genomics Talk at the VSC User Workshop Neusiedl am See, 27 February 2012 [cbock@cemm.oeaw.ac.at] http://medical-epigenomics.org (lab) http://www.cemm.oeaw.ac.at (institute) Introducing myself to Vienna s scientific

More information

Welcome to the NGS webinar series

Welcome to the NGS webinar series Welcome to the NGS webinar series Webinar 1 NGS: Introduction to technology, and applications NGS Technology Webinar 2 Targeted NGS for Cancer Research NGS in cancer Webinar 3 NGS: Data analysis for genetic

More information

Functional genomics + Data mining

Functional genomics + Data mining Functional genomics + Data mining BIO337 Systems Biology / Bioinformatics Spring 2014 Edward Marcotte, Univ of Texas at Austin Edward Marcotte/Univ of Texas/BIO337/Spring 2014 Functional genomics + Data

More information

Basics of RNA-Seq. (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly, PhD Team Lead, NCI Single Cell Analysis Facility

Basics of RNA-Seq. (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly, PhD Team Lead, NCI Single Cell Analysis Facility 2018 ABRF Meeting Satellite Workshop 4 Bridging the Gap: Isolation to Translation (Single Cell RNA-Seq) Sunday, April 22 Basics of RNA-Seq (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly,

More information

Introduction to Bioinformatics and Gene Expression Technology

Introduction to Bioinformatics and Gene Expression Technology Vocabulary Introduction to Bioinformatics and Gene Expression Technology Utah State University Spring 2014 STAT 5570: Statistical Bioinformatics Notes 1.1 Gene: Genetics: Genome: Genomics: hereditary DNA

More information

IPA Advanced Training Course

IPA Advanced Training Course IPA Advanced Training Course Academia Sinica 2015 Oct Gene( 陳冠文 ) Supervisor and IPA certified analyst 1 Review for Introductory Training course Searching Building a Pathway Editing a Pathway for Publication

More information

Study on the Application of Data Mining in Bioinformatics. Mingyang Yuan

Study on the Application of Data Mining in Bioinformatics. Mingyang Yuan International Conference on Mechatronics Engineering and Information Technology (ICMEIT 2016) Study on the Application of Mining in Bioinformatics Mingyang Yuan School of Science and Liberal Arts, New

More information

Bioinformatics. Outline of lecture

Bioinformatics. Outline of lecture Bioinformatics Uma Chandran, MSIS, PhD Department of Biomedical Informatics University of Pittsburgh chandran@pitt.edu 412 648 9326 07/08/2014 Outline of lecture What is Bioinformatics? Examples of bioinformatics

More information

The EORTC Molecular Screening programme SPECTA

The EORTC Molecular Screening programme SPECTA The EORTC Molecular Screening programme SPECTA February 2016 Denis Lacombe, MD, MSc EORTC, Director General Brussels, Belgium The changing shape of clinical research Phase I RESOURCES Phase III The changing

More information

Cancer Genetics Solutions

Cancer Genetics Solutions Cancer Genetics Solutions Cancer Genetics Solutions Pushing the Boundaries in Cancer Genetics Cancer is a formidable foe that presents significant challenges. The complexity of this disease can be daunting

More information

BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology. Lecture 2: Microarray analysis

BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology. Lecture 2: Microarray analysis BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology Lecture 2: Microarray analysis Genome wide measurement of gene transcription using DNA microarray Bruce Alberts, et al., Molecular Biology

More information

Machine Learning. HMM applications in computational biology

Machine Learning. HMM applications in computational biology 10-601 Machine Learning HMM applications in computational biology Central dogma DNA CCTGAGCCAACTATTGATGAA transcription mrna CCUGAGCCAACUAUUGAUGAA translation Protein PEPTIDE 2 Biological data is rapidly

More information

latestdevelopments relevant for the Ag sector André Eggen Agriculture Segment Manager, Europe

latestdevelopments relevant for the Ag sector André Eggen Agriculture Segment Manager, Europe Overviewof Illumina s latestdevelopments relevant for the Ag sector André Eggen Agriculture Segment Manager, Europe Seminar der Studienrichtung Tierwissenschaften, TÜM, July 1, 2009 Overviewof Illumina

More information

Microarrays & Gene Expression Analysis

Microarrays & Gene Expression Analysis Microarrays & Gene Expression Analysis Contents DNA microarray technique Why measure gene expression Clustering algorithms Relation to Cancer SAGE SBH Sequencing By Hybridization DNA Microarrays 1. Developed

More information

Potential of human genome sequencing. Paul Pharoah Reader in Cancer Epidemiology University of Cambridge

Potential of human genome sequencing. Paul Pharoah Reader in Cancer Epidemiology University of Cambridge Potential of human genome sequencing Paul Pharoah Reader in Cancer Epidemiology University of Cambridge Key considerations Strength of association Exposure Genetic model Outcome Quantitative trait Binary

More information

Bioinformatics Advice on Experimental Design

Bioinformatics Advice on Experimental Design Bioinformatics Advice on Experimental Design Where do I start? Please refer to the following guide to better plan your experiments for good statistical analysis, best suited for your research needs. Statistics

More information

Lees J.A., Vehkala M. et al., 2016 In Review

Lees J.A., Vehkala M. et al., 2016 In Review Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes Lees J.A., Vehkala M. et al., 2016 In Review Journal Club Triinu Kõressaar 16.03.2016 Introduction Bacterial

More information

Analytics Behind Genomic Testing

Analytics Behind Genomic Testing A Quick Guide to the Analytics Behind Genomic Testing Elaine Gee, PhD Director, Bioinformatics ARUP Laboratories 1 Learning Objectives Catalogue various types of bioinformatics analyses that support clinical

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics If the 19 th century was the century of chemistry and 20 th century was the century of physic, the 21 st century promises to be the century of biology...professor Dr. Satoru

More information

Feature Selection of Gene Expression Data for Cancer Classification: A Review

Feature Selection of Gene Expression Data for Cancer Classification: A Review Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 50 (2015 ) 52 57 2nd International Symposium on Big Data and Cloud Computing (ISBCC 15) Feature Selection of Gene Expression

More information

Introducing QIAseq. Accelerate your NGS performance through Sample to Insight solutions. Sample to Insight

Introducing QIAseq. Accelerate your NGS performance through Sample to Insight solutions. Sample to Insight Introducing QIAseq Accelerate your NGS performance through Sample to Insight solutions Sample to Insight From Sample to Insight let QIAGEN enhance your NGS-based research High-throughput next-generation

More information

Discriminant models for high-throughput proteomics mass spectrometer data

Discriminant models for high-throughput proteomics mass spectrometer data Proteomics 2003, 3, 1699 1703 DOI 10.1002/pmic.200300518 1699 Short Communication Parul V. Purohit David M. Rocke Center for Image Processing and Integrated Computing, University of California, Davis,

More information

Gene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis

Gene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis Gene expression analysis Biosciences 741: Genomics Fall, 2013 Week 5 Gene expression analysis From EST clusters to spotted cdna microarrays Long vs. short oligonucleotide microarrays vs. RT-PCR Methods

More information

Statistical Inference and Reconstruction of Gene Regulatory Network from Observational Expression Profile

Statistical Inference and Reconstruction of Gene Regulatory Network from Observational Expression Profile Statistical Inference and Reconstruction of Gene Regulatory Network from Observational Expression Profile Prof. Shanthi Mahesh 1, Kavya Sabu 2, Dr. Neha Mangla 3, Jyothi G V 4, Suhas A Bhyratae 5, Keerthana

More information

SEQUENCING. M Ataei, PhD. Feb 2016

SEQUENCING. M Ataei, PhD. Feb 2016 CLINICAL NEXT GENERATION SEQUENCING M Ataei, PhD Tehran Medical Genetics Laboratory Feb 2016 Overview 2 Background NGS in non-invasive prenatal diagnosis (NIPD) 3 Background Background 4 In the 1970s,

More information

Normalization of metabolomics data using multiple internal standards

Normalization of metabolomics data using multiple internal standards Normalization of metabolomics data using multiple internal standards Matej Orešič 1 1 VTT Technical Research Centre of Finland, Tietotie 2, FIN-02044 Espoo, Finland matej.oresic@vtt.fi Abstract. Success

More information

Agilent GeneSpring GX 10: Beyond. Pam Tangvoranuntakul Product Manager, GeneSpring October 1, 2008

Agilent GeneSpring GX 10: Beyond. Pam Tangvoranuntakul Product Manager, GeneSpring October 1, 2008 Agilent GeneSpring GX 10: Gene Expression and Beyond Pam Tangvoranuntakul Product Manager, GeneSpring October 1, 2008 GeneSpring GX 10 in the News Our Goals for GeneSpring GX 10 Goal 1: Bring back GeneSpring

More information

Deep Sequencing technologies

Deep Sequencing technologies Deep Sequencing technologies Gabriela Salinas 30 October 2017 Transcriptome and Genome Analysis Laboratory http://www.uni-bc.gwdg.de/index.php?id=709 Microarray and Deep-Sequencing Core Facility University

More information

Genomic solutions for complex disease

Genomic solutions for complex disease Genomic solutions for complex disease Power your with our genomic solutions Access a breadth of applications. Gain a depth of insights. To enhance their understanding of complex disease, researchers are

More information

Syllabus for BIOS 101, SPRING 2013

Syllabus for BIOS 101, SPRING 2013 Page 1 Syllabus for BIOS 101, SPRING 2013 Name: BIOSTATISTICS 101 for Cancer Researchers Time: March 20 -- May 29 4-5pm in Wednesdays, [except 4/15 (Mon) and 5/7 (Tue)] Location: SRB Auditorium Background

More information

Analysis of RNA-seq Data. Feb 8, 2017 Peikai CHEN (PHD)

Analysis of RNA-seq Data. Feb 8, 2017 Peikai CHEN (PHD) Analysis of RNA-seq Data Feb 8, 2017 Peikai CHEN (PHD) Outline What is RNA-seq? What can RNA-seq do? How is RNA-seq measured? How to process RNA-seq data: the basics How to visualize and diagnose your

More information

Introduction to Microarray Analysis

Introduction to Microarray Analysis Introduction to Microarray Analysis Methods Course: Gene Expression Data Analysis -Day One Rainer Spang Microarrays Highly parallel measurement devices for gene expression levels 1. How does the microarray

More information

Biomarker discovery and high dimensional datasets

Biomarker discovery and high dimensional datasets Biomarker discovery and high dimensional datasets Biomedical Data Science Marco Colombo Lecture 4, 2017/2018 High-dimensional medical data In recent years, the availability of high-dimensional biological

More information

resequencing storage SNP ncrna metagenomics private trio de novo exome ncrna RNA DNA bioinformatics RNA-seq comparative genomics

resequencing storage SNP ncrna metagenomics private trio de novo exome ncrna RNA DNA bioinformatics RNA-seq comparative genomics RNA Sequencing T TM variation genetics validation SNP ncrna metagenomics private trio de novo exome mendelian ChIP-seq RNA DNA bioinformatics custom target high-throughput resequencing storage ncrna comparative

More information

Supplementary Methods

Supplementary Methods Supplemental Information for funtoonorm: An improvement of the funnorm normalization method for methylation data from multiple cell or tissue types. Kathleen Oros Klein et al. Supplementary Methods funtoonorm

More information

Support Vector Machines (SVMs) for the classification of microarray data. Basel Computational Biology Conference, March 2004 Guido Steiner

Support Vector Machines (SVMs) for the classification of microarray data. Basel Computational Biology Conference, March 2004 Guido Steiner Support Vector Machines (SVMs) for the classification of microarray data Basel Computational Biology Conference, March 2004 Guido Steiner Overview Classification problems in machine learning context Complications

More information

Including prior knowledge in shrinkage classifiers for genomic data

Including prior knowledge in shrinkage classifiers for genomic data Including prior knowledge in shrinkage classifiers for genomic data Jean-Philippe Vert Jean-Philippe.Vert@mines-paristech.fr Mines ParisTech / Curie Institute / Inserm Statistical Genomics in Biomedical

More information

Additional file 2. Figure 1: Receiver operating characteristic (ROC) curve using the top

Additional file 2. Figure 1: Receiver operating characteristic (ROC) curve using the top Additional file 2 Figure Legends: Figure 1: Receiver operating characteristic (ROC) curve using the top discriminatory features between HIV-infected (n=32) and HIV-uninfected (n=15) individuals. The top

More information

Sample to Insight. Dr. Bhagyashree S. Birla NGS Field Application Scientist

Sample to Insight. Dr. Bhagyashree S. Birla NGS Field Application Scientist Dr. Bhagyashree S. Birla NGS Field Application Scientist bhagyashree.birla@qiagen.com NGS spans a broad range of applications DNA Applications Human ID Liquid biopsy Biomarker discovery Inherited and somatic

More information

Microarray Informatics

Microarray Informatics Microarray Informatics Donald Dunbar MSc Seminar 4 th February 2009 Aims To give a biologistʼs view of microarray experiments To explain the technologies involved To describe typical microarray experiments

More information

Random matrix analysis for gene co-expression experiments in cancer cells

Random matrix analysis for gene co-expression experiments in cancer cells Random matrix analysis for gene co-expression experiments in cancer cells OIST-iTHES-CTSR 2016 July 9 th, 2016 Ayumi KIKKAWA (MTPU, OIST) Introduction : What is co-expression of genes? There are 20~30k

More information

Integrative Genomics 1a. Introduction

Integrative Genomics 1a. Introduction 2016 Course Outline Integrative Genomics 1a. Introduction ggibson.gt@gmail.com http://www.cig.gatech.edu 1a. Experimental Design and Hypothesis Testing (GG) 1b. Normalization (GG) 2a. RNASeq (MI) 2b. Clustering

More information