Gene expression data analysis in clinical cancer research
Philippe Broët
INSERM U47 and Faculty of Medicine Paris-Sud
broet@vjf.inserm.fr

Summary: In association studies that use transcriptome-oriented biotechnologies, where the aim is to identify genes whose expression changes are correlated with a bio-clinical factor, one of the major problems is identifying genes while accounting for the multiplicity of the comparisons performed. The two main criteria used in multiple comparison procedures are the FWER (Family Wise Error Rate) and the FDR (False Discovery Rate). Numerous procedures currently exist for controlling (or estimating) these error criteria. Nevertheless, these procedures only partially meet the needs of clinical cancer research. In this context we present a method based on Bayesian mixture models that makes it possible to compute the FDR for any set of genes. An example is presented using real breast cancer data.

Keywords: Bayesian mixture model, Clinical research, FDR, Microarray data analysis, Oncology.

1. Introduction

Transcriptome-oriented biotechnologies now allow researchers to measure and compare the expression of thousands of mRNAs in parallel. Typically, these data consist of gene expression measurements made under various experimental or biological conditions, which can provide information on the complex transcriptional activity of the biological system under study (Schena, 2000). In parallel with the rapid development of this genomic technology, research into ways of interpreting the vast and rich body of generated data has become an active area. The interest of biostatisticians in this new challenge is underscored by the increasing number of articles recently published in the scientific literature.
From well-designed experiments, research scientists pose questions related to comparison, prediction and clustering problems. For class comparison, the aim is to select relevant genes based on the relationship between their expression measurements and a response variable. For class prediction, the main interest is in deriving predictors defined from a linear or non-linear combination of gene expression measurements. For class discovery, the major objective is to find new sub-classes of a disease entity that could help future clinical and fundamental research.

For class comparison, research into ways of identifying gene expression changes in microarray experiments while taking false conclusions into account has become an active area. Up to now, statistical procedures have mostly relied on the multiple comparisons framework in order to control false positive conclusions (Hochberg and Tamhane, 1987). In this framework, two quantities have been considered: the familywise error rate (FWER) and the false discovery rate (FDR).

The FWER, which is the oldest criterion considered in multiple comparisons, is defined as the probability of at least one false positive conclusion over all the true null hypotheses (a null hypothesis corresponds to the lack of relationship between a gene expression measurement and a response variable). The most classical methods are the Bonferroni and Šidák methods (Hochberg and Tamhane, 1987). However, as argued by Benjamini and Hochberg (1995), controlling the FWER in multiple testing settings may not always be appropriate. As an alternative and less stringent concept of error control, they introduced the false discovery rate (FDR). The FDR is the expected proportion of erroneously rejected null hypotheses among the rejected ones. The main interest of the FDR is that it is an appealing error criterion which leads to more powerful procedures than those relying on the FWER. Moreover, the FDR seems well suited to genomic and post-genomic biotechnologies, which are mostly used for exploratory data analysis and screening.

Based on this concept, they initially developed a step-up procedure which, under the hypothesis of independence, controls the FDR at a prespecified value (Benjamini and Hochberg, 1995). An extension to the case of dependent tests has also been proposed (Benjamini and Yekutieli, 2001). In the same spirit, seminal work has been done on non-parametric estimation of the FDR, or of the pFDR as defined by Storey (2001) (for some key contributions, see Storey and Tibshirani, 2003; Tusher et al., 2001; Efron et al., 2001). A drawback of these latter procedures is that they only focus on protecting against false positive conclusions.

Author address: 16 Avenue Paul Vaillant Couturier, Villejuif, France.
Acknowledgement: this work was carried out with Sylvia Richardson and Alex Lewin, Department of Public Health, Imperial College, Norfolk Place, London W2 1PG, United Kingdom.
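The three classical procedures named above (Bonferroni, Šidák and the Benjamini-Hochberg step-up) can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function names and the p-values are hypothetical, and the step-up rule is the standard one (reject the k smallest p-values, where k is the largest rank with p(k) ≤ k·q/m).

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """FWER control: reject when p <= alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def sidak_reject(pvalues, alpha=0.05):
    """FWER control under independence: reject when p <= 1 - (1 - alpha)^(1/m)."""
    m = len(pvalues)
    threshold = 1.0 - (1.0 - alpha) ** (1.0 / m)
    return [p <= threshold for p in pvalues]


def benjamini_hochberg_reject(pvalues, q=0.05):
    """FDR control: step-up procedure on the ordered p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0  # largest rank k such that p_(k) <= k * q / m
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for ten genes (illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
n_bonferroni = sum(bonferroni_reject(pvals))
n_sidak = sum(sidak_reject(pvals))
n_bh = sum(benjamini_hochberg_reject(pvals))
```

On these made-up values the two FWER procedures each reject one hypothesis while the step-up procedure rejects two, which illustrates the gain in power that motivates the FDR.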
However, in the exploratory and screening context of most microarray data analyses, investigators may be seriously concerned that such methods do not take false negatives into account and lead to the discarding of too large a proportion of meaningful experimental information. Indeed, a large variation in gene expression does not necessarily translate into a major role in the biological process studied, and vice versa. This is especially true for microarray experiments in oncology, where the top genes (based on p-values or gene statistics) are not necessarily key genes, whereas other interesting genes (related to a biological pathway or a drug target) may exhibit smaller transcriptional variations. In this setting, finite mixture modelling offers a flexible framework (see the numerous illustrations in McLachlan et al., 2000) and allows for inferences obtained from a frequentist or Bayesian approach (for a few examples, see Pan et al., 2003; Broët et al., 2002).

In this work we present a fully Bayesian mixture model that pays particular attention to the modelling of the alternative hypothesis in order to obtain good estimates of the FDR and of its dual quantity, the FNR, as defined by Genovese and Wasserman (2002). Moreover, it allows us to estimate the FDR and FNR for any subset of genes, a feature that cannot be obtained from classical approaches, which only consider monotone rejection regions. We illustrate the approach by reanalyzing a breast cancer dataset (Hedenfalk et al., 2001), where the aim is to select relevant genes in a multi-class response experiment comparing BRCA1-related, BRCA2-related and sporadic cancers.

2. Bayesian mixture modelling approach

2.1 Gene-based statistic

In this subsection, we define a gene-based statistic for multi-class response experiments. In the following, let $X_{ijk}$ denote the measurement from the $i$th gene ($i = 1, \ldots, I$), in the $j$th sample ($j = 1, \ldots, J_k$) belonging to the $k$th class ($k = 1, \ldots, K$), and let $N$ be the total number of samples. The gene-based statistic $D_i$ used in our proposed model-based approach is a transformation of the gene statistic $F_i$, which under $H_0$ (corresponding to truly unmodified expression) follows a Fisher distribution, denoted $F_{K-1}^{N-K}$, with $(K-1)$ and $(N-K)$ degrees of freedom:

$$D_i = \left[\left(1 - \frac{2}{9(N-K)}\right) F_i^{1/3} - \left(1 - \frac{2}{9(K-1)}\right)\right] \left[\frac{2}{9(N-K)}\, F_i^{2/3} + \frac{2}{9(K-1)}\right]^{-1/2}$$

This transformation normalizes the distribution of the $F_i$ (Johnson and Kotz, 1970). Under $H_0$, $D_i$ approximately follows a standard normal distribution, while otherwise $D_i$ has a more complex, non-centred distribution. Note that the non-centred $D_i$ values summarize different gene expression changes across the conditions. Thus, the marginal distribution of $D_i$ is a mixture of distributions related to modified and unmodified gene expression measurements over the different classes.

2.2 Model

Our purpose is to model the mixture distribution of $D_i$ and to estimate, for each gene, the posterior probability of belonging to the null component (representing no difference over the different classes), conditional on the observed data. Our modelling approach assumes that the marginal density of $D_i$ can be written as:

$$f(d_i) = \sum_{g=0}^{G} w_g\, f(\cdot \mid \mu_g, \sigma_g^2)$$

where the $f(\cdot \mid \mu_g, \sigma_g^2)$ are Gaussian densities with unknown parameters $(\mu_g, \sigma_g^2)$ for the $g$th component density in the mixture. The quantities $w_g$ are the mixing proportions, with $0 \le w_g \le 1$ and $\sum_{g=0}^{G} w_g = 1$. Here, we define $g = 0$ to be the unmodified component, which has no expression change over the different conditions. This component has a centred normal distribution. The number of modified components $G$ in the mixture is treated as unknown, since the alternative is expected to have a complex distribution summarizing various patterns of gene expression. The prior distribution for $G$ is a Poisson distribution with parameter $m$, with $m$ chosen small so as to encourage a parsimonious number of fitted components.
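To make the two formulas above concrete, the sketch below computes $D_i$ from an $F$ statistic and evaluates a normal mixture density at that value. It is an illustration under stated assumptions: the function names, the sample size (20), the number of classes (3) and the mixture parameters are all hypothetical, and the components are parameterised by standard deviations rather than variances for convenience.

```python
import math


def normalizing_transform(f_stat, n_samples, n_classes):
    """Map an F statistic with (K - 1, N - K) degrees of freedom to D_i."""
    nu1 = n_classes - 1            # numerator degrees of freedom, K - 1
    nu2 = n_samples - n_classes    # denominator degrees of freedom, N - K
    a1 = 2.0 / (9.0 * nu1)
    a2 = 2.0 / (9.0 * nu2)
    numerator = (1.0 - a2) * f_stat ** (1.0 / 3.0) - (1.0 - a1)
    return numerator / math.sqrt(a2 * f_stat ** (2.0 / 3.0) + a1)


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(d, weights, means, sds):
    """f(d) = sum_g w_g * Normal(d | mu_g, sigma_g^2); component 0 is the null."""
    return sum(w * normal_pdf(d, m, s) for w, m, s in zip(weights, means, sds))


# Hypothetical setting: N = 20 samples, K = 3 classes, one non-null component.
d_value = normalizing_transform(1.0, n_samples=20, n_classes=3)
density = mixture_density(d_value, weights=[0.6, 0.4], means=[0.0, 2.5], sds=[1.0, 1.2])
```

In the model itself the weights, means, variances and the number of components are unknown and are sampled; the fixed values here only show how the density is assembled.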
The mean parameter for the unmodified component, $\mu_0$, was set to 0, and we impose that $\mu_g$ […]. (Remark: under $H_0$ the $F_i$ follow $F_{K-1}^{N-K}$ Fisher distributions, while under the alternative they follow non-central Fisher distributions $F_{K-1}^{N-K}(\eta)$, where $\eta$ is the non-centrality parameter.) The prior distributions specify that the $\mu_g$ ($g \neq 0$), $\sigma_g^2$ and $w_g$ are all drawn independently, with uniform, gamma and Dirichlet priors respectively.

As usual for mixture models, we introduce $L_i$, an unobserved (latent) categorical variable taking the values $0, \ldots, G$ with probabilities $w_0, \ldots, w_G$, respectively (McLachlan et al., 2000). Thus, $L_i \neq 0$ indicates that gene $i$ does not belong to the null component. A joint posterior distribution for all unknowns is formed. Inference is then undertaken by simulating realizations from the resulting posterior distribution using a reversible-jump Metropolis-Hastings algorithm similar to the one used in Broët et al. (2002) and Richardson and Green (1997).

The full output of the Bayesian analysis includes information on the posterior distribution of $G$ as well as our main quantities of interest, the posterior probabilities $p_{0i} = p(L_i = 0 \mid \text{data})$ for each gene. The $p_{0i}$ are estimated within the algorithm as the number of iterations in which $L_i = 0$, divided by the length of the simulation run. Note that these probabilities are integrated over the range of normal mixtures (with different $G$) which are used to fit the marginal density of $D_i$, a unique feature of our model. From these posterior probabilities we can obtain model-based estimates of the observed false discovery and non-discovery rates, conditional on the data.
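The counting estimate of $p_{0i}$, and its use for a gene list, can be sketched as follows. The text does not spell out the FDR and FNR formulas, so the averages used here are an assumption (a common choice): for a selected list $S$, the FDR estimate is the mean of $p_{0i}$ over $S$ and the FNR estimate is the mean of $1 - p_{0i}$ over the genes not in $S$. The function names and the toy allocation draws are hypothetical.

```python
def null_posterior_probabilities(allocation_draws):
    """Estimate p_0i as the fraction of iterations in which L_i == 0.

    allocation_draws[t][i] is the component label L_i sampled at iteration t.
    """
    n_iter = len(allocation_draws)
    n_genes = len(allocation_draws[0])
    return [
        sum(1 for draw in allocation_draws if draw[i] == 0) / n_iter
        for i in range(n_genes)
    ]


def estimated_fdr(p0, selected):
    """Assumed estimate: mean null probability among the selected genes."""
    return sum(p0[i] for i in selected) / len(selected)


def estimated_fnr(p0, selected):
    """Assumed estimate: mean non-null probability among the genes left out."""
    chosen = set(selected)
    rest = [i for i in range(len(p0)) if i not in chosen]
    return sum(1.0 - p0[i] for i in rest) / len(rest)


# Toy sampler output: 4 iterations, 4 genes (labels 0 = null, 1 or 2 = modified).
draws = [
    [0, 1, 0, 2],
    [0, 1, 0, 0],
    [0, 2, 1, 0],
    [0, 1, 0, 1],
]
p0 = null_posterior_probabilities(draws)   # [1.0, 0.0, 0.75, 0.5]
fdr = estimated_fdr(p0, selected=[1, 3])   # 0.25
fnr = estimated_fnr(p0, selected=[1, 3])   # 0.125
```

Because the estimates are plain averages of per-gene probabilities, the selected list does not have to be the top of a ranking; any subset of indices can be passed.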
2.3 The analysis of the Hedenfalk breast cancer dataset

Dataset

We analyzed the publicly available cDNA microarray dataset from the breast cancer study conducted by Hedenfalk et al. (2001). The aim of that study was to examine breast cancer tissues from patients with BRCA1-related cancer, BRCA2-related cancer and sporadic cases of breast cancer in order to determine global gene-expression patterns in these three classes of tumors. The initial dataset consists of gene expression ratios, obtained by dividing the fluorescent intensities from a tumor sample by those from a common reference sample. For each gene, a log-expression ratio was available. Here, we focus on the subset of 471 genes having a nominal denomination (ESTs and unknown genes were excluded). We consider each log-ratio measurement to be the sum of four terms: (i) a gene effect, (ii) a differential effect between the tumor sample and the reference sample co-hybridized on a given array, (iii) a gene × cell line interaction effect, which reflects gene-specific differential expression among the three tumor classes, and (iv) an error term. As the term of interest is the interaction term, we estimate it through a classical analysis of variance model. In practice, row and column effects are subtracted.

Results

The mixture integrated over different numbers of components provides a good semi-parametric fit to the gene-based statistics. This dataset appears to have a large number of differentially expressed genes (the Bayes estimate for the proportion of truly modified genes is 48%). The Bayes rule with the mixture model would give us a list of 995 genes, which is too many for practical purposes. Considering the ordered $p_{0i}$, our method provides FDR estimates for a list of 96 or 384 genes (corresponding to classical 96-well or 384-well plates) of 1.6% and 6.1%, whereas the FNR estimates are 39% and 31.6%, respectively.
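The preprocessing step mentioned in the Dataset paragraph, subtracting row and column effects, can be sketched as a double centring of the genes-by-arrays matrix of log-ratios. This is one plausible reading of that sentence rather than the authors' code; the function name and the toy matrix are hypothetical.

```python
def remove_row_and_column_effects(log_ratios):
    """Return residuals x_ij - row_mean_i - col_mean_j + grand_mean.

    Rows are genes and columns are arrays, so the row means estimate gene
    effects and the column means estimate array effects.
    """
    n_rows = len(log_ratios)
    n_cols = len(log_ratios[0])
    grand_mean = sum(sum(row) for row in log_ratios) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in log_ratios]
    col_means = [
        sum(log_ratios[i][j] for i in range(n_rows)) / n_rows
        for j in range(n_cols)
    ]
    return [
        [log_ratios[i][j] - row_means[i] - col_means[j] + grand_mean
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy matrix that is purely additive (gene effect + array effect):
# the residuals, which carry the interaction and error terms, are all zero.
gene_effects = [1.0, 2.0, 3.0]
array_effects = [0.5, -0.5]
toy = [[g + a for a in array_effects] for g in gene_effects]
residuals = remove_row_and_column_effects(toy)
```

What remains after centring is the part of each measurement attributable to the interaction and error terms, which is the input a per-gene $F$ statistic across tumor classes would then use.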
In contrast, if the investigator is interested in studying a biological function, the FDR and FNR can be obtained from the individual $p_{0i}$. As an example, we consider three subsets of genes based on their known classical biological functions: apoptosis, cyclins and cell cycle regulation, and cytoskeleton. This gave us lists of 6, 1 and 5 genes of interest, respectively. Estimates for the FDR were 85% for apoptosis, 10% for cyclins and cell cycle regulation, and 87% for cytoskeleton. These results suggest that gene expression changes across the three tumor classes are more pronounced for the cyclins and cell cycle regulation pathway than for the other biological functions considered, and may lead the investigator to focus preferentially on genes involved in the cell cycle.

3. Discussion

Our fully Bayesian normal mixture model gives flexibility, since the number of components is treated as an unknown parameter, and it can be considered a parsimonious, semi-parametric representation of a complex mixture density. In this context, a mixture model-based approach such as the one presented here seems well suited to multi-class comparison experiments. The estimates of the FDR and FNR obtained using our mixture model are generally accurate over a range of cases. When there is substantial overlap between truly modified and unmodified gene profiles, the estimates outperform those obtained from classical nonparametric approaches (such as Storey's q-value, 2003). Moreover, our approach gives an estimate of the individual posterior probability that a gene belongs to the null component, integrated over all the possible mixture models. This makes it possible to estimate the FDR and FNR for any subset of genes, a feature that cannot be obtained from classical nonparametric approaches (such as Storey's q-value or SAM, Tusher et al., 2001).
We applied the model to a cDNA microarray dataset from a breast cancer study. When comparing, for example, three subsets of genes defined by their biological functions, our results suggested that transcriptional expression of genes involved in kinase and cell cycle pathways differs between BRCA1, BRCA2 and sporadic tumors. In summary, we think this modelling approach gives an efficient way of obtaining the FDR and FNR and of analyzing subsets of genes that are particularly relevant in clinical cancer research.

References

Benjamini, Y., Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57.
Benjamini, Y., Yekutieli, D. (2001) The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 29.
Broët, P., Richardson, S., Radvanyi, F. (2002) Bayesian hierarchical model for identifying changes in gene expression from microarray experiments. J. Comput. Biol., 9.
Efron, B., Tibshirani, R., Storey, J., Tusher, V. (2001) Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association, 96.
Genovese, C., Wasserman, L. (2002) Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society, Series B, 64.
Hedenfalk, I., Duggan, D., Chen, Y. et al. (2001) Gene-expression profiles in hereditary breast cancer. N Engl J Med, 344.
Hochberg, Y., Tamhane, A. (1987) Multiple comparison procedures. Wiley, New York.
Johnson, N.L., Kotz, S. (1970) Continuous univariate distributions. Vol. 2, Wiley, New York.
McLachlan, G., Peel, D. (2000) Finite mixture models. Wiley, New York.
Pan, W., Lin, J., Le, C. (2003) A mixture model approach to detecting differentially expressed genes with microarray data. Funct Integr Genomics, 3, 117-24.
Richardson, S., Green, P.J. (1997) On Bayesian analysis of mixtures with an unknown number of components. J. R. Statist. Soc. B, 59.
Schena, M. (2000) Microarray Biochip Technology. Eaton.
Storey, J.D. (2001) A direct approach to false discovery rates. Journal of the Royal Statistical Society, Series B, 64.
Storey, J.D., Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA, 100.
QVALUE: The manual. jstorey/qvalue/manual.pdf
Storey, J.D. (2003) The positive false discovery rate: a Bayesian interpretation and the q-value. Annals of Statistics, 31, 2013-2035.
Tusher, V., Tibshirani, R., Chu, G. (2001) Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl Acad. Sci. USA, 98.
Introduction to Microarray Technique, Data Analysis, Databases Maryam Abedi PhD student of Medical Genetics abedi777@ymail.com Outlines Technology Basic concepts Data analysis Printed Microarrays In Situ-Synthesized
More informationMicroarray Experiment Design
Microarray Experiment Design Samples used, extract preparation and labelling: AML blasts were isolated from bone marrow by centrifugation on a Ficoll- Hypaque gradient. Total RNA was extracted using TRIzol
More informationSome observations on experimental design of microarray experiments
16 Technical Paper Some observations on experimental design of microarray experiments Lara Lusa Abstract. Gene-expression microarrays measure simultaneously the expression of thousands of genes and are
More informationIntroduction to Microarray Analysis
Introduction to Microarray Analysis Methods Course: Gene Expression Data Analysis -Day One Rainer Spang Microarrays Highly parallel measurement devices for gene expression levels 1. How does the microarray
More informationA comparison of methods for differential expression analysis of RNA-seq data
Soneson and Delorenzi BMC Bioinformatics 213, 14:91 RESEARCH ARTICLE A comparison of methods for differential expression analysis of RNA-seq data Charlotte Soneson 1* and Mauro Delorenzi 1,2 Open Access
More informationStatistical signal detection in Clinical Trial data
Statistical signal detection in Clinical Trial data Andreas Brueckner Christiane Ahlers, Anngret Mallick, Nils Opitz, Vlasta Pinkston, Bruno Tran, Janet Scott, Harry Southworth, Bruno Tran, Lionel Van
More informationExperimental Design for Gene Expression Microarray. Jing Yi 18 Nov, 2002
Experimental Design for Gene Expression Microarray Jing Yi 18 Nov, 2002 Human Genome Project The HGP continued emphasis is on obtaining by 2003 a complete and highly accurate reference sequence(1 error
More informationMicroarray Informatics
Microarray Informatics Donald Dunbar MSc Seminar 31 st January 2007 Aims To give a biologist s view of microarray experiments To explain the technologies involved To describe typical microarray experiments
More informationNing Tang ALL RIGHTS RESERVED
2014 Ning Tang ALL RIGHTS RESERVED ROBUST GENE SET ANALYSIS AND ROBUST GENE EXPRESSION By NING TANG A dissertation submitted to the Graduate School New Brunswick Rutgers, The State University of New Jersey
More informationModeling & Simulation in pharmacogenetics/personalised medicine
Modeling & Simulation in pharmacogenetics/personalised medicine Julie Bertrand MRC research fellow UCL Genetics Institute 07 September, 2012 jbertrand@uclacuk WCOP 07/09/12 1 / 20 Pharmacogenetics Study
More informationEvaluating Diagnostic Tests in the Absence of a Gold Standard
Evaluating Diagnostic Tests in the Absence of a Gold Standard Nandini Dendukuri Departments of Medicine & Epidemiology, Biostatistics and Occupational Health, McGill University; Technology Assessment Unit,
More informationA STUDY ON STATISTICAL BASED FEATURE SELECTION METHODS FOR CLASSIFICATION OF GENE MICROARRAY DATASET
A STUDY ON STATISTICAL BASED FEATURE SELECTION METHODS FOR CLASSIFICATION OF GENE MICROARRAY DATASET 1 J.JEYACHIDRA, M.PUNITHAVALLI, 1 Research Scholar, Department of Computer Science and Applications,
More informationEvaluation of Some Statistical Methods for the Identification of Differentially Expressed Genes
Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 3-24-2015 Evaluation of Some Statistical Methods for the Identification of Differentially
More informationMicroarray data analysis: from disarray to consolidation and consensus
Microarray data analysis: from disarray to consolidation and consensus David B. Allison*, Xiangqin Cui*, Grier P. Page* and Mahyar Sabripour* Abstract In just a few years, microarrays have gone from obscurity
More informationPage 78
A Case Study for Radiation Therapy Dose Finding Utilizing Bayesian Sequential Trial Design Author s Details: (1) Fuyu Song and (2)(3) Shein-Chung Chow 1 Peking University Clinical Research Institute, Peking
More informationIntroduction to ChIP Seq data analyses. Acknowledgement: slides taken from Dr. H
Introduction to ChIP Seq data analyses Acknowledgement: slides taken from Dr. H Wu @Emory ChIP seq: Chromatin ImmunoPrecipitation it ti + sequencing Same biological motivation as ChIP chip: measure specific
More informationComparative analysis of RNA-Seq data with DESeq2
Comparative analysis of RNA-Seq data with DESeq2 Simon Anders EMBL Heidelberg Two applications of RNA-Seq Discovery find new transcripts find transcript boundaries find splice junctions Comparison Given
More informationData-Adaptive Estimation and Inference in the Analysis of Differential Methylation
Data-Adaptive Estimation and Inference in the Analysis of Differential Methylation for the annual retreat of the Center for Computational Biology, given 18 November 2017 Nima Hejazi Division of Biostatistics
More informationreview Expression Microarrays Tiling genomic microarrays Sequencing methods Riassunto puntate precedenti RNA transcripts
Riassunto puntate precedenti Expression Microarrays Tiling genomic microarrays Sequencing methods RNA transcripts Depend on kind of RNA prep from cells: Total RNA Poly(A) + fraction Long RNA Small RNA.bound
More informationMachine Learning in Computational Biology CSC 2431
Machine Learning in Computational Biology CSC 2431 Lecture 9: Combining biological datasets Instructor: Anna Goldenberg What kind of data integration is there? What kind of data integration is there? SNPs
More informationCOS 597c: Topics in Computational Molecular Biology. DNA arrays. Background
COS 597c: Topics in Computational Molecular Biology Lecture 19a: December 1, 1999 Lecturer: Robert Phillips Scribe: Robert Osada DNA arrays Before exploring the details of DNA chips, let s take a step
More informationNon-parametric optimal design in dose finding studies
Biostatistics (2002), 3, 1,pp. 51 56 Printed in Great Britain Non-parametric optimal design in dose finding studies JOHN O QUIGLEY Department of Mathematics, University of California, San Diego, CA 92093,
More informationMicroarray Informatics
Microarray Informatics Donald Dunbar MSc Seminar 4 th February 2009 Aims To give a biologistʼs view of microarray experiments To explain the technologies involved To describe typical microarray experiments
More informationTime-series microarray data simulation modeled with a case-control label
Time-series microarray data simulation modeled with a case-control label Y.J. Liu and J.Y. Zhang School of Computer Science and Technology, Xidian University, Xi an, China Corresponding author: J.Y. Zhang
More informationFinding molecular signatures from gene expression data: review and a new proposal
Finding molecular signatures from gene expression data: review and a new proposal Ramón Díaz-Uriarte rdiaz@cnio.es http://bioinfo.cnio.es/ rdiaz Unidad de Bioinformática Centro Nacional de Investigaciones
More informationChIP-seq data analysis with Chipster. Eija Korpelainen CSC IT Center for Science, Finland
ChIP-seq data analysis with Chipster Eija Korpelainen CSC IT Center for Science, Finland chipster@csc.fi What will I learn? Short introduction to ChIP-seq Analyzing ChIP-seq data Central concepts Analysis
More informationRecent technology allow production of microarrays composed of 70-mers (essentially a hybrid of the two techniques)
Microarrays and Transcript Profiling Gene expression patterns are traditionally studied using Northern blots (DNA-RNA hybridization assays). This approach involves separation of total or polya + RNA on
More informationIntroduction to Quantitative Genomics / Genetics
Introduction to Quantitative Genomics / Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics September 10, 2008 Jason G. Mezey Outline History and Intuition. Statistical Framework. Current
More informationSIMS2003. Instructors:Rus Yukhananov, Alex Loguinov BWH, Harvard Medical School. Introduction to Microarray Technology.
SIMS2003 Instructors:Rus Yukhananov, Alex Loguinov BWH, Harvard Medical School Introduction to Microarray Technology. Lecture 1 I. EXPERIMENTAL DETAILS II. ARRAY CONSTRUCTION III. IMAGE ANALYSIS Lecture
More informationDetecting Outliers in Exponentiated Pareto Distribution
Journal of Sciences, Islamic Republic of Iran 28(3): 267-272 (207) University of Tehran, ISSN 06-04 http://jsciences.ut.ac.ir Detecting Outliers in Exponentiated Pareto Distribution M. Jabbari Nooghabi
More informationSample Size and Power Calculation for High Order Crossover Designs
Sample Size and Power Calculation for High Order Crossover Designs Roger P. Qu, Ph.D Department of Biostatistics Forest Research Institute, New York, NY, USA 1. Introduction Sample size and power calculation
More informationDNA Microarrays and Computational Analysis of DNA Microarray. Data in Cancer Research
DNA Microarrays and Computational Analysis of DNA Microarray Data in Cancer Research Mario Medvedovic, Jonathan Wiest Abstract 1. Introduction 2. Applications of microarrays 3. Analysis of gene expression
More informationRECENT developments in methods for controlling tested are truly false, the FDR procedure will identify a
Copyright 2003 by the Genetics Society of America Note False Discovery Rate in Linkage and Association Genome Screens for Complex Disorders Chiara Sabatti,*,1 Susan Service and Nelson Freimer *Departments
More informationThe samr Package. R topics documented: October 7, Title SAM: Significance Analysis of Microarrays. Version 1.20
The samr Package October 7, 2005 Title SAM: Significance Analysis of Microarrays Version 1.20 Author R. Tibshirani, G. Chu, T. Hastie, Balasubramanian Narasimhan Description Significance Analysis of Microarrays
More informationBig Data. Methodological issues in using Big Data for Official Statistics
Giulio Barcaroli Istat (barcarol@istat.it) Big Data Effective Processing and Analysis of Very Large and Unstructured data for Official Statistics. Methodological issues in using Big Data for Official Statistics
More informationIntroduction to Bioinformatics. Fabian Hoti 6.10.
Introduction to Bioinformatics Fabian Hoti 6.10. Analysis of Microarray Data Introduction Different types of microarrays Experiment Design Data Normalization Feature selection/extraction Clustering Introduction
More informationThe samr Package. June 7, 2007
The samr Package June 7, 2007 Title SAM: Significance Analysis of Microarrays Version 1.25 Author R. Tibshirani, G. Chu, T. Hastie, Balasubramanian Narasimhan Description Significance Analysis of Microarrays
More informationInherent variation in the reactions, type of enzymes used. Depends on the type of labeling and procedures, as well as the age of the labels.
332 Experimental design, analysis of variance and slide quality assessment in gene expression arrays Sorin Draghici*, Alexander Kuklin, Bruce Hoff & Soheil Shams Address BioDiscovery Inc 11150 West Olympic
More informationA robust statistical procedure to discover expression biomarkers using microarray genomic expression data *
Zou et al. / J Zhejiang Univ SCIENCE B 006 7(8):603-607 603 Journal of Zhejiang University SCIENCE B ISSN 1673-1581 (Print); ISSN 186-1783 (Online) www.zju.edu.cn/jzus; www.springerlink.com E-mail: jzus@zju.edu.cn
More informationSTATISTICAL ANALYSIS OF 70-MER OLIGONUCLEOTIDE MICROARRAY DATA FROM POLYPLOID EXPERIMENTS USING REPEATED DYE-SWAPS
STATISTICAL ANALYSIS OF 7-MER OLIGONUCLEOTIDE MICROARRAY DATA FROM POLYPLOID EXPERIMENTS USING REPEATED DYE-SWAPS Hongmei Jiang 1, Jianlin Wang, Lu Tian, Z. Jeffrey Chen, and R.W. Doerge 1 1 Department
More informationCS 5984: Application of Basic Clustering Algorithms to Find Expression Modules in Cancer
CS 5984: Application of Basic Clustering Algorithms to Find Expression Modules in Cancer T. M. Murali January 31, 2006 Innovative Application of Hierarchical Clustering A module map showing conditional
More information