Heterogeneity of Variance in Gene Expression Microarray Data

David M. Rocke
Department of Applied Science and Division of Biostatistics
University of California, Davis
March 15, 2003

Abstract

Motivation: One important problem in the analysis of gene expression microarray data is that the variation in expression under constant conditions is not stable from gene to gene. Recently, variance-stabilizing transformations have been developed that can remove the systematic dependence of the variance on the mean, but it appears that there is still considerable variance heterogeneity that can interfere with global analysis of expression data.

Results: We develop a method consisting of a variance-stabilizing data transformation followed by empirical Bayes estimation of gene-specific variances. The method is more powerful than using data from each gene alone, but does not suffer from the bias caused by the use of global error models.

Availability: R code will be available from the author on request or on the author's website.

Contact: dmrocke@ucdavis.edu

1. Introduction

Consider a set of microarray experiments of n arrays, each with p genes. For each gene considered separately, we entertain a statistical model that is linear in a set of factors or variables attached to the arrays, so that the form of the statistical model is common to all genes. Assume that the expression data have been transformed so that the variances neither increase nor decrease systematically with the mean expression of the gene. Given a statistical hypothesis framed within the linear model for each gene, there is almost always an exact or approximate F-test, in which the numerator can be calculated from the cell means of the data for a particular gene, and the denominator (if the test is conducted in isolation for each gene) is a function of the deviations of the data from the cell means (Kerr 2003).

An alternative approach with variance-stabilized data is to obtain the numerator of the test from the particular gene, but obtain the denominator from a global error model (in this case, constant variance). This increases the power of the tests considerably because the variance estimates will be based on thousands of points, not just a few. However, it introduces possible biases if the variances are not truly homogeneous (Kerr 2003; Kerr, Martin, and Churchill 2000). A compromise between power and bias may be obtained by using variance estimates for the denominators of the F-tests that are themselves a compromise between the gene-specific variance and the global variance.

2. A Motivating Example

We consider an experiment in which cell lines in four conditions are to be compared. There are two observations for each of the four conditions, each consisting of an Affymetrix U95A GeneChip. For the sake of illustration, we will consider the MAS 4.0 average difference summary, one main advantage of which is that it does not artificially compress the low-level data.

One goal of the analysis is to determine which genes are differentially expressed among the four conditions. A standard approach, if we consider only one gene, would be to perform a one-way analysis of variance (ANOVA). However, a standard assumption of that analysis is that the variance at the different levels is the same. In the case of microarray data, there is a strong dependence of the variance on the mean, as is shown for these data by Figures 1 and 2, which plot the difference of replicates in each gene-by-group combination against the sum. This type of variability can be removed by the generalized log (glog) transform introduced independently by Durbin et al. (2002), Hawkins (2002), Huber et al. (2002), and Munson (2001), and further developed in Durbin and Rocke (2003a; 2003b), Geller et al. (2003), and Rocke and Durbin (2003a; 2003b). Figure 3 shows the same sum/difference data after transforming by the glog with a parameter of λ = 1225 estimated by maximum likelihood (Durbin and Rocke 2003a).
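The text above does not write out the glog transform itself. As a minimal sketch, assuming the form ln(z + sqrt(z^2 + λ)) that is commonly given for this transform, the following shows how it behaves with the λ = 1225 quoted above. The function name `glog` and the sample intensities are illustrative only.

```python
import math

LAMBDA = 1225.0  # transformation parameter quoted in the text for this data set


def glog(z, lam=LAMBDA):
    """Generalized log, ASSUMED here to be ln(z + sqrt(z^2 + lam)).

    Unlike the ordinary log, it is defined for zero and negative inputs,
    which occur in background-corrected summaries such as average difference.
    """
    return math.log(z + math.sqrt(z * z + lam))


if __name__ == "__main__":
    # Hypothetical intensities: negative, zero, moderate, and large.
    for z in (-20.0, 0.0, 50.0, 5000.0):
        print(f"z = {z:8.1f}   glog(z) = {glog(z):.4f}")
    # For z much larger than sqrt(lam), glog(z) approaches ln(2z).
    print("ln(2 * 5000) =", round(math.log(2 * 5000.0), 4))
```

Under this assumed form, the transform is nearly linear near zero and nearly logarithmic for large intensities, which is what allows it to flatten the mean-variance relationship across the whole range.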

Table 1: Number of genes out of 12,625 significant at the 5% level for three methods of estimating the MSE in a microarray experiment. Column 2 is based on the raw p-values with test-wise error rate (TWER) 5%. Column 3 gives the family-wise error rate (FWER) results using the Bonferroni inequality, and column 4 is the set of genes nominated as significant by the false-discovery-rate (FDR) method of Benjamini and Hochberg (1995; see also Reiner et al. 2003). Entries shown as -- were not preserved in this copy; the two entries shown are those quoted in the text.

MSE Source       TWER    FWER    FDR
Gene-Specific    --      1       18
Global           --      --      --
Posterior        --      --      --

At this point, one could reasonably perform an ANOVA for gene i using the model

$$ z_{ijk} = \beta_j + \epsilon_{jk}, \tag{2.1} $$

where the $z_{ijk}$ are additively normalized, glog-transformed expression values. In this way, we obtain 12,625 F-tests of the null hypothesis of equal expression for all groups, in which we compare the mean square for groups from gene i (MSG_i) to the mean square for error from gene i (MSE_i) by referring the ratio MSG_i/MSE_i to an F distribution with 3 and 4 degrees of freedom. This procedure should be valid, and after an adjustment for multiplicity, the results could be used directly. Figure 4 gives a histogram of the 12,625 p-values, showing that certainly some of them represent real effects. The first line of figures in Table 1 shows that, at the 5% level, 1 gene is significant using the Bonferroni method, and 18 are significant using the FDR method of Benjamini and Hochberg (1995; see also Reiner et al. 2003).

A possible objection to this procedure is that we are losing power by not employing information from other genes. If we employ the perspective of Kerr, Martin, and Churchill (2000), we could estimate the model

$$ z_{ijk} = \mu_i + n_k + \beta_{ij} + \epsilon_{ijk}, \tag{2.2} $$

where the $z_{ijk}$ are glog-transformed (unnormalized) expression values, the normalization is part of the ANOVA (the $n_k$ terms), and the group effects are in the gene-by-group interaction terms $\beta_{ij}$ (Kerr 2003).
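The gene-specific test under model (2.1), followed by the two multiplicity adjustments named in Table 1, can be sketched as below. This is an illustration under stated assumptions and not the author's R code: the function names and toy data are hypothetical, and the p-value uses a closed form for the upper tail of the F distribution that holds only for 3 and 4 degrees of freedom (the design described here), which avoids any dependency outside the standard library.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_b, df_w = k - 1, n - k
    msg = ss_between / df_b   # mean square for groups (MSG)
    mse = ss_within / df_w    # mean square for error (MSE)
    return msg / mse, df_b, df_w


def f_pvalue_3_4(f):
    """Upper-tail p-value for F with 3 and 4 df (closed form for these df only)."""
    w = 3.0 * f / (4.0 + 3.0 * f)
    return 1.0 - 2.5 * w ** 1.5 + 1.5 * w ** 2.5


def bonferroni(pvals, level=0.05):
    """Indices of tests significant after the Bonferroni adjustment."""
    m = len(pvals)
    return [i for i, p in enumerate(pvals) if p <= level / m]


def benjamini_hochberg(pvals, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            cutoff = rank  # largest rank whose p-value passes its threshold
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Hypothetical glog-scale values for one gene: 4 conditions x 2 replicates.
    gene = [[5.1, 5.3], [5.0, 5.2], [6.4, 6.6], [5.2, 5.1]]
    f, df_b, df_w = one_way_f(gene)
    print("F =", round(f, 2), "on", df_b, "and", df_w, "df; p =", f_pvalue_3_4(f))
```

With only 4 error degrees of freedom per gene, the MSE in the denominator is very noisy, which is the loss of power that motivates borrowing information across genes.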
Fitting the global model (2.2) gives another mean square for error that we could use as a denominator, in which case the F-statistics for each gene separately would have 3 and 50,493 df. Figure 5 shows the histogram of the p-values using this method. The excess of very small F-statistics is a sign that the model is incorrect. In this case, the assumption that all genes have the same MSE is almost certainly false. Use of an average MSE, when smaller or larger ones would be more appropriate, will lead to an excess of p-values at both ends. In the second line of figures in Table 1, the number of genes nominated as significant is much greater for each of the three methods than when the gene-specific MSE is used. It is likely that some of these are mistakes, arising when a large true gene-specific MSE is coupled with an average MSE in the denominator instead of an unbiased gene-specific MSE estimate.

The average value over all 12,625 genes of the MSE is [value missing], which is also the residual MSE from the global model. If the 4 df estimates from each gene had the distribution predicted from normality and constant true variance, the variance of these MSE estimates across genes would be $2\sigma^4/\nu = (0.1017)^2/2$ = [value missing]. Instead, it is [value missing], nearly 10 times the size it should be. Of the two simple explanations for this, nonnormality and heterogeneity of variance, the latter is the simpler possibility. We now proceed to account for this situation using a standard empirical Bayes estimate for the individual gene MSE.

3. The Modeling Setup

Given n genes indexed by i, suppose that the true variance of the effect of interest for gene i is $\sigma_i^2$. For each i we obtain a $\nu$ degree-of-freedom estimate $s_i^2$ of $\sigma_i^2$. We will work in the Gaussian framework for convenience, in which case we may assume that $s_i^2$ has a gamma distribution with parameters $\tau$ (the mean) and $a = \nu/2$ (the shape parameter). Again for simplicity, we treat the case where $\nu$ is constant across genes. Though the case where $\nu$ varies is not conceptually more difficult, the computations are more complex.

We model the individual values $\sigma_i^2 = \tau_i$ as random with an inverse gamma distribution with parameters $\alpha$ and $\eta = \alpha\beta$. Note that $\eta$ is the mean of the inverse of $\tau$ (the reciprocal variance $1/\tau$ is sometimes called the precision). With this as a prior distribution, and an observed value $x = s_i^2$, the posterior distribution for $\tau$ is proportional to

$$ \frac{e^{-1/(\tau\beta^*)}}{\tau^{\nu/2+\alpha+1}}, \tag{3.1} $$

where

$$ \beta^* = \frac{2}{x\nu + 2\alpha/\eta}. \tag{3.2} $$

Thus, the posterior distribution is inverse gamma, like the prior, with parameters

$$ \alpha^* = \nu/2 + \alpha, \tag{3.3} $$

$$ \beta^* = \frac{2}{x\nu + 2\alpha/\eta}, \tag{3.4} $$

$$ \eta^* = \alpha^*\beta^* = \frac{\nu + 2\alpha}{x\nu + 2\alpha/\eta}. \tag{3.5} $$

Also,

$$ \frac{1}{\eta^*} = \frac{x\nu + 2\alpha/\eta}{\nu + 2\alpha} \tag{3.6} $$

$$ \phantom{\frac{1}{\eta^*}} = x\left(\frac{\nu}{\nu + 2\alpha}\right) + \frac{1}{\eta}\left(\frac{2\alpha}{\nu + 2\alpha}\right). \tag{3.7} $$
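The step from prior to posterior can be spelled out as follows. This is a sketch under the parametrization stated above: the observed $x = s_i^2$ is gamma with shape $\nu/2$ and mean $\tau$, the prior on $\tau$ is inverse gamma with parameters $\alpha$ and $\beta = \eta/\alpha$, and starred symbols denote the posterior parameters.

$$
\begin{aligned}
p(x \mid \tau) &\propto \tau^{-\nu/2} \exp\!\left(-\frac{\nu x}{2\tau}\right), \\
p(\tau) &\propto \tau^{-(\alpha+1)} \exp\!\left(-\frac{1}{\tau\beta}\right), \qquad \frac{1}{\beta} = \frac{\alpha}{\eta}, \\
p(\tau \mid x) &\propto p(x \mid \tau)\,p(\tau) \propto \tau^{-(\nu/2+\alpha+1)} \exp\!\left\{-\frac{1}{\tau}\left(\frac{\nu x}{2} + \frac{\alpha}{\eta}\right)\right\}.
\end{aligned}
$$

Matching the exponent to the inverse gamma form $e^{-1/(\tau\beta^*)}$ gives $1/\beta^* = (x\nu + 2\alpha/\eta)/2$, and matching the power of $\tau$ gives $\alpha^* + 1 = \nu/2 + \alpha + 1$, so that $\alpha^* = \nu/2 + \alpha$.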

Now x here is an observed value of $s_i^2$, and $1/\eta$ is the reciprocal of the mean prior precision, which is thus an estimate of the center of the prior distribution for $\tau_i = \sigma_i^2$. Also, $\nu$ is the degrees of freedom of $s_i^2$ and $2\alpha$ is the equivalent degrees of freedom of the prior. Thus, the posterior estimate of the variance used here will be a weighted average of the individual variance and the prior mean reciprocal precision, each weighted by its degrees of freedom. This method of estimating a variance using an inverse gamma conjugate prior is completely standard (Carlin and Louis 2000; Gelman et al. 1995), and has been used previously in a microarray context by Baldi and Long (2001). The first two references give more detail on the derivation of the posterior in this case.

4. Empirical Estimation of the Prior

To complete the empirical Bayes estimation procedure, we need to specify how we estimate the parameters of the prior from the ensemble of variances. If each observed variance $s_i^2$ has a gamma distribution $F_i$ with parameters $\tau$ and $a = \nu/2$, and if the prior distribution G of $\tau$ is inverse gamma with parameters $\alpha$ and $\beta$, then

$$ E(s_i^2) = \frac{1}{\beta(\alpha-1)}, \qquad V(s_i^2) = \frac{2(\alpha-1)/\nu + 1}{\beta^2(\alpha-1)^2(\alpha-2)}. \tag{4.1} $$

If an ensemble of variances has mean M and variance V, then a method-of-moments estimate of $\alpha$ and $\beta$ is given by solving

$$ M = \frac{1}{\beta(\alpha-1)}, \qquad V = \frac{2(\alpha-1)/\nu + 1}{\beta^2(\alpha-1)^2(\alpha-2)} \tag{4.2} $$

for $\alpha$ and $\beta$. This leads to

$$ \hat\alpha = \frac{M^2(1 - 2/\nu) + 2V}{V - 2M^2/\nu}, \qquad \hat\beta = \frac{1}{M(\hat\alpha - 1)} \tag{4.3} $$

as method-of-moments estimates. If the variances were homogeneous, then we would have $V \approx 2M^2/\nu$. If either the denominator or the numerator of $\hat\alpha$ is negative, that is presumably a sign that there is not an important amount of heterogeneity in the variances. However, usually both will be bounded well away from zero.
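The two steps of the procedure, estimating the prior by (4.3) and then shrinking each gene's MSE toward the prior center as in (3.7), can be sketched as follows. This is a minimal illustration and not the author's R code; the function names are hypothetical, and the demonstration input of 2α = 4.615 is the prior degrees of freedom quoted for the example data set.

```python
def moment_estimates(M, V, nu):
    """Method-of-moments estimates (alpha_hat, beta_hat) from equation (4.3).

    M and V are the mean and variance of the ensemble of gene-specific MSEs,
    and nu is the degrees of freedom of each MSE. Returns None when the
    numerator or denominator is not positive, which the text reads as no
    important heterogeneity of variance.
    """
    numerator = M * M * (1.0 - 2.0 / nu) + 2.0 * V
    denominator = V - 2.0 * M * M / nu
    if numerator <= 0.0 or denominator <= 0.0:
        return None
    alpha_hat = numerator / denominator
    beta_hat = 1.0 / (M * (alpha_hat - 1.0))
    return alpha_hat, beta_hat


def heterogeneity_ratio(M, V, nu):
    """Observed variance of the MSEs relative to 2 M^2 / nu, the value
    expected under normality and a common true variance."""
    return V / (2.0 * M * M / nu)


def posterior_mse(s2, nu, alpha, beta):
    """Posterior variance estimate 1/eta* from equation (3.7): a weighted
    average of the gene-specific MSE and the prior center 1/eta."""
    eta = alpha * beta
    weight = nu / (nu + 2.0 * alpha)   # weight on the gene-specific MSE
    return weight * s2 + (1.0 - weight) / eta


if __name__ == "__main__":
    nu = 4.0
    alpha = 4.615 / 2.0                # prior df of 4.615 quoted in the text
    print("posterior df:", nu + 2.0 * alpha)
    print("weight on gene-specific MSE:", round(nu / (nu + 2.0 * alpha), 3))
```

A ratio from `heterogeneity_ratio` close to 1 corresponds to the homogeneous case, where shrinkage toward a single global value would be appropriate; a ratio well above 1 corresponds to the heterogeneity that the posterior estimate is meant to accommodate.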

5. The Example Continued

For the example data set, the mean of the 12,625 values of the residual MSE is [value missing] and the variance of the same collection is [value missing]. Using (4.3), we obtain

$\hat\alpha$ = [value missing], $\hat\beta$ = [value missing], $\hat\eta$ = [value missing], $1/\hat\eta$ = [value missing].

The degrees of freedom of the prior is $2\hat\alpha = 4.615$, so for each gene i, we obtain an 8.6 df MSE estimate by taking a weighted average of the 4 df MSE from the ANOVA of that gene (with weight 4/8.6) and the prior best estimate (with weight 4.6/8.6). Figure 6 shows the histogram of the p-values obtained by this method, which shows no sign of distortion at the high p-value end.

Comparing the three methods shown in Table 1, we see that the global MSE estimate rejects the most genes, but Figure 5 shows that these rejections cannot be trusted. The posterior best-estimate MSE identifies a much larger number of genes as differentially expressed than the 4 df gene-specific MSEs do, without apparent signs of problems with maintaining the size of the tests.

6. Concluding Remarks

Bayesian and empirical Bayesian methods are frequently proposed for the analysis of microarray data (for example, Baldi and Long 2001; Broët et al. 2002; Efron et al. 2002; Ibrahim et al. 2002; Newton et al. 2001, 2003; Theilhaber et al. 2001). What is proposed here is a sort of minimal empirical Bayesian approach. We do not need to put a prior distribution on the mean expression across genes or on the probability of positive expression, since this is handled by the multiplicity-adjusted F-tests. Our approach resembles most closely the treatment in Baldi and Long (2001). However, their use of the log transform resulted in substantial dependence of the variance on the mean, whereas by use of the glog transform, we have removed at least most of this dependence. This makes the Bayesian model fit the data better than in their case.

We have written code in the R language (Ihaka and Gentleman 1996) that implements many of the required calculations in standard situations. The code will be available from the author on request or on the author's website.

Acknowledgements

The research reported in this paper was supported by grants from the National Science Foundation (ACI and DMS awards) and the National Institute of Environmental Health Sciences, National Institutes of Health (P43 ES04699).

Appendix: The Gamma and Inverse Gamma Distributions

The gamma distribution with parameters $\alpha$ and $\beta$ has density

$$ f_X(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\beta^{\alpha}}. \tag{A.1} $$

The first two moments are given by

$$ E(X) = \alpha\beta = \tau, \tag{A.2} $$

$$ V(X) = \alpha\beta^2 = \tau^2/\alpha. \tag{A.3} $$

The inverse gamma distribution with parameters $\alpha$ and $\beta$ is the distribution of $Y = 1/X$, where X is gamma distributed with parameters $\alpha$ and $\beta$. The density of Y is

$$ f_Y(y) = \frac{e^{-1/(y\beta)}}{\Gamma(\alpha)\beta^{\alpha} y^{\alpha+1}}. \tag{A.4} $$

The first two moments are given by

$$ E(Y) = \frac{1}{\beta(\alpha-1)}, \tag{A.5} $$

$$ V(Y) = \frac{1}{\beta^2(\alpha-1)^2(\alpha-2)}. \tag{A.6} $$

We will re-parametrize in terms of $\alpha$ and $\eta = \alpha\beta$, which is the mean of the reciprocal of the inverse gamma variate. We then have that the density is

$$ f_Y(y) = \frac{e^{-\alpha/(y\eta)}}{\Gamma(\alpha)(\eta/\alpha)^{\alpha} y^{\alpha+1}}. \tag{A.7} $$

The first two moments are given in this parametrization by

$$ E(Y) = \frac{\alpha}{\eta(\alpha-1)}, \tag{A.8} $$

$$ V(Y) = \frac{\alpha^2}{\eta^2(\alpha-1)^2(\alpha-2)}. \tag{A.9} $$

References

Baldi, P. and Long, A.D. (2001) A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inference of gene changes, Bioinformatics, 17.

Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate, Journal of the Royal Statistical Society, Series B, 57.

Broët, P., Richardson, S., and Radvanyi, F. (2002) Bayesian hierarchical model for identifying changes in gene expression from microarray experiments, Journal of Computational Biology, 9.

Carlin, B.P. and Louis, T.A. (2000) Bayes and Empirical Bayes Methods for Data Analysis, Second Edition, New York: Chapman and Hall.

Durbin, B.P., Hardin, J.S., Hawkins, D.M., and Rocke, D.M. (2002) A variance-stabilizing transformation for gene-expression microarray data, Bioinformatics, 18, S105-S110.

Durbin, B. and Rocke, D.M. (2003a) Estimation of transformation parameters for microarray data, Bioinformatics, in press.

Durbin, B. and Rocke, D.M. (2003b) Exact and approximate variance-stabilizing transformations for two-color microarrays, submitted for publication.

Efron, B., Tibshirani, R., Storey, J.D., and Tusher, V. (2002) Empirical Bayes analysis of a microarray experiment, Journal of the American Statistical Association, 96.

Geller, S.C., Gregg, J.P., Hagerman, P.J., and Rocke, D.M. (2003) Transformation and normalization of oligonucleotide microarray data, submitted for publication.

Gelman, A., Carlin, J.B., Stern, H.S., and Rubin, D.B. (1995) Bayesian Data Analysis, New York: Chapman and Hall.

Hawkins, D.M. (2002) Diagnostics for conformity of paired quantitative measurements, Statistics in Medicine, 21.

Holder, D., Raubertas, R.F., Pikounis, V.B., Svetnik, V., and Soper, K. (2001) Statistical analysis of high density oligonucleotide arrays: A SAFER approach, GeneLogic Workshop on Low Level Analysis of Affymetrix GeneChip Data.

Huber, W., von Heydebreck, A., Sültmann, H., Poustka, A., and Vingron, M. (2002) Variance stabilization applied to microarray data calibration and to the quantification of differential expression, Bioinformatics, 18, S96-S104.

Ibrahim, J.G., Chen, M.-H., and Gray, R.J. (2002) Bayesian models for gene expression with microarray data, Journal of the American Statistical Association, 97.

Ihaka, R. and Gentleman, R. (1996) R: A language for data analysis and graphics, Journal of Computational and Graphical Statistics, 5.

Kerr, M.K. (2003) Linear models for microarray data analysis: Hidden similarity and differences, University of Washington Biostatistics Working Paper 190.

Kerr, M.K., Martin, M., and Churchill, G.A. (2000) Analysis of variance for gene expression microarray data, Journal of Computational Biology, 7.

Munson, P. (2001) A consistency test for determining the significance of gene expression changes on replicate samples and two convenient variance-stabilizing transformations, GeneLogic Workshop on Low Level Analysis of Affymetrix GeneChip Data.

Newton, M.A., Kendziorski, C.M., Richmond, C.S., Blattner, F.R., and Tsui, K.W. (2001) On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data, Journal of Computational Biology, 8.

Newton, M.A., Noueiry, A., Sarkar, D., and Ahlquist, P. (2003) Detecting differential gene expression with a semiparametric hierarchical mixture model, manuscript.

Reiner, A., Yekutieli, D., and Benjamini, Y. (2003) Identifying differentially expressed genes using false discovery rate controlling procedures, Bioinformatics, 19.

Rocke, D. and Durbin, B. (2001) A model for measurement error for gene expression arrays, Journal of Computational Biology, 8.

Rocke, D. and Durbin, B. (2003) Approximate variance-stabilizing transformations for gene-expression microarray data, Bioinformatics, in press.

Theilhaber, J., Bushnell, S., Jackson, A., and Fuchs, R. (2001) Bayesian estimation of fold changes in the analysis of gene expression: The PFOLD algorithm, Journal of Computational Biology, 8.

List of Figures

1. Absolute difference in replicates versus the sum for the 12,625 × 4 gene-by-group combinations.
2. Absolute difference in replicates versus the rank of the sum for the 12,625 × 4 gene-by-group combinations.
3. Absolute difference in replicates versus the rank of the sum for the 12,625 × 4 gene-by-group combinations after transformation by the glog with λ = 1225.
4. Histogram of p-values for 12,625 F-tests using gene-specific MSE.
5. Histogram of p-values for 12,625 F-tests using global MSE.
6. Histogram of p-values for 12,625 F-tests using posterior best-estimate MSE.

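[Figure 1. Panel title: Raw Data. Axes: Sum (horizontal), Difference (vertical).]

[Figure 2. Panel title: Raw Data. Axes: Rank of Sum (horizontal), Difference (vertical).]

[Figure 3. Panel title: Glog of Data. Axes: Rank of Sum (horizontal), Difference (vertical).]

[Figure 4. Panel title: Histogram of Gene-Specific p-values. Axes: Raw p-values (horizontal), Frequency (vertical).]

[Figure 5. Panel title: Histogram of Global p-values. Axes: Raw p-values (horizontal), Frequency (vertical).]

[Figure 6. Panel title: Histogram of Posterior p-values. Axes: Raw p-values (horizontal), Frequency (vertical).]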


More information

New Statistical Algorithms for Monitoring Gene Expression on GeneChip Probe Arrays

New Statistical Algorithms for Monitoring Gene Expression on GeneChip Probe Arrays GENE EXPRESSION MONITORING TECHNICAL NOTE New Statistical Algorithms for Monitoring Gene Expression on GeneChip Probe Arrays Introduction Affymetrix has designed new algorithms for monitoring GeneChip

More information

Preprocessing Methods for Two-Color Microarray Data

Preprocessing Methods for Two-Color Microarray Data Preprocessing Methods for Two-Color Microarray Data 1/15/2011 Copyright 2011 Dan Nettleton Preprocessing Steps Background correction Transformation Normalization Summarization 1 2 What is background correction?

More information

Joint Estimation of Calibration and Expression for High-Density Oligonucleotide Arrays

Joint Estimation of Calibration and Expression for High-Density Oligonucleotide Arrays Joint Estimation of Calibration and Expression for High-Density Oligonucleotide Arrays Ann L. Oberg, Douglas W. Mahoney, Karla V. Ballman, Terry M. Therneau Department of Health Sciences Research, Division

More information

Microarray analysis challenges.

Microarray analysis challenges. Microarray analysis challenges. While not quite as bad as my hobby of ice climbing you, need the right equipment! T. F. Smith Bioinformatics Boston Univ. Experimental Design Issues Reference and Controls

More information

Lecture 2: March 8, 2007

Lecture 2: March 8, 2007 Analysis of DNA Chips and Gene Networks Spring Semester, 2007 Lecture 2: March 8, 2007 Lecturer: Rani Elkon Scribe: Yuri Solodkin and Andrey Stolyarenko 1 2.1 Low Level Analysis of Microarrays 2.1.1 Introduction

More information

QTL mapping in mice. Karl W Broman. Department of Biostatistics Johns Hopkins University Baltimore, Maryland, USA.

QTL mapping in mice. Karl W Broman. Department of Biostatistics Johns Hopkins University Baltimore, Maryland, USA. QTL mapping in mice Karl W Broman Department of Biostatistics Johns Hopkins University Baltimore, Maryland, USA www.biostat.jhsph.edu/ kbroman Outline Experiments, data, and goals Models ANOVA at marker

More information

Oligonucleotide microarray data are not normally distributed

Oligonucleotide microarray data are not normally distributed Oligonucleotide microarray data are not normally distributed Johanna Hardin Jason Wilson John Kloke Abstract Novel techniques for analyzing microarray data are constantly being developed. Though many of

More information

Introduction to Bioinformatics. Fabian Hoti 6.10.

Introduction to Bioinformatics. Fabian Hoti 6.10. Introduction to Bioinformatics Fabian Hoti 6.10. Analysis of Microarray Data Introduction Different types of microarrays Experiment Design Data Normalization Feature selection/extraction Clustering Introduction

More information

Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison. CodeLink compatible

Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison. CodeLink compatible Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison CodeLink compatible Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood

More information

EECS730: Introduction to Bioinformatics

EECS730: Introduction to Bioinformatics EECS730: Introduction to Bioinformatics Lecture 14: Microarray Some slides were adapted from Dr. Luke Huan (University of Kansas), Dr. Shaojie Zhang (University of Central Florida), and Dr. Dong Xu and

More information

Comparative analysis of RNA-Seq data with DESeq2

Comparative analysis of RNA-Seq data with DESeq2 Comparative analysis of RNA-Seq data with DESeq2 Simon Anders EMBL Heidelberg Two applications of RNA-Seq Discovery find new transcripts find transcript boundaries find splice junctions Comparison Given

More information

Analysis of Microarray Data

Analysis of Microarray Data Analysis of Microarray Data Lecture 3: Visualization and Functional Analysis George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Review

More information

THREE LEVEL HIERARCHICAL BAYESIAN ESTIMATION IN CONJOINT PROCESS

THREE LEVEL HIERARCHICAL BAYESIAN ESTIMATION IN CONJOINT PROCESS Please cite this article as: Paweł Kopciuszewski, Three level hierarchical Bayesian estimation in conjoint process, Scientific Research of the Institute of Mathematics and Computer Science, 2006, Volume

More information

From CEL files to lists of interesting genes. Rafael A. Irizarry Department of Biostatistics Johns Hopkins University

From CEL files to lists of interesting genes. Rafael A. Irizarry Department of Biostatistics Johns Hopkins University From CEL files to lists of interesting genes Rafael A. Irizarry Department of Biostatistics Johns Hopkins University Contact Information e-mail Personal webpage Department webpage Bioinformatics Program

More information

Some Statistical Issues in Microarray Gene Expression Data

Some Statistical Issues in Microarray Gene Expression Data From the SelectedWorks of Jeffrey S. Morris June, 2006 Some Statistical Issues in Microarray Gene Expression Data Matthew S. Mayo, University of Kansas Medical Center Byron J. Gajewski, University of Kansas

More information

Analysis of Microarray Data

Analysis of Microarray Data Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Introduction

More information

Combining ANOVA and PCA in the analysis of microarray data

Combining ANOVA and PCA in the analysis of microarray data Combining ANOVA and PCA in the analysis of microarray data Lutgarde Buydens IMM, Analytical chemistry Radboud University Nijmegen, the Netherlands Scientific Staff: PhD students: External PhD: Post doc:

More information

Multiple Testing in RNA-Seq experiments

Multiple Testing in RNA-Seq experiments Multiple Testing in RNA-Seq experiments O. Muralidharan et al. 2012. Detecting mutations in mixed sample sequencing data using empirical Bayes. Bernd Klaus Institut für Medizinische Informatik, Statistik

More information

Raking and Selection of Differentially Expressed Genes from Microarray Data

Raking and Selection of Differentially Expressed Genes from Microarray Data Proceedings of the 6 WSEAS International Conference on Mathematical Biology and Ecology, Miami, Florida, USA, January 8-, 6 (pp4-45) Raking and Selection of Differentially Expressed Genes from Microarray

More information

Statistical Methods in Bioinformatics

Statistical Methods in Bioinformatics Statistical Methods in Bioinformatics CS 594/680 Arnold M. Saxton Department of Animal Science UT Institute of Agriculture Bioinformatics: Interaction of Biology/Genetics/Evolution/Genomics Computer Science/Algorithms/Database

More information

Gene expression data analysis in clinical cancer research

Gene expression data analysis in clinical cancer research Gene expression data analysis in clinical cancer research L analisi dell espressione genica nella ricerca oncologica Philippe Broët 1 INSERM U47 and Faculty of Medicine Paris-Sud broet@vjf.inserm.fr Summary:

More information

A note on oligonucleotide expression values not being normally distributed

A note on oligonucleotide expression values not being normally distributed Biostatistics (2009), 10, 3, pp. 446 450 doi:10.1093/biostatistics/kxp003 Advance Access publication on March 10, 2009 A note on oligonucleotide expression values not being normally distributed JOHANNA

More information

Parameter Estimation for the Exponential-Normal Convolution Model

Parameter Estimation for the Exponential-Normal Convolution Model Parameter Estimation for the Exponential-Normal Convolution Model Monnie McGee & Zhongxue Chen cgee@smu.edu, zhongxue@smu.edu. Department of Statistical Science Southern Methodist University ENAR Spring

More information

Adjusting batch effects in microarray expression data using empirical Bayes methods

Adjusting batch effects in microarray expression data using empirical Bayes methods Biostatistics (2007), 8, 1, pp. 118 127 doi:10.1093/biostatistics/kxj037 Advance Access publication on April 21, 2006 Adjusting batch effects in microarray expression data using empirical Bayes methods

More information

Affymetrix GeneChip Arrays. Lecture 3 (continued) Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy

Affymetrix GeneChip Arrays. Lecture 3 (continued) Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy Affymetrix GeneChip Arrays Lecture 3 (continued) Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy Affymetrix GeneChip Design 5 3 Reference sequence TGTGATGGTGGGGAATGGGTCAGAAGGCCTCCGATGCGCCGATTGAGAAT

More information

University of Groningen

University of Groningen University of Groningen Evaluation of an Affymetrix High-density Oligonucleotide Microarray Platform as a Measurement System van den Heuvel, Edwin; Geeven, Geert; Bauerschmidt, Susanne; Polman, Jan E.M.

More information

Feature selection methods for SVM classification of microarray data

Feature selection methods for SVM classification of microarray data Feature selection methods for SVM classification of microarray data Mike Love December 11, 2009 SVMs for microarray classification tasks Linear support vector machines have been used in microarray experiments

More information

Analysis of Microarray Data

Analysis of Microarray Data Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Introduction

More information

Bioconductor Project

Bioconductor Project Bioconductor Project Bioconductor Project Working Papers Year 2004 Paper 7 Differential Expression with the Bioconductor Project Anja von Heydebreck Wolfgang Huber Robert Gentleman Department of Computational

More information

Comparison of Affymetrix GeneChip Expression Measures

Comparison of Affymetrix GeneChip Expression Measures Johns Hopkins University, Dept. of Biostatistics Working Papers 9-1-2005 Comparison of Affymetrix GeneChip Expression Measures Rafael A. Irizarry Johns Hopkins Bloomberg School of Public Health, Department

More information

FACTORS CONTRIBUTING TO VARIABILITY IN DNA MICROARRAY RESULTS: THE ABRF MICROARRAY RESEARCH GROUP 2002 STUDY

FACTORS CONTRIBUTING TO VARIABILITY IN DNA MICROARRAY RESULTS: THE ABRF MICROARRAY RESEARCH GROUP 2002 STUDY FACTORS CONTRIBUTING TO VARIABILITY IN DNA MICROARRAY RESULTS: THE ABRF MICROARRAY RESEARCH GROUP 2002 STUDY K. L. Knudtson 1, C. Griffin 2, A. I. Brooks 3, D. A. Iacobas 4, K. Johnson 5, G. Khitrov 6,

More information

Nature Biotechnology: doi: /nbt Supplementary Figure 1. MBQC base beta diversity, major protocol variables, and taxonomic profiles.

Nature Biotechnology: doi: /nbt Supplementary Figure 1. MBQC base beta diversity, major protocol variables, and taxonomic profiles. Supplementary Figure 1 MBQC base beta diversity, major protocol variables, and taxonomic profiles. A) Multidimensional scaling of MBQC sample Bray-Curtis dissimilarities (see Fig. 1). Labels indicate centroids

More information

Analysis of a Proposed Universal Fingerprint Microarray

Analysis of a Proposed Universal Fingerprint Microarray Analysis of a Proposed Universal Fingerprint Microarray Michael Doran, Raffaella Settimi, Daniela Raicu, Jacob Furst School of CTI, DePaul University, Chicago, IL Mathew Schipma, Darrell Chandler Bio-detection

More information

Mixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments

Mixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments Mixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments Kellie J. Archer, Ph.D. Suresh E. Joel Viswanathan Ramakrishnan,, Ph.D. Department of Biostatistics Virginia Commonwealth

More information

Microarray Technique. Some background. M. Nath

Microarray Technique. Some background. M. Nath Microarray Technique Some background M. Nath Outline Introduction Spotting Array Technique GeneChip Technique Data analysis Applications Conclusion Now Blind Guess? Functional Pathway Microarray Technique

More information

Hidden Markov Models for Microarray Time Course Data in Multiple Biological Conditions

Hidden Markov Models for Microarray Time Course Data in Multiple Biological Conditions Hidden Markov Models for Microarray Time Course Data in Multiple Biological Conditions Ming YUAN and Christina KENDZIORSKI Among the first microarray experiments were those measuring expression over time,

More information

CHAPTER 8 PERFORMANCE APPRAISAL OF A TRAINING PROGRAMME 8.1. INTRODUCTION

CHAPTER 8 PERFORMANCE APPRAISAL OF A TRAINING PROGRAMME 8.1. INTRODUCTION 168 CHAPTER 8 PERFORMANCE APPRAISAL OF A TRAINING PROGRAMME 8.1. INTRODUCTION Performance appraisal is the systematic, periodic and impartial rating of an employee s excellence in matters pertaining to

More information

A learned comparative expression measure for Affymetrix GeneChip DNA microarrays

A learned comparative expression measure for Affymetrix GeneChip DNA microarrays Proceedings of the Computational Systems Bioinformatics Conference, August 8-11, 2005, Stanford, CA. pp. 144-154. A learned comparative expression measure for Affymetrix GeneChip DNA microarrays Will Sheffler

More information

Expression summarization

Expression summarization Expression Quantification: Affy Affymetrix Genechip is an oligonucleotide array consisting of a several perfect match (PM) and their corresponding mismatch (MM) probes that interrogate for a single gene.

More information

AS A SERVICE TO THE RESEARCH COMMUNITY, GENOME BIOLOGY PROVIDES A 'PREPRINT' DEPOSITORY

AS A SERVICE TO THE RESEARCH COMMUNITY, GENOME BIOLOGY PROVIDES A 'PREPRINT' DEPOSITORY This information has not been peer-reviewed. Responsibility for the findings rests solely with the author(s). Deposited research article A non-parametric approach for identifying differentially expressed

More information

RNA

RNA RNA sequencing Michael Inouye Baker Heart and Diabetes Institute Univ of Melbourne / Monash Univ Summer Institute in Statistical Genetics 2017 Integrative Genomics Module Seattle @minouye271 www.inouyelab.org

More information

Bayesian Variable Selection and Data Integration for Biological Regulatory Networks

Bayesian Variable Selection and Data Integration for Biological Regulatory Networks Bayesian Variable Selection and Data Integration for Biological Regulatory Networks Shane T. Jensen Department of Statistics The Wharton School, University of Pennsylvania stjensen@wharton.upenn.edu Gary

More information

Running head: Empirical estimates suggest most published research is true

Running head: Empirical estimates suggest most published research is true Running head: Empirical estimates suggest most published research is true Title: Empirical estimates suggest most published medical research is true Authors: Leah R. Jager 1 and Jeffrey T. Leek 2 * Affiliations:

More information

The essentials of microarray data analysis

The essentials of microarray data analysis The essentials of microarray data analysis (from a complete novice) Thanks to Rafael Irizarry for the slides! Outline Experimental design Take logs! Pre-processing: affy chips and 2-color arrays Clustering

More information

Introduction to gene expression microarray data analysis

Introduction to gene expression microarray data analysis Introduction to gene expression microarray data analysis Outline Brief introduction: Technology and data. Statistical challenges in data analysis. Preprocessing data normalization and transformation. Useful

More information

A Discussion of Statistical Methods for Design and Analysis of Microarray Experiments for Plant Scientists

A Discussion of Statistical Methods for Design and Analysis of Microarray Experiments for Plant Scientists The Plant Cell, Vol. 18, 2112 2121, September 2006, www.plantcell.org ª 2006 American Society of Plant Biologists SPECIAL SERIES ON LARGE-SCALE BIOLOGY A Discussion of Statistical Methods for Design and

More information

Estoril Education Day

Estoril Education Day Estoril Education Day -Experimental design in Proteomics October 23rd, 2010 Peter James Note Taking All the Powerpoint slides from the Talks are available for download from: http://www.immun.lth.se/education/

More information

Near-Balanced Incomplete Block Designs with An Application to Poster Competitions

Near-Balanced Incomplete Block Designs with An Application to Poster Competitions Near-Balanced Incomplete Block Designs with An Application to Poster Competitions arxiv:1806.00034v1 [stat.ap] 31 May 2018 Xiaoyue Niu and James L. Rosenberger Department of Statistics, The Pennsylvania

More information

GCTA/GREML. Rebecca Johnson. March 30th, 2017

GCTA/GREML. Rebecca Johnson. March 30th, 2017 GCTA/GREML Rebecca Johnson March 30th, 2017 1 / 12 Motivation for method We know from twin studies and other methods that genetic variation contributes to complex traits like height, BMI, educational attainment,

More information

SAS Microarray Solution for the Analysis of Microarray Data. Susanne Schwenke, Schering AG Dr. Richardus Vonk, Schering AG

SAS Microarray Solution for the Analysis of Microarray Data. Susanne Schwenke, Schering AG Dr. Richardus Vonk, Schering AG for the Analysis of Microarray Data Susanne Schwenke, Schering AG Dr. Richardus Vonk, Schering AG Overview Challenges in Microarray Data Analysis Software for Microarray Data Analysis SAS Scientific Discovery

More information

Comparative Analysis using the Illumina DASL assay with FFPE tissue. Wendell Jones, PhD Vice President, Statistics and Bioinformatics

Comparative Analysis using the Illumina DASL assay with FFPE tissue. Wendell Jones, PhD Vice President, Statistics and Bioinformatics TM Comparative Analysis using the Illumina DASL assay with FFPE tissue Wendell Jones, PhD Vice President, Statistics and Bioinformatics Background EA has examined several protocol assay possibilities for

More information

Analysis of Microarray Data

Analysis of Microarray Data Analysis of Microarray Data Lecture 3: Visualization and Functional Analysis George Bell, Ph.D. Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Review Visualizing

More information

Preprocessing of Microarray data I: Normalization and Missing Values. For example, a list of possible sources in spotted arrays:

Preprocessing of Microarray data I: Normalization and Missing Values. For example, a list of possible sources in spotted arrays: Normalization: Preprocessing of Microarray data I: Normalization and Missing Values Comparability across two (experimental condition vs control) or more (many experimental conditions) sets of measurements.

More information

Mixture modeling for genome-wide localization of transcription factors

Mixture modeling for genome-wide localization of transcription factors Mixture modeling for genome-wide localization of transcription factors Sündüz Keleş Department of Statistics and Department of Biostatistics & Medical Informatics 1300 University Avenue, 1245B Medical

More information

Discriminant models for high-throughput proteomics mass spectrometer data

Discriminant models for high-throughput proteomics mass spectrometer data Proteomics 2003, 3, 1699 1703 DOI 10.1002/pmic.200300518 1699 Short Communication Parul V. Purohit David M. Rocke Center for Image Processing and Integrated Computing, University of California, Davis,

More information

Microarray Gene Expression Analysis at CNIO

Microarray Gene Expression Analysis at CNIO Microarray Gene Expression Analysis at CNIO Orlando Domínguez Genomics Unit Biotechnology Program, CNIO 8 May 2013 Workflow, from samples to Gene Expression data Experimental design user/gu/ubio Samples

More information

Bioinformatics Advance Access published February 10, A New Summarization Method for Affymetrix Probe Level Data

Bioinformatics Advance Access published February 10, A New Summarization Method for Affymetrix Probe Level Data Bioinformatics Advance Access published February 10, 2006 BIOINFORMATICS A New Summarization Method for Affymetrix Probe Level Data Sepp Hochreiter, Djork-Arné Clevert, and Klaus Obermayer Department of

More information

SECTION 11 ACUTE TOXICITY DATA ANALYSIS

SECTION 11 ACUTE TOXICITY DATA ANALYSIS SECTION 11 ACUTE TOXICITY DATA ANALYSIS 11.1 INTRODUCTION 11.1.1 The objective of acute toxicity tests with effluents and receiving waters is to identify discharges of toxic effluents in acutely toxic

More information