SPH 247 Statistical Analysis of Laboratory Data

April 14, 2015

Basic Design of Expression Arrays

For each gene that is a target for the array, we have a known DNA sequence. mRNA is reverse transcribed to DNA, and if a complementary sequence is on the chip, the DNA will be more likely to stick. The DNA is labeled with a dye that fluoresces and generates a signal that is monotonic in the amount in the sample.

[Figure: a double-stranded DNA sequence with an exon, an intron, and the probe sequence marked.]

cDNA arrays use variable-length probes derived from expressed sequence tags (ESTs). They are spotted and almost always used with two-color methods. They can be used in species with an unsequenced genome.

Long oligo arrays use 60-70-mers. Examples are Agilent two-color arrays and Illumina Bead Arrays. They usually use computationally derived probes but can use probes from sequenced ESTs.

Affymetrix GeneChips use multiple 25-mers. For each gene, there are one or more sets of 8-20 distinct probes, which may overlap and may cover more than one exon.

Affymetrix chips also use mismatch (MM) probes that have the same sequence as the perfect match (PM) probes except for the middle base, which is changed to inhibit binding. The MM probe is supposed to act as a control, but it often binds to another mRNA species instead, so many analysts do not use the MM probes.
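The relation between a PM probe and its MM partner can be sketched in a few lines of R. This is only an illustration: the function name `make_mm` and the example 25-mer are made up here, and the code assumes that "changed" means replacing the middle (13th) base with its complement, which is the usual description of Affymetrix MM probes.

```r
# Build the mismatch (MM) partner of a 25-mer perfect match (PM) probe.
# Assumption: the middle (13th) base is replaced by its complement.
make_mm <- function(pm) {
  stopifnot(nchar(pm) == 25)
  bases <- strsplit(pm, "")[[1]]
  complement <- c(A = "T", T = "A", C = "G", G = "C")
  bases[13] <- complement[[bases[13]]]
  paste(bases, collapse = "")
}

pm <- "TAAATCGATACGCATTAGTTCGACC"  # arbitrary 25-mer chosen for illustration
mm <- make_mm(pm)

# The two probes differ only at position 13.
which(strsplit(pm, "")[[1]] != strsplit(mm, "")[[1]])
```

A single changed base in the middle of the probe is meant to disrupt binding of the target while leaving non-specific binding roughly unchanged, which is why the MM signal was intended as a control.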

Illumina Bead Arrays

Beads are coated with many copies of a 50-mer gene-specific probe and a 29-mer address sequence. Each probe is carried by multiple beads; the number is random, but around 20.

Each Ref-8 chip contains 8 arrays with ~25,000 targets, plus controls.
Each WG-6 chip contains 6 arrays with ~50,000 targets, plus controls.
Each HT-12 chip contains 12 arrays with ~50,000 targets, plus controls.

Probe Design

A good probe sequence should match the chosen gene, or an exon from that gene, and should not match any other gene in the genome. Melting temperature depends on the GC content and should be similar for all probes on an array, since the hybridization must be conducted at a single temperature.
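As a rough illustration of the GC-content criterion, the following R sketch computes the GC fraction of a few candidate probes and keeps those inside a window. The probe sequences, the function name `gc_content`, and the 0.4 to 0.6 window are hypothetical choices made for this example; real probe design uses melting-temperature models and specificity checks that are not shown.

```r
# Fraction of G and C bases in a probe sequence.
gc_content <- function(probe) {
  bases <- strsplit(toupper(probe), "")[[1]]
  mean(bases %in% c("G", "C"))
}

# Three made-up 25-mers with low, moderate, and high GC content.
probes <- c(
  at_rich  = "ATATTTAAATATTTAAATATTTAAA",
  balanced = "ACGTACGTACGTACGTACGTACGTA",
  gc_rich  = "GCGGCCGCGGCCGCGGCCGCGGCCG"
)

gc <- vapply(probes, gc_content, numeric(1))

# Hypothetical screening window: keep probes with similar GC content.
keep <- gc >= 0.4 & gc <= 0.6
names(probes)[keep]
```

Probes with similar GC content have similar melting temperatures, so a single hybridization temperature works reasonably well for all of them.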

The affinity of a given piece of DNA for the probe sequence can depend on many things, including secondary and tertiary structure as well as GC content. This means that the relationship between the concentration of the RNA species in the original sample and the brightness of the spot on the array can be very different for different probes for the same gene.

Thus only comparisons of intensity within the same probe across arrays make sense. A higher signal for one gene than for another on the same array does not mean that the copy number is higher.

Affymetrix GeneChips

For each probe set, there are 8-20 perfect match (PM) probes, which may or may not overlap and which target the same gene. There are also mismatch (MM) probes, which are supposed to serve as a control but do so rather badly. Most of us ignore the MM probes.

Expression Indices

A key issue with Affymetrix chips is how to summarize the multiple data values on a chip for each probe set (aka gene). A large number of methods have been suggested. Generally, the worst ones are those from Affymetrix, by a long way; here, worse means less able to detect real differences. Summarizing Illumina beads is simpler, but there are still issues.

Usable Methods

Li and Wong's dChip and follow-on work is demonstrably better than MAS 4.0 and MAS 5.0, but not as good as RMA and GLA. The RMA method of Irizarry et al. is available in Bioconductor. The GLA method (Durbin, Rocke, Zhou) is also available in Bioconductor/CRAN as part of the LMGene R package.

Bioconductor Documentation

> library(affy)
Loading required package: Biobase
Loading required package: tools
Welcome to Bioconductor
Vignettes contain introductory material. To view, type 'openVignette()'. To cite Bioconductor, see 'citation("Biobase")' and for packages 'citation(pkgname)'.
Loading required package: affyio
Loading required package: preprocessCore

Bioconductor Documentation

> openVignette()
Please select a vignette:
1: affy - 1. Primer
2: affy - 2. Built-in Processing Methods
3: affy - 3. Custom Processing Methods
4: affy - 4. Import Methods
5: affy - 5. Automatic downloading of CDF packages
6: Biobase - An introduction to Biobase and ExpressionSets
7: Biobase - Bioconductor Overview
8: Biobase - esApply Introduction
9: Biobase - Notes for eSet developers
10: Biobase - Notes for writing introductory 'how to' documents
11: Biobase - quick views of eSet instances
Selection:

Reading Affy Data into R

The CEL files contain the data from an array. We will look at data from an older type of array, the U95A, which contains 12,625 probe sets and 409,600 probes. The CDF file contains information relating probe pair sets to locations on the array. These are built into the affy package for standard array types.

Example Data Set

Data from Robert Rice's lab on twelve keratinocyte cell lines, at six different stages, run on Affymetrix HG U95A GeneChips. For each gene, we will run a one-way ANOVA with two observations per cell. For this illustration, we will use RMA.
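The per-gene analysis described above can be sketched on simulated data. Everything in this example is hypothetical: the matrix `expr` is random noise standing in for an expression matrix, one row is given an artificial stage effect, and the helper `anova_p` is a name invented here. It only shows the shape of the computation, a one-way ANOVA with six stages and two observations per stage, repeated for each row.

```r
set.seed(1)

# Six stages, two arrays per stage, in the same order as the CEL file names.
stage <- factor(rep(0:5, each = 2))
arrays <- paste0("LN", rep(0:5, each = 2), c("A", "B"), ".CEL")

# Simulated expression matrix: 3 probe sets x 12 arrays (not real data).
expr <- matrix(rnorm(3 * 12, mean = 8), nrow = 3,
               dimnames = list(paste0("probeset", 1:3), arrays))

# Give the first probe set an artificial stage effect.
expr[1, ] <- expr[1, ] + 2 * as.numeric(stage)

# One-way ANOVA p-value for a single probe set.
anova_p <- function(y, group) {
  anova(lm(y ~ group))[["Pr(>F)"]][1]
}

# Apply the same test to every row of the matrix.
pvals <- apply(expr, 1, anova_p, group = stage)
pvals
```

With six groups of two observations, each test has 5 degrees of freedom for stage and 6 for error, so the power for any single probe set is limited.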

Files for the Analysis

The .CDF file has the U95A chip definition (which probe is where on the chip). It is built into the affy package.

The .CEL files contain the raw data after pixel-level analysis, one number for each spot. The files are called LN0A.CEL, LN0B.CEL, ..., LN5B.CEL and are on the web site. There are 409,600 probe values in 12,625 probe sets.

The ReadAffy function

The ReadAffy() function reads all of the CEL files in the current working directory into an object of class AffyBatch, which extends eSet, the same base class that ExpressionSet extends. ReadAffy(widget=TRUE) does so in a GUI that allows entry of other characteristics of the dataset. You can also specify filenames, phenotype or experimental data, and MIAME information.
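A minimal sketch of passing the file names explicitly rather than relying on the working directory. The file names follow the LN0A.CEL to LN5B.CEL pattern used in this example data set; the guard around the call is an addition for this sketch so that the code does nothing when the affy package or the files are not available.

```r
# Construct the twelve CEL file names: stages 0-5, replicates A and B.
cel_files <- paste0("LN", rep(0:5, each = 2), c("A", "B"), ".CEL")
cel_files

# Read them only if affy is installed and the files are present
# in the current working directory.
if (requireNamespace("affy", quietly = TRUE) && all(file.exists(cel_files))) {
  rrdata <- affy::ReadAffy(filenames = cel_files)
}
```

Listing the files explicitly fixes the column order of the resulting expression matrix and avoids picking up unrelated CEL files that happen to be in the same directory.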

> rrdata <- ReadAffy()
> class(rrdata)
[1] "AffyBatch"
attr(,"package")
[1] "affy"
> dim(exprs(rrdata))
[1] 409600     12
> colnames(exprs(rrdata))
[1] "LN0A.CEL" "LN0B.CEL" "LN1A.CEL" "LN1B.CEL" "LN2A.CEL" "LN2B.CEL"
[7] "LN3A.CEL" "LN3B.CEL" "LN4A.CEL" "LN4B.CEL" "LN5A.CEL" "LN5B.CEL"
> length(probeNames(rrdata))
[1]
> length(unique(probeNames(rrdata)))
[1] 12625
> length((featureNames(rrdata)))
[1] 12625
> featureNames(rrdata)[1:5]
[1] "100_g_at" "1000_at" "1001_at" "1002_f_at" "1003_s_at"

The ExpressionSet class

An object of class ExpressionSet has several slots, the most important of which is an assayData object containing one or more matrices. The best way to extract parts of this is to use the appropriate methods: exprs() extracts an expression matrix, and featureNames() extracts the names of the probe sets.

Expression Indices

The 409,600 rows of the expression matrix in the AffyBatch object each correspond to a probe (25-mer). Ordinarily, to use these data we need to combine the probe-level data for each probe set into a single expression number. Conceptually, this has several steps.

Steps in Expression Index Construction

Background correction is the process of adjusting the signals so that the zero point is similar on all parts of all arrays. We like to manage this so that zero signal after background correction corresponds approximately to zero amount of the mRNA species that is the target of the probe set.

Data transformation is the process of changing the scale of the data so that it is more comparable from high to low. Common transformations are the logarithm and the generalized logarithm.

Normalization is the process of adjusting for systematic differences from one array to another. Normalization may be done before or after transformation, and before or after probe set summarization.
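To make the two transformations concrete, here is a short R sketch comparing the ordinary logarithm with one common form of the generalized logarithm, glog(x) = log(x + sqrt(x^2 + lambda)). The value of lambda below is a hypothetical choice for illustration; in practice it is estimated from the data.

```r
# Generalized logarithm: behaves like log(2x) for large x,
# but stays finite and smooth at zero and for small negative values.
glog <- function(x, lambda) log(x + sqrt(x^2 + lambda))

# Background-corrected intensities can be zero or negative.
x <- c(-50, 0, 50, 500, 5000, 50000)
lambda <- 100^2  # hypothetical value, chosen only for this example

comparison <- cbind(
  x    = x,
  log  = suppressWarnings(log(x)),  # NaN for negatives, -Inf at zero
  glog = glog(x, lambda)
)
comparison
```

For large intensities the two transformations differ only by the constant log(2), so fold changes among highly expressed genes are interpreted the same way, while low and negative values remain usable under the generalized logarithm.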

One may use only the perfect match (PM) probes, or may subtract or otherwise use the mismatch (MM) probes. There are many ways to summarize 20 PM probes and 20 MM probes on 10 arrays (a total of 400 numbers, or 200 if only the PM probes are used) into 10 expression index numbers.

[Table: Probe intensities for LASP1 in a radiation dose-response experiment, with the mean over probes as the expression index. Values not preserved.]

[Table: Log probe intensities for LASP1 in a radiation dose-response experiment, with the mean over probes as the expression index. Values not preserved.]

The RMA Method

Background correction that does not make 0 signal correspond to 0 amount
Quantile normalization
Log 2 transform
Median polish summary of PM probes
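The last three steps can be sketched in base R on a small simulated matrix. This is not the affy implementation: the data are random, the background correction step is skipped, ties in the quantile normalization are broken arbitrarily, and the function names `quantile_normalize` and `summarize_probeset` are invented for this example.

```r
set.seed(247)

# Simulated PM intensities: 2 probe sets x 5 probes each, on 4 arrays.
probe_set <- rep(c("ps1", "ps2"), each = 5)
pm <- matrix(2^rnorm(10 * 4, mean = 8, sd = 1.5), nrow = 10,
             dimnames = list(NULL, paste0("array", 1:4)))
pm[, 2] <- pm[, 2] * 1.5  # make one array systematically brighter

# Quantile normalization: give every array the same empirical distribution,
# namely the mean of the sorted columns.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

# Median polish summary: fit overall + probe effect + array effect
# and report overall + array effect as the expression index.
summarize_probeset <- function(log_pm) {
  mp <- medpolish(log_pm, trace.iter = FALSE)
  mp$overall + mp$col
}

norm_pm <- quantile_normalize(pm)
log_pm <- log2(norm_pm)

rows_by_set <- split(seq_len(nrow(log_pm)), probe_set)
index <- t(sapply(rows_by_set, function(rows) {
  summarize_probeset(log_pm[rows, , drop = FALSE])
}))
index  # one row per probe set, one column per array
```

Median polish is used instead of a mean because it is resistant to outlying probes, and the probe effect it removes accounts for the different affinities of the probes within a probe set.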

> eset <- rma(rrdata)
trying URL '
Content type 'application/zip' length bytes (1.3 Mb)
opened URL
downloaded 1.3 Mb
package 'hgu95av2cdf' successfully unpacked and MD5 sums checked
The downloaded packages are in C:\Documents and Settings\dmrocke\Local Settings
updating HTML package descriptions
Background correcting
Normalizing
Calculating Expression
> class(eset)
[1] "ExpressionSet"
attr(,"package")
[1] "Biobase"
> dim(exprs(eset))
[1] 12625    12
> featureNames(eset)[1:5]
[1] "100_g_at" "1000_at" "1001_at" "1002_f_at" "1003_s_at"

> exprs(eset)[1:5,]
[Output: RMA expression values for the first five probe sets (100_g_at, 1000_at, 1001_at, 1002_f_at, 1003_s_at) on the twelve arrays LN0A.CEL through LN5B.CEL. Values not preserved.]

> summary(exprs(eset))
[Output: Min., 1st Qu., Median, Mean, 3rd Qu., and Max. of the expression values for each of the twelve arrays LN0A.CEL through LN5B.CEL. Values not preserved.]

Probe Sets not Genes

It is unavoidable to refer to a probe set as measuring a gene, but doing so can be deceptive. The annotation of a probe set may be based on homology with a gene of possibly known function in a different organism. Only relatively few probe sets correspond to genes with known function and known structure in the organism being studied.

AFFYMETRIX c Technology and Preprocessing Methods

AFFYMETRIX c Technology and Preprocessing Methods Analysis of Genomic and Proteomic Data AFFYMETRIX c Technology and Preprocessing Methods bhaibeka@ulb.ac.be Université Libre de Bruxelles Institut Jules Bordet Table of Contents AFFYMETRIX c Technology

More information

Expression summarization

Expression summarization Expression Quantification: Affy Affymetrix Genechip is an oligonucleotide array consisting of a several perfect match (PM) and their corresponding mismatch (MM) probes that interrogate for a single gene.

More information

Introduction to gene expression microarray data analysis

Introduction to gene expression microarray data analysis Introduction to gene expression microarray data analysis Outline Brief introduction: Technology and data. Statistical challenges in data analysis. Preprocessing data normalization and transformation. Useful

More information

Exploration, Normalization, Summaries, and Software for Affymetrix Probe Level Data

Exploration, Normalization, Summaries, and Software for Affymetrix Probe Level Data Exploration, Normalization, Summaries, and Software for Affymetrix Probe Level Data Rafael A. Irizarry Department of Biostatistics, JHU March 12, 2003 Outline Review of technology Why study probe level

More information

DNA Microarray Data Oligonucleotide Arrays

DNA Microarray Data Oligonucleotide Arrays DNA Microarray Data Oligonucleotide Arrays Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor Short Course 2003 Copyright 2002, all rights reserved Biological question Experimental

More information

Image Analysis. Based on Information from Terry Speed s Group, UC Berkeley. Lecture 3 Pre-Processing of Affymetrix Arrays. Affymetrix Terminology

Image Analysis. Based on Information from Terry Speed s Group, UC Berkeley. Lecture 3 Pre-Processing of Affymetrix Arrays. Affymetrix Terminology Image Analysis Lecture 3 Pre-Processing of Affymetrix Arrays Stat 697K, CS 691K, Microbio 690K 2 Affymetrix Terminology Probe: an oligonucleotide of 25 base-pairs ( 25-mer ). Based on Information from

More information

Affymetrix GeneChip Arrays. Lecture 3 (continued) Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy

Affymetrix GeneChip Arrays. Lecture 3 (continued) Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy Affymetrix GeneChip Arrays Lecture 3 (continued) Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy Affymetrix GeneChip Design 5 3 Reference sequence TGTGATGGTGGGGAATGGGTCAGAAGGCCTCCGATGCGCCGATTGAGAAT

More information

Normalization. Getting the numbers comparable. DNA Microarray Bioinformatics - #27612

Normalization. Getting the numbers comparable. DNA Microarray Bioinformatics - #27612 Normalization Getting the numbers comparable The DNA Array Analysis Pipeline Question Experimental Design Array design Probe design Sample Preparation Hybridization Buy Chip/Array Image analysis Expression

More information

Pre-processing DNA Microarray Data

Pre-processing DNA Microarray Data Pre-processing DNA Microarray Data Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor Short Course Winter 2002 Copyright 2002, all rights reserved Biological question Experimental

More information

Description of Logit-t: Detecting Differentially Expressed Genes Using Probe-Level Data

Description of Logit-t: Detecting Differentially Expressed Genes Using Probe-Level Data Description of Logit-t: Detecting Differentially Expressed Genes Using Probe-Level Data Tobias Guennel October 22, 2008 Contents 1 Introduction 2 2 What s new in this version 3 3 Preparing data for use

More information

Microarray Data Analysis. Normalization

Microarray Data Analysis. Normalization Microarray Data Analysis Normalization Outline General issues Normalization for two colour microarrays Normalization and other stuff for one color microarrays 2 Preprocessing: normalization The word normalization

More information

Introduction to Bioinformatics! Giri Narasimhan. ECS 254; Phone: x3748

Introduction to Bioinformatics! Giri Narasimhan. ECS 254; Phone: x3748 Introduction to Bioinformatics! Giri Narasimhan ECS 254; Phone: x3748 giri@cs.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs11.html Reading! The following slides come from a series of talks by Rafael Irizzary

More information

Preprocessing Affymetrix GeneChip Data. Affymetrix GeneChip Design. Terminology TGTGATGGTGGGGAATGGGTCAGAAGGCCTCCGATGCGCCGATTGAGAAT

Preprocessing Affymetrix GeneChip Data. Affymetrix GeneChip Design. Terminology TGTGATGGTGGGGAATGGGTCAGAAGGCCTCCGATGCGCCGATTGAGAAT Preprocessing Affymetrix GeneChip Data Credit for some of today s materials: Ben Bolstad, Leslie Cope, Laurent Gautier, Terry Speed and Zhijin Wu Affymetrix GeneChip Design 5 3 Reference sequence TGTGATGGTGGGGAATGGGTCAGAAGGCCTCCGATGCGCCGATTGAGAAT

More information

Probe-Level Data Normalisation: RMA and GC-RMA Sam Robson Images courtesy of Neil Ward, European Application Engineer, Agilent Technologies.

Probe-Level Data Normalisation: RMA and GC-RMA Sam Robson Images courtesy of Neil Ward, European Application Engineer, Agilent Technologies. Probe-Level Data Normalisation: RMA and GC-RMA Sam Robson Images courtesy of Neil Ward, European Application Engineer, Agilent Technologies. References Summaries of Affymetrix Genechip Probe Level Data,

More information

Pre-processing DNA Microarray Data

Pre-processing DNA Microarray Data Pre-processing DNA Microarray Data Short course: Practical Analysis of DNA Microarray Data Instructors: Vince Carey & Sandrine Dudoit KolleKolle, Denmark October 26-28, 2003 1 Slides from Short Courses

More information

Lecture #1. Introduction to microarray technology

Lecture #1. Introduction to microarray technology Lecture #1 Introduction to microarray technology Outline General purpose Microarray assay concept Basic microarray experimental process cdna/two channel arrays Oligonucleotide arrays Exon arrays Comparing

More information

STATC 141 Spring 2005, April 5 th Lecture notes on Affymetrix arrays. Materials are from

STATC 141 Spring 2005, April 5 th Lecture notes on Affymetrix arrays. Materials are from STATC 141 Spring 2005, April 5 th Lecture notes on Affymetrix arrays Materials are from http://www.ohsu.edu/gmsr/amc/amc_technology.html The GeneChip high-density oligonucleotide arrays are fabricated

More information

Outline. Analysis of Microarray Data. Most important design question. General experimental issues

Outline. Analysis of Microarray Data. Most important design question. General experimental issues Outline Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization Introduction to microarrays Experimental design Data normalization Other data transformation Exercises George Bell,

More information

Integrative Genomics 1a. Introduction

Integrative Genomics 1a. Introduction 2016 Course Outline Integrative Genomics 1a. Introduction ggibson.gt@gmail.com http://www.cig.gatech.edu 1a. Experimental Design and Hypothesis Testing (GG) 1b. Normalization (GG) 2a. RNASeq (MI) 2b. Clustering

More information

Microarrays The technology

Microarrays The technology Microarrays The technology Goal Goal: To measure the amount of a specific (known) DNA molecule in parallel. In parallel : do this for thousands or millions of molecules simultaneously. Main components

More information

Introduction to biology and measurement of gene expression

Introduction to biology and measurement of gene expression Introduction to biology and measurement of gene expression Statistical analysis of gene expression data with R and Bioconductor University of Copenhagen, 17-21 August, 2009 Margaret Taub University of

More information

Normalizing Affy microarray data

Normalizing Affy microarray data Normalizing Affy microarray data All product names are given as examples only and they are not endorsed by the USDA or the University of Illinois. INTRODUCTION The following is an interactive demo describing

More information

From hybridization theory to microarray data analysis: performance evaluation

From hybridization theory to microarray data analysis: performance evaluation RESEARCH ARTICLE Open Access From hybridization theory to microarray data analysis: performance evaluation Fabrice Berger * and Enrico Carlon * Abstract Background: Several preprocessing methods are available

More information

CS-E5870 High-Throughput Bioinformatics Microarray data analysis

CS-E5870 High-Throughput Bioinformatics Microarray data analysis CS-E5870 High-Throughput Bioinformatics Microarray data analysis Harri Lähdesmäki Department of Computer Science Aalto University September 20, 2016 Acknowledgement for J Salojärvi and E Czeizler for the

More information

Package sscore. R topics documented: October 4, Version Date

Package sscore. R topics documented: October 4, Version Date Version 1.32.0 Date 2009-04-11 Package sscore October 4, 2013 Title S-Score Algorithm for Affymetrix Oligonucleotide Microarrays Author Richard Kennedy , based on C++ code from

More information

GS Analysis of Microarray Data

GS Analysis of Microarray Data GS01 0163 Analysis of Microarray Data Keith Baggerly and Kevin Coombes Section of Bioinformatics Department of Biostatistics and Applied Mathematics UT M. D. Anderson Cancer Center kabagg@mdanderson.org

More information

Analysis of Microarray Data

Analysis of Microarray Data Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Introduction

More information

Background and Normalization:

Background and Normalization: Background and Normalization: Investigating the effects of preprocessing on gene expression estimates Ben Bolstad Group in Biostatistics University of California, Berkeley bolstad@stat.berkeley.edu http://www.stat.berkeley.edu/~bolstad

More information

Mixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments

Mixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments Mixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments Kellie J. Archer, Ph.D. Suresh E. Joel Viswanathan Ramakrishnan,, Ph.D. Department of Biostatistics Virginia Commonwealth

More information

Background Correction and Normalization. Lecture 3 Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy

Background Correction and Normalization. Lecture 3 Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy Background Correction and Normalization Lecture 3 Computational and Statistical Aspects of Microarray Analysis June 21, 2005 Bressanone, Italy Feature Level Data Outline Affymetrix GeneChip arrays Two

More information

6. GENE EXPRESSION ANALYSIS MICROARRAYS

6. GENE EXPRESSION ANALYSIS MICROARRAYS 6. GENE EXPRESSION ANALYSIS MICROARRAYS BIOINFORMATICS COURSE MTAT.03.239 16.10.2013 GENE EXPRESSION ANALYSIS MICROARRAYS Slides adapted from Konstantin Tretyakov s 2011/2012 and Priit Adlers 2010/2011

More information

Introduction to Bioinformatics and Gene Expression Technology

Introduction to Bioinformatics and Gene Expression Technology Vocabulary Introduction to Bioinformatics and Gene Expression Technology Utah State University Spring 2014 STAT 5570: Statistical Bioinformatics Notes 1.1 Gene: Genetics: Genome: Genomics: hereditary DNA

More information

DNA Arrays Affymetrix GeneChip System

DNA Arrays Affymetrix GeneChip System DNA Arrays Affymetrix GeneChip System chip scanner Affymetrix Inc. hybridization Affymetrix Inc. data analysis Affymetrix Inc. mrna 5' 3' TGTGATGGTGGGAATTGGGTCAGAAGGACTGTGGGCGCTGCC... GGAATTGGGTCAGAAGGACTGTGGC

More information

Biology 644: Bioinformatics

Biology 644: Bioinformatics Measure of the linear correlation (dependence) between two variables X and Y Takes a value between +1 and 1 inclusive 1 = total positive correlation 0 = no correlation 1 = total negative correlation. When

More information

Humboldt Universität zu Berlin. Grundlagen der Bioinformatik SS Microarrays. Lecture

Humboldt Universität zu Berlin. Grundlagen der Bioinformatik SS Microarrays. Lecture Humboldt Universität zu Berlin Microarrays Grundlagen der Bioinformatik SS 2017 Lecture 6 09.06.2017 Agenda 1.mRNA: Genomic background 2.Overview: Microarray 3.Data-analysis: Quality control & normalization

More information

Gene Expression Technology

Gene Expression Technology Gene Expression Technology Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Gene expression Gene expression is the process by which information from a gene

More information

Exam 1 from a Past Semester

Exam 1 from a Past Semester Exam from a Past Semester. Provide a brief answer to each of the following questions. a) What do perfect match and mismatch mean in the context of Affymetrix GeneChip technology? Be as specific as possible

More information

Package pumadata. July 24, 2018

Package pumadata. July 24, 2018 Type Package Package pumadata July 24, 2018 Title Various data sets for use with the puma package Version 2.16.0 Date 2015-5-30 Author Richard Pearson Maintainer Richard Pearson

More information

Microarray Data Analysis Workshop. Preprocessing and normalization A trailer show of the rest of the microarray world.

Microarray Data Analysis Workshop. Preprocessing and normalization A trailer show of the rest of the microarray world. Microarray Data Analysis Workshop MedVetNet Workshop, DTU 2008 Preprocessing and normalization A trailer show of the rest of the microarray world Carsten Friis Media glna tnra GlnA TnrA C2 glnr C3 C5 C6

More information

Computational Biology I

Computational Biology I Computational Biology I Microarray data acquisition Gene clustering Practical Microarray Data Acquisition H. Yang From Sample to Target cdna Sample Centrifugation (Buffer) Cell pellets lyse cells (TRIzol)

More information

Microarray Informatics

Microarray Informatics Microarray Informatics Donald Dunbar MSc Seminar 31 st January 2007 Aims To give a biologist s view of microarray experiments To explain the technologies involved To describe typical microarray experiments

More information

Analysis of Microarray Data

Analysis of Microarray Data Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Introduction

More information

DNA Microarray Technology

DNA Microarray Technology 2 DNA Microarray Technology 2.1 Overview DNA microarrays are assays for quantifying the types and amounts of mrna transcripts present in a collection of cells. The number of mrna molecules derived from

More information

From CEL files to lists of interesting genes. Rafael A. Irizarry Department of Biostatistics Johns Hopkins University

From CEL files to lists of interesting genes. Rafael A. Irizarry Department of Biostatistics Johns Hopkins University From CEL files to lists of interesting genes Rafael A. Irizarry Department of Biostatistics Johns Hopkins University Contact Information e-mail Personal webpage Department webpage Bioinformatics Program

More information

A REVIEW OF GENE EXPRESSION ANALYSIS ON MICROARRAY DATASETS OF BREAST CELLS USING R LANGUAGE

A REVIEW OF GENE EXPRESSION ANALYSIS ON MICROARRAY DATASETS OF BREAST CELLS USING R LANGUAGE International Journal of Computer Engineering & Technology (IJCET) Volume 8, Issue 6, Nov-Dec 2017, pp. 36 44, Article ID: IJCET_08_06_004 Available online at http://www.iaeme.com/ijcet/issues.asp?jtype=ijcet&vtype=8&itype=6

More information

Measuring and Understanding Gene Expression

Measuring and Understanding Gene Expression Measuring and Understanding Gene Expression Dr. Lars Eijssen Dept. Of Bioinformatics BiGCaT Sciences programme 2014 Why are genes interesting? TRANSCRIPTION Genome Genomics Transcriptome Transcriptomics

More information

Analyzing DNA Microarray Data Using Bioconductor

Analyzing DNA Microarray Data Using Bioconductor Analyzing DNA Microarray Data Using Bioconductor Sandrine Dudoit and Rafael Irizarry Short Course on Mathematical Approaches to the Analysis of Complex Phenotypes The Jackson Laboratory, Bar Harbor, Maine

More information

The essentials of microarray data analysis

The essentials of microarray data analysis The essentials of microarray data analysis (from a complete novice) Thanks to Rafael Irizarry for the slides! Outline Experimental design Take logs! Pre-processing: affy chips and 2-color arrays Clustering

More information

Exercise on Microarray data analysis

Exercise on Microarray data analysis Exercise on Microarray data analysis Aim The aim of this exercise is to introduce basic data analysis of transcriptome data using the statistical software R. The exercise is divided in two parts. First,

More information

Microarray Informatics

Microarray Informatics Microarray Informatics Donald Dunbar MSc Seminar 4 th February 2009 Aims To give a biologistʼs view of microarray experiments To explain the technologies involved To describe typical microarray experiments

More information

Measuring gene expression (Microarrays) Ulf Leser

Measuring gene expression (Microarrays) Ulf Leser Measuring gene expression (Microarrays) Ulf Leser This Lecture Gene expression Microarrays Idea Technologies Problems Quality control Normalization Analysis next week! 2 http://learn.genetics.utah.edu/content/molecules/transcribe/

More information

A Distribution Free Summarization Method for Affymetrix GeneChip Arrays

A Distribution Free Summarization Method for Affymetrix GeneChip Arrays A Distribution Free Summarization Method for Affymetrix GeneChip Arrays Zhongxue Chen 1,2, Monnie McGee 1,*, Qingzhong Liu 3, and Richard Scheuermann 2 1 Department of Statistical Science, Southern Methodist

More information
