Exploration and Analysis of DNA Microarray Data
1 Exploration and Analysis of DNA Microarray Data
Dhammika Amaratunga, Senior Research Fellow in Nonclinical Biostatistics, Johnson & Johnson Pharmaceutical Research & Development
Javier Cabrera, Associate Professor in Statistics, Rutgers University
A short course sponsored by the New Jersey Chapter of the American Statistical Association
Piscataway, New Jersey, May

2 Agenda
Morning session - Dhammika Amaratunga
1. A very brief intro to molecular biology
2. Microarrays: experimental procedure
3. Preprocessing microarray data
4. Finding differentially expressed genes
Afternoon session - Javier Cabrera
5. Clustering genes and/or samples
6. Class prediction
7. Software
3 DNA
An organism's genetic information is encoded in DNA stored in its cells.
o double-stranded molecule
o strand = backbone + bases
o nucleotide = backbone + base
o bases: A, T, G, C
o complementary bases on each strand (A-T, G-C)
o the sequence of bases contains the genetic info

4 The central dogma of molecular biology
A gene is a segment of DNA whose sequence of bases (nucleotides) codes for a specific protein.
AKAP6: CATCATGCAGCAGGTCAAACAAGG CATCTCCTAGTATTGCATCCTACA
A gene is expressed via the process: DNA → (transcription) → mRNA → (translation) → protein

5 Differential gene expression
An organism's genome is the complete set of genes in each of its cells. Every cell of a given organism has a copy of the exact same genome, but
o different cells express different genes
o different genes express under different conditions
o differential gene expression leads to altered cell states
6 Principle underlying gene expression studies
Obtain gene expression profile information by measuring the levels of the various mRNAs in a cell in a specific state → information regarding what drives cell state.
Applications:
o biological pathways and disease processes
o functions of specific genes and proteins
o drug targets and toxicity mechanisms
o medical diagnostics and prognostics
7 DNA microarrays
DNA microarray technology is one of the most promising tools for obtaining gene expression data. A DNA microarray is a tiny glass slide on which genes (purified single-stranded cDNA sequences in solution) have been robotically spotted in an (approximately) rectangular array. On a cDNA microarray, each spot on the array corresponds to a single gene.
8 Experimental procedure
Manufacture the DNA microarray.
Prepare the labeled test sample: cellular contents → (isolate & purify) → mRNA → (reverse transcription) → cDNA → (add fluorescent dye) → sample
9 Experimental procedure (continued)
Disperse the labeled sample over the microarray. Whenever there is cDNA in the array complementary to cDNA (mRNA) in the sample, the two will hybridize. Let hybridization take place, then wash and dry the array. Scan the array with a laser microscope.

10 Scanned image
[Figure: scanned microarray image]

11 Interpreting the scanned image
High intensity spot → the DNA at that spot corresponds to some mRNA in the sample.
Low intensity spot → no mRNA in the sample corresponds to the DNA at that spot.
Intensity ~ mRNA abundance.
For any gene, one can compare intensities across different samples (but shouldn't compare intensities for different genes within the same sample).

12 Comparing two scanned images
[Figure: Control vs Treatment, same genes on each slide]

13 Paradigm
dissimilar spot intensity pattern → difference in mRNA abundance in tissue → genes differentially expressed within cell → altered cell state
14 Objectives of microarray experiments (1)
Identify those genes that are differentially expressed across two or more predefined classes (can compare gene expression patterns across classes multiple genes at a time):
o Which genes are expressed in which cells and under what conditions.
o Which genes are expressed differently in diseased cells compared to normal cells.
o Which genes are expressed differently when a patient is administered a drug.

15 Objectives of microarray experiments (2)
Class prediction: Develop a multi-gene predictor ("signature") of class.
o breast cancer patients - staging
o toxicogenomics
Pattern discovery: Discover clusters among samples or genes.
o breast cancer patients - subtypes
o genes performing similar function

16 Processing steps
Raw image → Spotted image → Preprocess → Data analysis → Biological inference
17 Processing steps in detail
o Convert scanned image to spotted image
o Run initial check of data quality
o Adjust for background
o Transform data
o Normalize data
o Deal with gross outliers and other anomalies
o Run final check of data quality
o Analyze data
o Interpret and report findings
18 Raw cDNA microarray image
[Figure: raw image with annotated regions (A), (B), (C); annotation labels: speckles, light background, shape]
19 Processing the raw image
Gridding: where are the spots?
Segmentation: which pixels correspond to the spot (signal) and which to background?
Measurement: what is the intensity at each spot?
o Spot intensity = average pixel intensity within the spot
o Background intensity = average pixel intensity immediately around the spot
20 Data from the image
[Table: one row per gene; columns Gene, Row, Col, Signal, Background]

21 Image plot of a good array
[Figure: image plots of Signal and Background]

22 Image plot of a defective array
[Figure: image plots of Signal and Background]
23 Technology differences
o pin spotting or photolithography
o multi-channel or single-channel
o almost-complete sequences (cDNA) or subsequences (oligonucleotides)
Examples: cDNA array, Affymetrix chip

24 Two-channel cDNA microarrays
Take two mRNA samples, label each with a different fluorescent dye, then disperse the composite sample over the microarray.
The two spot intensities at a spot are very different → the gene at that spot is differentially expressed.
Advantage: natural matching of samples - reduces spot-related bias.
Disadvantage: intensity-dependent dye effect, gene-specific dye effect, logistics.
25 Designs for two-channel experiments
              Simple     Dye-swap    Reference
Dye           A1   A2    A1   A2     A1   A2
R             A    A     A    B      Ref  Ref
G             B    B     B    A      A    B
o Simple: simple, but treatment effects are confounded with dye effects
o Dye-swap: extra effort needed, but intensity-dependent dye effects can be separated out
o Reference: larger studies
26 Oligonucleotide arrays
Each gene is represented by a probe set of 20 or so 25bp oligonucleotides called perfect matches (PM). Each PM is paired with a mismatch (MM) formed by switching the middle base of the PM; the MM acts as an (imperfect) control.
Sequence: CTGATGATCTCGAATAGCGTGCGCGAATGAT
PM: ATGATCTCGAATAGCGTGCGCGAAT
MM: ATGATCTCGAATTGCGTGCGCGAAT
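The PM/MM pairing above can be illustrated with a minimal sketch. It assumes, as the example sequences suggest, that "switching" the middle base means replacing it with its complement; the function name `mismatch_probe` is made up for this illustration.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(pm):
    """Return the MM probe for a PM probe by complementing its middle base.

    Assumption: 'switching the middle base' means taking its complement,
    which is what the PM/MM example on the slide shows (A -> T at position 13).
    """
    mid = len(pm) // 2  # index 12 for a 25-mer
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


pm = "ATGATCTCGAATAGCGTGCGCGAAT"
print(mismatch_probe(pm))
```

Applied to the PM probe from the slide, this reproduces the MM probe shown there.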
27 Oligonucleotide arrays (contd)
Interpretation: PM >> MM → gene expressed; PM ≈ MM → gene not expressed.
Gene expression level: Ave(PM-MM) or RobustAve(PM-MM) or Ave(PM) or, with replicates, Li-Wong or RMA, among others.
Affymetrix: array manufactured by synthesizing oligonucleotides directly onto the surface of a silicon chip.

28 A comparative experiment
Data: Gene expression profiles for genes in 6 mice (= 3 Control + 3 Test).
Question: Which genes are differentially expressed in C vs T?
Or: Could ask whether the differential gene expression profiles discriminate between C and T (class prediction).

29 Single-channel spot intensity data
[Table: one row per gene; columns Gene, C1, C2, C3, T1, T2, T3]
31 Check array quality
Check consistency across arrays:
o Spearman correlation coefficient, ρ_S (measures the degree of monotonicity, i.e., preservation of rank order)
o Concordance correlation coefficient, ρ_CC (measures the degree of agreement)
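A minimal sketch of the two consistency measures follows. The slide does not give formulas; the concordance coefficient below is assumed to be Lin's form, 2·s_xy / (s_xx + s_yy + (mean_x - mean_y)²), and all function names are illustrative.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def concordance(x, y):
    """Concordance correlation (Lin's form, assumed): penalizes shifts and scale changes."""
    mx, my = mean(x), mean(y)
    n = len(x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Two arrays with identical rank order but different scales.
c1 = [1.0, 2.0, 3.0, 4.0, 5.0]
c2 = [2.0, 4.0, 6.0, 8.0, 10.0]
print(spearman(c1, c2), concordance(c1, c2))
```

In this toy case the rank order is perfectly preserved while the agreement is poor, which is the situation normalization is meant to repair.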
33 Signal
The signal at a particular spot is taken to be
$X_g = \mathrm{SpotIntensity}$, or
$X_g = \mathrm{SpotIntensity} - \mathrm{Background}$, or
$X_g = \mathrm{SpotIntensity} - \mathrm{SmoothedBackground}$
[Figure: image plots of log signal and log background]

34 Thresholding
Sometimes the signal may be thresholded if low intensity values are considered unreliable:
$X_g \to \mathrm{median}(t_{Lower}, X_g, t_{Upper})$, or
$X_g \to \mathrm{MISSING}$ if $X_g$ is considered unreliable
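As a rough illustration of the signal and thresholding rules, the sketch below uses made-up function names and, for the MISSING variant, assumes that "unreliable" means "below the lower threshold"; the slides do not define the criterion.

```python
from statistics import median


def spot_signal(spot_intensity, background):
    """Background-adjusted signal: SpotIntensity - Background."""
    return spot_intensity - background


def threshold(x, t_lower, t_upper):
    """median(t_lower, x, t_upper) clamps x into [t_lower, t_upper]."""
    return median([t_lower, x, t_upper])


def threshold_or_missing(x, t_lower):
    """Alternative rule: mark the value as missing (None).

    Assumption: a value counts as unreliable when it falls below t_lower.
    """
    return None if x < t_lower else x


print(threshold(spot_signal(120.0, 115.0), 10.0, 60000.0))
```

Taking the median of the three values is just a compact way of writing a two-sided clamp.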
36 Transformation
Take logs. This makes the range of the data more manageable and symmetrizes the within-gene distribution, but it does not eliminate the heterogeneity of variances across genes, and it reduces but does not eliminate the skewness of the across-gene distribution.
$X \to \log(X + \lambda)$, with a constant offset $\lambda$
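A one-line sketch of the shifted log transform. The value of λ is not recoverable from the slide, so it is left as a required argument; the natural log is used here purely as an assumption about the base.

```python
import math


def log_transform(x, lam):
    """Return log(x + lam). `lam` is a user-chosen offset; no default is assumed."""
    return math.log(x + lam)


# Intensities spanning four orders of magnitude end up on a compact scale.
for raw in (10.0, 1000.0, 100000.0):
    print(raw, log_transform(raw, 1.0))
```

A positive offset also keeps the transform defined when background adjustment leaves a signal of zero.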
38 Normalization
Often the signals on even identical microarrays tend to be on different scales (due to quality and quantity of RNA, labeling efficiency, laser setting, experimenter effects, etc.). This can be regarded as a sort of (nonlinear but monotone) array effect. The scales need to be normalized prior to further analysis, so that the arrays are on more directly comparable scales.

39 Two arrays
[Figure: log signal intensity by array (C1, C2) and scatterplot of LOG(C2) vs LOG(C1)]
ρ (Concordance) = 0.90, ρ (Spearman) = 0.97

40 Normalization
To normalize arrays C(1), ..., C(n):
o Calculate the median mock array M.
o Either use a LOWESS (or spline) smoother to model the relationship between C(i) and M, or fit a continuous monotone increasing function to the quantiles of C(i) vs the quantiles of M.
o Back-predict to obtain the normalized values of C(i).
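The quantile variant of the procedure above can be sketched as follows. This is only an illustration: the monotone function is taken to be piecewise-linear interpolation between matched quantiles (one possible choice, not necessarily the authors'), the LOWESS variant is not shown, and all names are hypothetical.

```python
from statistics import median


def median_mock_array(arrays):
    """Gene-wise median across arrays; each array is a list of log signals."""
    return [median(values) for values in zip(*arrays)]


def interp(x, xs, ys):
    """Piecewise-linear interpolation through (xs, ys); xs sorted ascending."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:  # binary search keeps xs[lo] <= x < xs[hi]
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    w = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + w * (ys[hi] - ys[lo])


def quantile_normalize(arrays):
    """Map each array's quantiles onto the quantiles of the median mock array."""
    mock_q = sorted(median_mock_array(arrays))
    normalized = []
    for arr in arrays:
        arr_q = sorted(arr)
        # Back-predict: push every value through the fitted monotone map.
        normalized.append([interp(v, arr_q, mock_q) for v in arr])
    return normalized


arrays = [[3.0, 1.0, 2.0], [6.0, 2.0, 4.0], [3.0, 1.0, 2.0]]
print(quantile_normalize(arrays))
```

Because the fitted map is monotone, the rank order of the genes within each array is left unchanged.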
41 Two arrays (after normalization)
[Figure: log signal intensity by array (C1, C2) and scatterplot of LOG(C2) vs LOG(C1)]
ρ (Concordance) = 0.98, ρ (Spearman) = 0.97

42 Normalization issues
The normalization procedure must be nonlinear (i.e., intensity dependent).
Lowess or spline normalization could be used when corresponding values should match across arrays (e.g., across technical replicates).
Quantile normalization can be used to ensure similar distributions of values across arrays (e.g., across biological replicates).

43 Normalization issues (contd)
Normalization preserves the rank order of the genes within each array.
Normalization does not (directly) affect gene-specific effects.
The normalizability of a set of arrays can be assessed using Spearman's correlation coefficient.
The success of a normalization can be judged by an increase in the concordance correlation coefficient.

44 Normalization issues (contd)
Normalization is based on a function fitted to a gene set comprised mostly of constantly expressing genes - how to select this set? [all / housekeeping genes / spike-in controls / rank invariant genes]
Other issues: stagewise normalization (when there are multiple levels of effects), probe level normalization (for oligonucleotide arrays), spatial normalization (e.g., print tip, uneven hybridization).
46 Outliers
Find gross outliers among replicates:
$X_{gi}$ = intensity of gene $g$ on array $i$
$M_g = \mathrm{median}_i\{X_{gi}\}$
$R_{gi} = X_{gi} - M_g$
$S_g = \text{lowess-predict}\{R_{gi} \text{ vs } M_g\}$
$\mathrm{FENCE}_g = (M_g - \tau S_g,\; M_g + \tau S_g)$
What to do with outliers? (1) ignore (2) exclude (3) winsorize (4) impute (5) robust analysis
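The fence rule can be sketched as below, under two loudly stated assumptions: the smoothed quantity is taken to be the absolute residual (a spread estimate is needed for a fence), and a running median over intensity-ordered genes stands in for LOWESS, which is not in the Python standard library. The names `outlier_flags`, `tau` and `window` are illustrative.

```python
from statistics import mean, median


def spread_by_intensity(M, A, window):
    """Stand-in for lowess-predict: running median of A over genes ordered by M."""
    order = sorted(range(len(M)), key=lambda g: M[g])
    S = [0.0] * len(M)
    half = window // 2
    for pos, g in enumerate(order):
        lo, hi = max(0, pos - half), min(len(order), pos + half + 1)
        S[g] = median(A[order[k]] for k in range(lo, hi))
    return S


def outlier_flags(X, tau=3.0, window=5):
    """X: list of genes, each a list of replicate (log) intensities.

    Returns a same-shaped list of booleans: True where X_gi lies outside
    the fence (M_g - tau * S_g, M_g + tau * S_g).
    """
    M = [median(row) for row in X]
    # Typical absolute residual per gene (assumption: |R_gi| is what gets smoothed).
    A = [mean(abs(v - m) for v in row) for row, m in zip(X, M)]
    S = spread_by_intensity(M, A, window)
    return [[abs(v - m) > tau * s for v in row] for row, m, s in zip(X, M, S)]


# Seven genes with tight triplicates; one replicate of gene 3 is corrupted.
X = [[5.0 + g, 5.1 + g, 4.9 + g] for g in range(7)]
X[3][2] = 20.0
print(outlier_flags(X)[3])
```

Smoothing the spread across genes of similar intensity keeps a single corrupted replicate from inflating its own fence.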
47 Missing values
Impute values for missing observations (reduces the impact of missing values on downstream analysis).
A k nearest neighbor procedure: for each gene with missing values,
(1) find its k nearest neighbors based on Euclidean distances computed using just the columns for which that gene is not missing,
(2) impute the missing elements by averaging the corresponding non-missing elements of its neighbors.
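A minimal sketch of the k nearest neighbor imputation just described. Missing values are represented as `None`, and candidate neighbors are restricted to genes observed in every column where the target gene is observed; both are choices made for this illustration.

```python
import math


def knn_impute(X, k=2):
    """X: list of genes (rows), with None marking missing entries. Returns a filled copy."""
    filled = [row[:] for row in X]
    for g, row in enumerate(X):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        observed = [j for j, v in enumerate(row) if v is not None]
        # Euclidean distance to every usable gene, using only gene g's observed columns.
        dists = []
        for h, other in enumerate(X):
            if h == g or any(other[j] is None for j in observed):
                continue
            d = math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))
            dists.append((d, h))
        neighbours = [h for _, h in sorted(dists)[:k]]
        for j in missing:
            # Average the neighbours' non-missing values in the missing column.
            vals = [X[h][j] for h in neighbours if X[h][j] is not None]
            if vals:
                filled[g][j] = sum(vals) / len(vals)
    return filled


data = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 9.0],
]
print(knn_impute(data, k=2)[0])
```

Here the first gene's two nearest neighbors are the second and third genes, so the missing value is filled with the average of 3.0 and 5.0.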
49 Preprocessed data
[Table: preprocessed intensities, one row per gene; columns C1, C2, C3, T1, T2, T3; quality checks (ρ_S, ρ_CC) ok]
51 Identify differentially expressed genes
Nonstatistical: Seek genes that exhibit a specified fold increase in mean intensity (e.g., 2-fold).
Statistical: Seek genes that exhibit a statistically significant difference across the 2 groups (via, e.g., t test, Welch's test, Wilcoxon test, robust t test, permutation test - or perhaps a model-based test depending on the situation).
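To make the two criteria concrete, here is a small sketch for one gene measured in 3 control and 3 test samples. It computes the fold change on the raw intensity scale and an exact permutation p-value based on the pooled-variance t statistic; the function names are illustrative, and the other tests listed above are not shown.

```python
from itertools import combinations
from statistics import mean, variance


def fold_change(control, test):
    """Ratio of mean intensities (test over control)."""
    return mean(test) / mean(control)


def t_stat(a, b):
    """Two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


def permutation_p(a, b):
    """Exact two-sided permutation p-value over all relabelings of the samples."""
    pooled = list(a) + list(b)
    observed = abs(t_stat(a, b))
    idx = range(len(pooled))
    hits = total = 0
    for pick in combinations(idx, len(a)):
        chosen = set(pick)
        pa = [pooled[i] for i in pick]
        pb = [pooled[i] for i in idx if i not in chosen]
        total += 1
        if abs(t_stat(pa, pb)) >= observed - 1e-12:  # tolerance for float ties
            hits += 1
    return hits / total


control = [100.0, 110.0, 90.0]
test = [210.0, 200.0, 190.0]
print(fold_change(control, test), permutation_p(control, test))
```

With 3 vs 3 samples there are only 20 relabelings, so the smallest attainable two-sided permutation p-value is 2/20 = 0.1, which already hints at the small-sample problem discussed later.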
52 Analysis results
Top 10 genes (sorted by t-test p-value)
[Table: one row per gene; columns Gene, Fold, Dir (U or D), p, p(Bonf)]

53 The multiplicity issue
Issue: as the number of tests increases, so does the number of false positives (false discoveries).
Fix 0: Report all statistically significant genes with no multiplicity adjustment. Drawback: too many false positives.
Fix 1: Control the probability of even one false positive (i.e., control the familywise error rate) using, e.g., Bonferroni ($p_i^{BON} = \min(G p_i, 1)$) or Holm (step-down). Drawback: too many false negatives.
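Both familywise adjustments in Fix 1 are short enough to sketch directly. `bonferroni` implements the formula on the slide; `holm` is the usual step-down version, written out here as an illustration with made-up function names.

```python
def bonferroni(p):
    """Bonferroni-adjusted p-values: min(G * p_i, 1) for G tests."""
    G = len(p)
    return [min(G * pi, 1.0) for pi in p]


def holm(p):
    """Holm step-down adjusted p-values.

    The i-th smallest p-value is multiplied by (G - i + 1); a running
    maximum keeps the adjusted values monotone in the original ordering.
    """
    G = len(p)
    order = sorted(range(G), key=lambda i: p[i])
    adjusted = [0.0] * G
    running = 0.0
    for rank, i in enumerate(order):
        running = max(running, min((G - rank) * p[i], 1.0))
        adjusted[i] = running
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(pvals))
print(holm(pvals))
```

Holm's adjusted values are never larger than Bonferroni's, so it rejects at least as many genes while still controlling the familywise error rate.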
54 The multiplicity issue (contd)
Fix 2: Examine a qqplot of the test statistics.
Fix 3: Model p ~ Uniform(0,1) vs p ~ BetaMix.
[Figure: expected vs observed p-values]

55 The multiplicity issue (contd)
Fix 4:
(1) Rank the genes or select a subset of genes according to their (individual) significance (test statistics or p-values) for differential gene expression.
(2) Associate a number with each gene (or with the selected subset) that tells us how confident one should be that including it in a list of potentially differentially expressing genes does not substantially increase the rate of false findings (pFDR or q-values).

56 The positive False Discovery Rate
$\mathrm{pFDR} = \mathrm{Average}(\#\mathrm{FalsePositives} / \#\mathrm{Positives})$
To calculate, either use permutations:
o the decision rule says reject if $T > c$ → $h_0$ rejections in the observed data
o permute → $h_1$; permute → $h_2$; ...; permute → $h_m$
o average $= h^*$
o $\mathrm{pFDR} = h^*/h_0$
o refine
or use a recursive formula.
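The permutation recipe can be sketched as follows. This is the unrefined estimate h*/h_0 only (the "refine" step and the recursive formula are not shown), the rejection rule is taken to be |t| > c with a pooled-variance t statistic, and the simulated data, seeds and names are all illustrative assumptions.

```python
import random
from statistics import mean, variance


def t_stat(a, b):
    """Pooled-variance two-sample t statistic (0.0 if the standard error is zero)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = (sp2 * (1 / na + 1 / nb)) ** 0.5
    return 0.0 if se == 0 else (mean(b) - mean(a)) / se


def count_rejections(X, labels, c):
    """Number of genes with |t| > c for the given 0/1 sample labels."""
    count = 0
    for row in X:
        a = [v for v, lab in zip(row, labels) if lab == 0]
        b = [v for v, lab in zip(row, labels) if lab == 1]
        if abs(t_stat(a, b)) > c:
            count += 1
    return count


def pfdr_estimate(X, labels, c, m=200, seed=0):
    """Unrefined estimate h*/h0: mean rejections under permuted labels over observed rejections."""
    rng = random.Random(seed)
    h0 = count_rejections(X, labels, c)
    if h0 == 0:
        return None  # nothing rejected, so the ratio is undefined
    h = []
    for _ in range(m):
        perm = labels[:]
        rng.shuffle(perm)
        h.append(count_rejections(X, perm, c))
    return mean(h) / h0


# Simulated data: 45 null genes and 5 genes with a large shift in the test group.
rng = random.Random(1)
labels = [0, 0, 0, 1, 1, 1]
X = [[rng.gauss(8.0, 1.0) for _ in labels] for _ in range(45)]
X += [[rng.gauss(8.0 + 20.0 * lab, 1.0) for lab in labels] for _ in range(5)]
print(pfdr_estimate(X, labels, c=4.0))
```

The same ratio underlies the worked figure in the course example, where 2.8 expected rejections under permutation against 9 observed gives roughly 31%.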
57 Results (contd)
In the example,
o 9 genes with p below the chosen threshold
o in 9 permutations, on average, 2.8 genes with p below that threshold
o pFDR = 2.8/9 = 31%

58 The effect of small sample size
Issue: Often the sample size per group is very small.
→ unreliable variances (inferences)
→ dependence between the test statistics ($t_g$) and the standard error estimates ($s_g$)
60 Fixing the small sample size effect
Borrow strength across genes (LPE/EB): $\sigma_g^2 = f(\mu_g)$
Regularize the test statistics (SAM):
$t = (\bar{X}_T - \bar{X}_C)/s_P$
$t_{SAM} = (\bar{X}_T - \bar{X}_C)/(s_P + s_0)$
(assess significance by permutation)
Work with $t_g \mid s_g$ (Conditional t).
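A small sketch of the regularized statistic. It assumes $s_P$ is the pooled standard error of the difference in means; how $s_0$ is chosen is not stated on the slide, so it is simply passed in, and the function names are illustrative.

```python
import math
from statistics import mean, variance


def pooled_se(control, test):
    """Pooled standard error of the difference in group means (taken to be s_P)."""
    nc, nt = len(control), len(test)
    sp2 = ((nc - 1) * variance(control) + (nt - 1) * variance(test)) / (nc + nt - 2)
    return math.sqrt(sp2 * (1 / nc + 1 / nt))


def t_ordinary(control, test):
    return (mean(test) - mean(control)) / pooled_se(control, test)


def t_sam(control, test, s0):
    """Regularized statistic: the fudge constant s0 is added to the denominator."""
    return (mean(test) - mean(control)) / (pooled_se(control, test) + s0)


# A gene with a tiny difference but an (accidentally) tiny variance estimate.
control = [1.00, 1.01, 0.99]
test = [1.05, 1.06, 1.04]
print(t_ordinary(control, test), t_sam(control, test, s0=0.1))
```

The ordinary t is large only because the variance estimate happens to be small; adding $s_0$ damps exactly this kind of gene.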
61 Other issues
o Long-tailed within-gene distribution with small signal-to-noise ratio.
o Highly skewed gene-to-gene distribution.
o Gene variance related to gene mean.
o Genes co-dependent in clumps.

62 Model
Let $X_{gij}$ denote the preprocessed intensity measurement for gene $g$ in array $i$ of group $j$.
Model: $X_{gij} = \mu_{gj} + \sigma_g \varepsilon_{gij}$
Effect of interest: $\tau_g = \mu_{g2} - \mu_{g1}$
Error model: $\varepsilon_{gij} \sim F(\text{location}=0, \text{scale}=1)$
Gene mean-variance model: $(\mu_{g1}, \sigma_g) \sim F_{\mu,\sigma}$

63 Possible approaches
Parametric: Assume functional forms for $F$ and $F_{\mu,\sigma}$ and apply either a Bayes or Empirical Bayes procedure.
Nonparametric:
o Estimate $F_{\mu,\sigma}$: edf, $\hat{F}_{\mu,\sigma}$, of $\{(\bar{X}_{g1}, s_g^2)\}$
o Estimate $F$: edf, $\hat{F}$, of $\{(X_{gij} - \bar{X}_{gj})/s_g\}$
o Proceed via a resampling procedure.

64 CT Procedure
(1) Draw a gene, $g$, at random from $\{1, \ldots, G\}$. Call it $g^*$. Then $(\bar{X}_{g^*1}, s_{g^*}^2) \sim \hat{F}_{\mu,\sigma}$.
(2) Take a random sample (with replacement) of size $n_1 + n_2$ from $\hat{F}$: $r^*_{ij} \sim \hat{F}$.
(3) Combine these to form pseudo-data: $X^*_{ij} = \bar{X}_{g^*1} + s_{g^*} r^*_{ij}$.
(4) Calculate the pooled standard error $s^*$ and t test statistic $t^*$ for the pseudo-data $\{X^*_{ij}\}$.

65 CT Procedure (contd)
(5) Repeat steps (1)-(4) a large number (10,000) of times.
(6) Given $\alpha$, estimate the critical envelope, $t_\alpha(s_g)$, as the $(\alpha/2)$ and $(1-\alpha/2)$ quantile curves in the $t_g$ vs $s_g$ relationship.
(7) Genes that fall outside the critical envelope defined by $t_\alpha(s_g)$ are deemed significant at level $\alpha$. (Overall unconditional Type I error rate = $\alpha$)
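Steps (1)-(7) can be sketched roughly as below. This is not the DNAMR implementation: $s_g$ is taken to be the pooled standard deviation when building pseudo-data, the quantile curves are approximated by empirical quantiles within equal-count bins of $s^*$ (a crude stand-in for smooth curves), and every name is hypothetical.

```python
import math
import random
from statistics import mean, variance


def pooled_sd(a, b):
    na, nb = len(a), len(b)
    return math.sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def t_and_se(a, b):
    """Return (t, pooled standard error), or None if the standard error is zero."""
    se = pooled_sd(a, b) * math.sqrt(1 / len(a) + 1 / len(b))
    return None if se == 0 else ((mean(b) - mean(a)) / se, se)


def ct_null_draws(X, n1, n2, n_draws=10000, seed=0):
    """Steps (1)-(5): resample (mean, sd) pairs and standardized residuals."""
    rng = random.Random(seed)
    pairs, resid = [], []
    for row in X:  # assumes every gene has a nonzero pooled sd
        a, b = row[:n1], row[n1:]
        sg = pooled_sd(a, b)
        pairs.append((mean(a), sg))
        resid += [(v - mean(a)) / sg for v in a] + [(v - mean(b)) / sg for v in b]
    draws = []
    while len(draws) < n_draws:
        mu, sg = rng.choice(pairs)                                     # step (1)
        pseudo = [mu + sg * rng.choice(resid) for _ in range(n1 + n2)]  # steps (2)-(3)
        ts = t_and_se(pseudo[:n1], pseudo[n1:])                        # step (4)
        if ts is not None:
            draws.append(ts)
    return draws


def envelope(draws, alpha=0.05, n_bins=10):
    """Step (6), approximated: per-bin alpha/2 and 1-alpha/2 quantiles of t* given s*."""
    draws = sorted(draws, key=lambda d: d[1])
    size = len(draws) // n_bins
    bins = []
    for b in range(n_bins):
        chunk = draws[b * size:(b + 1) * size] if b < n_bins - 1 else draws[b * size:]
        ts = sorted(t for t, _ in chunk)
        lo = ts[int(alpha / 2 * (len(ts) - 1))]
        hi = ts[int((1 - alpha / 2) * (len(ts) - 1))]
        bins.append((chunk[-1][1], lo, hi))  # (upper edge of s*, lower t, upper t)
    return bins


def is_significant(t, s, bins):
    """Step (7): does (s, t) fall outside the envelope?"""
    for s_edge, lo, hi in bins:
        if s <= s_edge:
            return t < lo or t > hi
    _, lo, hi = bins[-1]
    return t < lo or t > hi
```

In use, one would compute each gene's observed $(t_g, s_g)$ with `t_and_se`, build the envelope once from the null draws, and then call `is_significant` gene by gene.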
67 Comments regarding CT
The edf $\hat{F}_s$ is a biased estimator of $F_\sigma$, particularly with small sample sizes. This can be fixed using target estimation.
The overall unconditional probability of Type I error is $\alpha$.
Good efficiency in simulations.
Implemented in DNAMR.

68 Linear model based approaches (1)
Let $X_{gij}$ denote the preprocessed intensity measurement for gene $g$ in array $i$ of group $j$.
Model: $X_{gij} = \mu_{gj} + \tau_{gj} + \varepsilon_{gij}$
Gene-by-gene analysis by F, SAM-F, or Conditional F.
Variations: Dunnett's, dose-response trend, time course, external effects.

69 Linear model based approaches (2)
Let $X_{gij}$ denote the preprocessed intensity measurement for gene $g$ in array $i$ of group $j$.
Model: $X_{gij} = \mu + \tau_j + \alpha_{i(j)} + \gamma_g + (\gamma\alpha)_{gi} + \varepsilon_{gij}$
Fit in two stages:
$X_{gij} = \mu + \tau_j + \alpha_{i(j)} + \delta_{gij}$
$R_{gij} = \gamma_g + (\gamma\alpha)_{gi} + \varepsilon_{gij}$
Other effects (e.g., dye and external effects) can be incorporated into the model.
71 Assess biological significance
Data analysis → list of differentially expressed genes → ?
Confirm by RT-PCR or a similar technique.
Assess relevance by incorporating known properties of genes (e.g., gene ontology (GO) information: a structured vocabulary for gene annotation covering biological process, molecular function, and cellular component).

72 End of morning session
More informationIntro to Microarray Analysis. Courtesy of Professor Dan Nettleton Iowa State University (with some edits)
Intro to Microarray Analysis Courtesy of Professor Dan Nettleton Iowa State University (with some edits) Some Basic Biology Genes are DNA sequences that code for proteins. (e.g. gene lengths perhaps 1000
More informationPreprocessing Affymetrix GeneChip Data. Affymetrix GeneChip Design. Terminology TGTGATGGTGGGGAATGGGTCAGAAGGCCTCCGATGCGCCGATTGAGAAT
Preprocessing Affymetrix GeneChip Data Credit for some of today s materials: Ben Bolstad, Leslie Cope, Laurent Gautier, Terry Speed and Zhijin Wu Affymetrix GeneChip Design 5 3 Reference sequence TGTGATGGTGGGGAATGGGTCAGAAGGCCTCCGATGCGCCGATTGAGAAT
More informationMicroarrays & Gene Expression Analysis
Microarrays & Gene Expression Analysis Contents DNA microarray technique Why measure gene expression Clustering algorithms Relation to Cancer SAGE SBH Sequencing By Hybridization DNA Microarrays 1. Developed
More informationMeasuring gene expression
Measuring gene expression Grundlagen der Bioinformatik SS2018 https://www.youtube.com/watch?v=v8gh404a3gg Agenda Organization Gene expression Background Technologies FISH Nanostring Microarrays RNA-seq
More informationLecture #1. Introduction to microarray technology
Lecture #1 Introduction to microarray technology Outline General purpose Microarray assay concept Basic microarray experimental process cdna/two channel arrays Oligonucleotide arrays Exon arrays Comparing
More informationIntegrative Genomics 1a. Introduction
2016 Course Outline Integrative Genomics 1a. Introduction ggibson.gt@gmail.com http://www.cig.gatech.edu 1a. Experimental Design and Hypothesis Testing (GG) 1b. Normalization (GG) 2a. RNASeq (MI) 2b. Clustering
More informationThe effect of normalization methods on the identification of differentially expressed genes in microarray data
School of Humanities and Informatics Dissertation in Bioinformatics 20p Advanced level Spring term 2006 The effect of normalization methods on the identification of differentially expressed genes in microarray
More information10.1 The Central Dogma of Biology and gene expression
126 Grundlagen der Bioinformatik, SS 09, D. Huson (this part by K. Nieselt) July 6, 2009 10 Microarrays (script by K. Nieselt) There are many articles and books on this topic. These lectures are based
More informationBioinformatics III Structural Bioinformatics and Genome Analysis. PART II: Genome Analysis. Chapter 7. DNA Microarrays
Bioinformatics III Structural Bioinformatics and Genome Analysis PART II: Genome Analysis Chapter 7. DNA Microarrays 7.1 Motivation 7.2 DNA Microarray History and current states 7.3 DNA Microarray Techniques
More informationLecture 2: March 8, 2007
Analysis of DNA Chips and Gene Networks Spring Semester, 2007 Lecture 2: March 8, 2007 Lecturer: Rani Elkon Scribe: Yuri Solodkin and Andrey Stolyarenko 1 2.1 Low Level Analysis of Microarrays 2.1.1 Introduction
More informationGene expression. What is gene expression?
Gene expression What is gene expression? Methods for measuring a single gene. Northern Blots Reporter genes Quantitative RT-PCR Operons, regulons, and stimulons. DNA microarrays. Expression profiling Identifying
More informationDNA/RNA MICROARRAYS NOTE: USE THIS KIT WITHIN 6 MONTHS OF RECEIPT.
DNA/RNA MICROARRAYS This protocol is based on the EDVOTEK protocol DNA/RNA Microarrays. 10 groups of students NOTE: USE THIS KIT WITHIN 6 MONTHS OF RECEIPT. 1. EXPERIMENT OBJECTIVE The objective of this
More informationLab 1: A review of linear models
Lab 1: A review of linear models The purpose of this lab is to help you review basic statistical methods in linear models and understanding the implementation of these methods in R. In general, we need
More informationIntroduction to Microarray Data Analysis and Gene Networks. Alvis Brazma European Bioinformatics Institute
Introduction to Microarray Data Analysis and Gene Networks Alvis Brazma European Bioinformatics Institute A brief outline of this course What is gene expression, why it s important Microarrays and how
More informationPre processing and quality control of microarray data
Pre processing and quality control of microarray data Christine Stansberg, 20.04.10 Workflow microarray experiment 1 Problem driven experimental design Wet lab experiments RNA labelling 2 Data pre processing
More informationSIMS2003. Instructors:Rus Yukhananov, Alex Loguinov BWH, Harvard Medical School. Introduction to Microarray Technology.
SIMS2003 Instructors:Rus Yukhananov, Alex Loguinov BWH, Harvard Medical School Introduction to Microarray Technology. Lecture 1 I. EXPERIMENTAL DETAILS II. ARRAY CONSTRUCTION III. IMAGE ANALYSIS Lecture
More informationGene expression profiling experiments:
Gene expression profiling experiments: Problems, pitfalls, and solutions. Heli Borg The Alternatives in Microarray Experiments bacteria - eucaryots non poly(a) + - poly(a) + oligonucleotide Affymetrix
More informationAnalysis of Microarray Data
Analysis of Microarray Data Lecture 3: Visualization and Functional Analysis George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Review
More informationOutline. Array platform considerations: Comparison between the technologies available in microarrays
Microarray overview Outline Array platform considerations: Comparison between the technologies available in microarrays Differences in array fabrication Differences in array organization Applications of
More informationMixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments
Mixed effects model for assessing RNA degradation in Affymetrix GeneChip experiments Kellie J. Archer, Ph.D. Suresh E. Joel Viswanathan Ramakrishnan,, Ph.D. Department of Biostatistics Virginia Commonwealth
More informationIntroduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods
Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods Companion lecture to the textbook: Fundamentals of BioMEMS and Medical Microdevices, by Prof., http://saliterman.umn.edu/
More informationExploration, Normalization, Summaries, and Software for Affymetrix Probe Level Data
Exploration, Normalization, Summaries, and Software for Affymetrix Probe Level Data Rafael A. Irizarry Department of Biostatistics, JHU March 12, 2003 Outline Review of technology Why study probe level
More informationDeoxyribonucleic Acid DNA
Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods Companion lecture to the textbook: Fundamentals of BioMEMS and Medical Microdevices, by Prof., http://saliterman.umn.edu/
More informationIntroduction to Genome Wide Association Studies 2014 Sydney Brenner Institute for Molecular Bioscience/Wits Bioinformatics Shaun Aron
Introduction to Genome Wide Association Studies 2014 Sydney Brenner Institute for Molecular Bioscience/Wits Bioinformatics Shaun Aron Genotype calling Genotyping methods for Affymetrix arrays Genotyping
More informationExperimental Design for Gene Expression Microarray. Jing Yi 18 Nov, 2002
Experimental Design for Gene Expression Microarray Jing Yi 18 Nov, 2002 Human Genome Project The HGP continued emphasis is on obtaining by 2003 a complete and highly accurate reference sequence(1 error
More informationGene expression: Microarray data analysis. Copyright notice. Outline: microarray data analysis. Schedule
Gene expression: Microarray data analysis Copyright notice Many of the images in this powerpoint presentation are from Bioinformatics and Functional Genomics by Jonathan Pevsner (ISBN -47-4-8). Copyright
More informationBIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology. Lecture 2: Microarray analysis
BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology Lecture 2: Microarray analysis Genome wide measurement of gene transcription using DNA microarray Bruce Alberts, et al., Molecular Biology
More informationDesigning a Complex-Omics Experiments. Xiangqin Cui. Section on Statistical Genetics Department of Biostatistics University of Alabama at Birmingham
Designing a Complex-Omics Experiments Xiangqin Cui Section on Statistical Genetics Department of Biostatistics University of Alabama at Birmingham 1/7/2015 Some slides are from previous lectures of Grier
More informationAnalyzing DNA Microarray Data Using Bioconductor
Analyzing DNA Microarray Data Using Bioconductor Sandrine Dudoit and Rafael Irizarry Short Course on Mathematical Approaches to the Analysis of Complex Phenotypes The Jackson Laboratory, Bar Harbor, Maine
More informationFACTORS CONTRIBUTING TO VARIABILITY IN DNA MICROARRAY RESULTS: THE ABRF MICROARRAY RESEARCH GROUP 2002 STUDY
FACTORS CONTRIBUTING TO VARIABILITY IN DNA MICROARRAY RESULTS: THE ABRF MICROARRAY RESEARCH GROUP 2002 STUDY K. L. Knudtson 1, C. Griffin 2, A. I. Brooks 3, D. A. Iacobas 4, K. Johnson 5, G. Khitrov 6,
More informationMicroarray Informatics
Microarray Informatics Donald Dunbar MSc Seminar 31 st January 2007 Aims To give a biologist s view of microarray experiments To explain the technologies involved To describe typical microarray experiments
More informationMicroarray pipeline & Pre-processing
Microarray pipeline & Pre-processing Solveig Mjelstad Olafsrud J Express Analysis Course November 2010 Some slides adapted from Christine Stansberg thank you Christine! The microarray pipeline The goal
More informationFeature Selection of Gene Expression Data for Cancer Classification: A Review
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 50 (2015 ) 52 57 2nd International Symposium on Big Data and Cloud Computing (ISBCC 15) Feature Selection of Gene Expression
More informationDNA Microarrays and Clustering of Gene Expression Data
DNA Microarrays and Clustering of Gene Expression Data Martha L. Bulyk mlbulyk@receptor.med.harvard.edu Biophysics 205 Spring term 2008 Traditional Method: Northern Blot RNA population on filter (gel);
More informationV10-8. Gene Expression
V10-8. Gene Expression - Regulation of Gene Transcription at Promoters - Experimental Analysis of Gene Expression - Statistics Primer - Preprocessing of Data - Differential Expression Analysis Fri, May
More informationDNA Chip Technology Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center
DNA Chip Technology Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center Why DNA Chips? Functional genomics: get information about genes that is unavailable from sequence
More informationMicroarray analysis challenges.
Microarray analysis challenges. While not quite as bad as my hobby of ice climbing you, need the right equipment! T. F. Smith Bioinformatics Boston Univ. Experimental Design Issues Reference and Controls
More informationCAP BIOINFORMATICS Su-Shing Chen CISE. 10/5/2005 Su-Shing Chen, CISE 1
CAP 5510-9 BIOINFORMATICS Su-Shing Chen CISE 10/5/2005 Su-Shing Chen, CISE 1 Basic BioTech Processes Hybridization PCR Southern blotting (spot or stain) 10/5/2005 Su-Shing Chen, CISE 2 10/5/2005 Su-Shing
More informationBioinformatics and Genomics: A New SP Frontier?
Bioinformatics and Genomics: A New SP Frontier? A. O. Hero University of Michigan - Ann Arbor http://www.eecs.umich.edu/ hero Collaborators: G. Fleury, ESE - Paris S. Yoshida, A. Swaroop UM - Ann Arbor
More informationMotivation From Protein to Gene
MOLECULAR BIOLOGY 2003-4 Topic B Recombinant DNA -principles and tools Construct a library - what for, how Major techniques +principles Bioinformatics - in brief Chapter 7 (MCB) 1 Motivation From Protein
More informationBioinformatics for Biologists
Bioinformatics for Biologists Microarray Data Analysis. Lecture 1. Fran Lewitter, Ph.D. Director Bioinformatics and Research Computing Whitehead Institute Outline Introduction Working with microarray data
More informationStandard Data Analysis Report Agilent Gene Expression Service
Standard Data Analysis Report Agilent Gene Expression Service Experiment: S534662 Date: 2011-01-01 Prepared for: Dr. Researcher Genomic Sciences Lab Prepared by S534662 Standard Data Analysis Report 2011-01-01
More informationDNA Microarray Technology
CHAPTER 1 DNA Microarray Technology All living organisms are composed of cells. As a functional unit, each cell can make copies of itself, and this process depends on a proper replication of the genetic
More informationAnnouncements. Lecture 2: DNA Microarray Overview. Gene Expression: The Central Dogma. Talks. Go to class web page
Announcements Lecture 2: DNA Microarray Overview Go to class web page http://www.cs.washington.edu/527 Add yourself to class list Check out HW1, including last year s (Some slides from Dr. Holly Dressman,
More information