Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison


Outline
- General microarray data analysis workflow
- From raw data to biological significance
- Comparison statistics and correction for multiple testing
- GeneSifter overview
- Gene Expression in Huntington's Disease Peripheral Blood
- Identification of biological themes
- Platform comparison

Analysis Workflow
1. Raw data
2. Normalized, scaled data (data upload)
3. Differentially expressed genes (comparison statistics, correction for multiple testing)
4. Identify and partition expression patterns (up and down regulated, magnitude, clustering)
5. Gene summaries (annotation: UniGene, Entrez Gene, Gene Ontologies, etc.)
6. Biological themes: pathways, molecular function, etc. (ontology report, pathway report, z-score)

Experiment Design (microarraysuccess.com)

Experimental design determines what can be inferred from the data and how much confidence can be assigned to those inferences. Careful experimental design and the presence of biological replicates are essential to the successful use of microarrays.

Type of experiment
- Two groups
- Three or more groups
- Time series
- Dose response
- Multiple treatment
The type of experiment and the number of groups affect the statistical methods used to detect differential expression.

Replicates
- The more the better, but at least 3
- Biological better than technical
Rigorous statistical inferences cannot be made with a sample size of one. The more replicates, the stronger the inference.

Supporting material - Kathleen Kerr, Experimental Design and Other Issues in Microarray Studies - http://ra.microslu.washington.edu/presentation/documents/kerrnas.pdf

Differential Expression (microarraysuccess.com)

The fundamental goal of microarray experiments is to identify genes that are differentially expressed in the conditions being studied. Comparison statistics help identify differentially expressed genes; cluster analysis identifies patterns of gene expression and segregates subsets of genes based on those patterns.

Statistical significance
- Fold change: does not address the reproducibility of the observed difference and cannot be used to determine statistical significance.
- Comparison statistics
  - 2 groups: t-test, Welch's t-test, Wilcoxon rank sum
  - 3 or more groups: ANOVA, Kruskal-Wallis
Comparison tests require replicates and use the variability within the replicates to assign a confidence level as to whether the gene is differentially expressed.

Supporting material - Draghici S. (2002) Statistical intelligence: effective analysis of high-density microarray data. Drug Discov Today, 7(11 Suppl): S55-63.
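To illustrate why fold change alone cannot establish significance, here is a minimal sketch that computes the fold change and Welch's t statistic for two genes. The signal values and the function names `fold_change` and `welch_t` are hypothetical and are not part of GeneSifter. Converting the statistic to a p-value would additionally need the t distribution, which is omitted here.

```python
import math
from statistics import mean, variance


def fold_change(group1, group2):
    """Ratio of the group means (group2 relative to group1)."""
    return mean(group2) / mean(group1)


def welch_t(group1, group2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances
    se2 = v1 / n1 + v2 / n2  # squared standard error of the mean difference
    t = (mean(group2) - mean(group1)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical signals: both genes double on average, three replicates per group.
tight_control, tight_disease = [100, 102, 98], [200, 204, 196]
noisy_control, noisy_disease = [40, 100, 160], [80, 200, 320]

for name, a, b in [("tight", tight_control, tight_disease),
                   ("noisy", noisy_control, noisy_disease)]:
    t, df = welch_t(a, b)
    print(f"{name}: fold change = {fold_change(a, b):.2f}, t = {t:.2f}, df = {df:.1f}")
```

Both genes show the same fold change, but only the gene with consistent replicates yields a large t statistic.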

Differential Expression (microarraysuccess.com)

Correction for multiple testing - Methods for adjusting the p-value from a comparison test based on the number of tests performed. These adjustments help to reduce the number of false positives in an experiment.

- FWER: Family Wise Error Rate corrections adjust the p-value so that it reflects the chance of at least 1 false positive being found in the list. Methods: Bonferroni, Holm, Westfall and Young maxT.
- FDR: False Discovery Rate corrections adjust the p-value so that it reflects the frequency of false positives in the list. Methods: Benjamini and Hochberg, SAM.

The FWER is more conservative, but the FDR is usually acceptable for discovery experiments, i.e. where a small number of false positives is acceptable.

Dudoit, S., et al. (2003) Multiple hypothesis testing in microarray experiments. Statistical Science 18(1): 71-103.
Reiner, A., et al. (2003) Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19(3): 368-375.
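The adjustments named above can be sketched in a few lines. This is an illustrative implementation of the Bonferroni, Holm, and Benjamini and Hochberg adjusted p-values, not the code GeneSifter runs. The function names and the example p-values are assumptions.

```python
def bonferroni(pvalues):
    """FWER: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """FWER step-down: the k-th smallest p-value is multiplied by (m - k + 1)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvalues):
    """FDR step-up: the k-th smallest p-value is multiplied by m / k."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        # Walk from the largest p-value down, keeping the running minimum.
        running_min = min(running_min, pvalues[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values for four genes
print("Bonferroni:", bonferroni(raw))
print("Holm:      ", holm(raw))
print("BH (FDR):  ", benjamini_hochberg(raw))
```

For the same input, the FDR-adjusted values are never larger than the FWER-adjusted ones, which is what "more conservative" means in practice.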

GeneSifter - Microarray Data Analysis

Accessibility
- Web-based
- Secure

Data management
- Data annotation (MIAME)
- Multiple upload tools: CodeLink, Affymetrix, Illumina, Agilent
- GEO

Differential Expression - Powerful, accessible tools for determining statistical significance
- R based statistics, Bioconductor
- Comparison tests: t-test, Welch's t-test, Wilcoxon rank sum test, ANOVA
- Correction for multiple testing: Bonferroni, Holm, Westfall and Young maxT, Benjamini and Hochberg
- Unsupervised clustering: PAM, CLARA, hierarchical clustering
- Silhouettes

GeneSifter - Microarray Data Analysis

Integrated tools for determining biological significance
- One-Click Gene Summary
- Ontology Report
- Pathway Report
- Search by ontology terms
- Search by KEGG terms or chromosome

The GeneSifter Data Center

Free resource
- Training
- Research
- Publishing

5 areas
- Cardiovascular
- Cancer
- Neuroscience
- Immunology
- Oral Biology

Access to:
- Data
- Analysis summary
- Tutorials
- WebEx

The GeneSifter Data Center www.genesifter.net/dc

GeneSifter - Analysis Examples

Data upload: CodeLink

2 groups (Huntington's blood vs healthy blood)
- Differential expression: fold change, quality, t-test, false discovery rate

3+ groups (time series, dose response, etc.)
- Differential expression: fold change, quality, ANOVA, false discovery rate
- Visualization: hierarchical clustering, PCA
- Partitioning: PAM, silhouettes

Biological significance
- Gene annotation
- Ontology report
- Pathway report

Background - Huntington's Disease

Huntington's Disease (HD)
- Autosomal dominant neurodegenerative disease
- Motor impairment
- Cognitive decline
- Various psychiatric symptoms
- Onset at 30-50 years
- Mutant Huntingtin protein (polyglutamine)
- Affects transcriptional regulation
- Transcription effects may occur outside of the CNS

Background - Data

Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease. Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, Krainc D. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11023-8.

Collected peripheral blood samples:
- 14 controls
- 12 symptomatic HD patients
- 5 presymptomatic HD patients

- Identified the 322 most differentially expressed genes (control vs. symptomatic HD) using the U133A array.
- Used CodeLink 20K to confirm genes identified using the Affymetrix platform.
- Focused on the 12 genes that showed the most significant difference between control and HD.
- Data available from GEO.

Pairwise Analysis

Human blood expression for Huntington's disease versus control, CodeLink Human 20K Bioarray

Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, Krainc D. Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11023-8.

Pairwise Analysis
- Select group 1: 14 normal
- Select group 2: 12 Huntington's

Pairwise Analysis
- Already normalized (median)
- t-test
- Quality filter 0.75 (filters out genes with signal less than 0.75)
- Benjamini and Hochberg (FDR)
- Log transform data
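The quality filter and log transform settings above can be sketched as a small preprocessing step. The slide does not say which signal the 0.75 threshold is compared against, so this sketch assumes a gene is kept when its mean signal reaches the threshold in at least one group. The gene IDs, values, and function names are hypothetical.

```python
import math
from statistics import mean


def passes_quality(group1, group2, threshold):
    """Assumed rule: keep a gene if either group's mean signal reaches the threshold."""
    return max(mean(group1), mean(group2)) >= threshold


def log2_transform(values):
    """Log2-transform positive, already-normalized signal values."""
    return [math.log2(v) for v in values]


def prepare(genes, threshold):
    """Apply the quality filter, then log-transform the genes that remain."""
    prepared = {}
    for gene_id, (group1, group2) in genes.items():
        if passes_quality(group1, group2, threshold):
            prepared[gene_id] = (log2_transform(group1), log2_transform(group2))
    return prepared


# Hypothetical median-normalized signals: (control replicates, HD replicates).
genes = {
    "gene_a": ([1.0, 2.0, 4.0], [8.0, 8.0, 8.0]),
    "gene_b": ([0.25, 0.5, 0.25], [0.5, 0.25, 0.5]),  # low signal in both groups
}

prepared = prepare(genes, threshold=0.75)
print(prepared)
```

The t-test and the Benjamini and Hochberg correction would then be applied to the log-transformed values of the genes that pass the filter.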

Pairwise Analysis Gene List

One-Click Gene Summary

Biological Significance - Gene Annotation Sources

- UniGene - organizes GenBank sequences into a non-redundant set of gene-oriented clusters. Gene titles are assigned to the clusters, and these titles are commonly used by researchers to refer to that particular gene.
- LocusLink (Entrez Gene) - provides a single query interface to curated sequence and descriptive information, including function, about genes.
- Gene Ontologies - The Gene Ontology Consortium provides controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products.
- KEGG - Kyoto Encyclopedia of Genes and Genomes provides information about both regulatory and metabolic pathways for genes.
- Reference Sequences - The NCBI Reference Sequence project (RefSeq) provides reference sequences for both the mRNA and protein products of included genes.

GeneSifter maintains its own copies of these databases and updates them automatically.

Pairwise Analysis Gene List

Ontology Report

Ontology Report: z-score

- R = total number of genes meeting selection criteria
- N = total number of genes measured
- r = number of genes meeting selection criteria with the specified GO term
- n = total number of genes measured with the specified GO term

Reference: Scott W Doniger, Nathan Salomonis, Kam D Dahlquist, Karen Vranizan, Steven C Lawlor and Bruce R Conklin; MAPPFinder: using Gene Ontology and GenMAPP to create a global gene-expression profile from microarray data, Genome Biology 2003, 4:R7
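The formula itself is not present in the transcript. As given in the cited MAPPFinder paper, the z-score compares the observed count r with the count expected by chance, n(R/N), and divides by the standard deviation under sampling without replacement: z = (r - n(R/N)) / sqrt(n (R/N) (1 - R/N) (1 - (n - 1)/(N - 1))). A minimal sketch follows. The function name `go_z_score` and the example counts are hypothetical.

```python
import math


def go_z_score(r, n, R, N):
    """MAPPFinder-style z-score for one GO term.

    r: genes meeting the criteria that carry the GO term
    n: genes measured that carry the GO term
    R: genes meeting the criteria
    N: genes measured
    """
    expected = n * R / N
    # Hypergeometric variance, including the finite-population correction.
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


# Hypothetical counts: 1000 genes measured, 100 selected, 50 carry the term,
# and 15 of those 50 are in the selected list (5 would be expected by chance).
z = go_z_score(r=15, n=50, R=100, N=1000)
print(f"z = {z:.2f}")
```

A positive z-score means the term appears in the gene list more often than chance would predict, and a negative one means it appears less often.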

Z-score Report

Z-score Report

KEGG Report

Pairwise Analysis - Summary

Human blood expression for Huntington's disease versus control (12 HD, 14 control)

- ~20,000 genes
- t-test, Benjamini and Hochberg (FDR): 5684 genes
- Pattern selection:

2606 increased in HD (z-scores)
- Biological processes: Protein biosynthesis (104), Ubiquitin cycle (123), RNA splicing (53)
- KEGG: Oxidative phosphorylation (35), Apoptosis (22)

3078 decreased in HD (z-scores)
- Biological processes: Neurogenesis (90), Cell adhesion (120), Sodium ion transport (29), G-protein coupled receptor signaling (114)
- KEGG: Neuroactive ligand-receptor interaction (56)

Pairwise Analysis

Human blood expression for Huntington's disease versus control, Affymetrix U133A Human Genome Array

Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, Krainc D. Genome-wide expression profiling of human blood reveals biomarkers for Huntington's disease. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):11023-8.

Pairwise Analysis - Affymetrix
- Already normalized (median)
- t-test
- Quality filter 50 (filters out genes with signal less than 50)
- Benjamini and Hochberg (FDR)
- Log transform data

Pairwise Analysis Gene List - Human blood expression for Huntington's disease versus control, Affymetrix

Gene Lists Common and Unique Genes

Platform comparison Biological themes Affymetrix

Platform comparison Biological themes CodeLink

Project Analysis - Clustering

Cluster by Samples All Genes CodeLink Affymetrix

Cluster by Samples? CodeLink Affymetrix

Cluster by Samples Y Chrom. Genes CodeLink Affymetrix

Platform Comparison - Summary

                      CodeLink    Affymetrix
Transcripts total     19729       22283
Increased in HD       2606        1976
Overlap (LL genes)    41%         65%

Top BP ontologies: Ubiquitin cycle, RNA splicing, Regulation of translation, Apoptosis
Clustering of samples
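The overlap row reports one shared set of genes as a percentage of each platform's own list, which is why the two percentages differ. A minimal sketch of that calculation follows, assuming each gene list has already been mapped to a common identifier such as LocusLink IDs. The IDs and the function name `overlap_summary` are hypothetical.

```python
def overlap_summary(list_a, list_b):
    """Common and unique genes, with the overlap as a percentage of each list."""
    a, b = set(list_a), set(list_b)
    common = a & b
    return {
        "common": common,
        "unique_a": a - b,
        "unique_b": b - a,
        "pct_of_a": 100 * len(common) / len(a),
        "pct_of_b": 100 * len(common) / len(b),
    }


# Hypothetical gene identifiers, not taken from the study.
codelink_up = ["g1", "g2", "g3", "g4"]
affymetrix_up = ["g3", "g4", "g5"]

summary = overlap_summary(codelink_up, affymetrix_up)
print(sorted(summary["common"]))
print(f"{summary['pct_of_a']:.0f}% of list A, {summary['pct_of_b']:.0f}% of list B")
```

The same number of shared genes is a smaller fraction of the longer list, so the platform with the longer gene list reports the lower overlap percentage.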

Platform Comparison - Summary

                      CodeLink           Affymetrix
Increased in HD       2606               1976
Decreased in HD       3078               986
Unique ontology       Oxidative Phos.    IL-6 Biosynthesis

The GeneSifter Data Center www.genesifter.net/dc

MicroarraySuccess.com - Seven Keys to Successful Microarray Data Analysis

1. Experiment Design
- Type of experiment: two groups, time series, dose response, multiple treatments
- Replicates: the more the better, technical vs. biological

2. Platform Selection
- Platforms: cDNA, oligo, one color, two color
- Feature extraction software
- File formats

3. Data Management
- Databases
- Raw data: storing, retrieving
- Experiment annotation: samples, protocols

4. System Access
- Usability: intuitive, special training
- System access: single user desktop, single user server, web-based
- Sharing data: in the lab, collaboration

5. Differential Expression
- Normalization
- Differential expression: fold change, comparison statistics, FWER/FDR
- Pattern identification: clustering, visualization, partitioning

6. Biological Significance
- Gene annotation: UniGene, LocusLink, Gene Ontology, KEGG, OMIM
- Single genes: gene summaries
- Gene lists: ontology report, pathway report

7. Data Publication
- MIAME: what is it?, publication
- Public databases: GEO, ArrayExpress, SMD
- Using public data: meta analysis

Academic partner: University of Washington

Thank You

www.genesifter.net - trial account, tutorials, sample data and Data Center

Eric Olson
eric@genesifter.net
206.283.4363