Introduction to Bioinformatics and Gene Expression Technologies


1 Introduction to Bioinformatics and Gene Expression Technologies. Utah State University, Fall 2017. Statistical Bioinformatics (Biomedical Big Data), Notes 1

2 Vocabulary
Gene: hereditary DNA sequence at a specific location on a chromosome (that "does something")
Genetics: study of heredity & variation in organisms
Genome: an organism's total genetic content (full DNA sequence)
Genomics: study of organisms in terms of their genome

3 Vocabulary
Protein: sequence of amino acids that "does something"
Proteomics: study of all of the proteins that can come from an organism's genome
Phylogeny: the evolutionary or historical development of an organism (or its DNA sequence)
Phylogenetics: the study of an organism's phylogeny
Phenotype: the physical characteristic of interest in each individual; for example, plant height, disease status, or embryo type

4 Vocabulary
Bioinformatics: the collection, organization, & analysis of large-scale, complex biological data
Statistical Bioinformatics: the application of statistical approaches to bioinformatics, especially in identifying significant changes (in sequences, expression patterns, etc.) that are biologically relevant (especially in affecting the phenotype)

5 Central Dogma of Molecular Biology

6 A road map to bioinformatics
Central Dogma | Type of Study or Analysis | Technology
Gene | Genome (genomic hypothesis, genotype, QTL) | Sequencing
mRNA transcript | Transcript profiling, transcriptome | Microarrays or next-gen sequencing (epigenetics / methylation)
Protein | Protein quantification and function, proteome | Protein microarrays or proteomics
Phenotype
(From introductory lecture by RW Doerge at 2013 Joint Statistical Meetings)

7 Alphabets
DNA sequences are defined by nucleotides (4)
DNA sequence, mRNA sequence, protein sequence
Protein sequences are defined by amino acids (20)
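
The two alphabets can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: the DNA string is made up, the codon table is a deliberately tiny subset of the standard genetic code, and the function names `transcribe` and `translate` are hypothetical helpers, not part of any course software.

```python
# A tiny subset of the standard genetic code (illustration only;
# the full table has 64 codons).
CODON_TABLE = {
    "AUG": "M",                                      # methionine (start)
    "UUU": "F", "UUC": "F",                          # phenylalanine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "UAA": "*", "UAG": "*", "UGA": "*",              # stop codons
}


def transcribe(dna_coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate codon by codon until a stop codon or an unknown codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3])
        if amino_acid is None or amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGTTTGGCTAA"        # made-up example sequence (4-letter alphabet)
mrna = transcribe(dna)      # still a 4-letter alphabet, with U instead of T
protein = translate(mrna)   # drawn from the 20-letter amino acid alphabet
print(dna, mrna, protein)
```

Each group of three nucleotides maps to one amino acid, which is why a 4-letter alphabet is enough to encode a 20-letter one.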

8 General assumption of gene expression technology
Use the mRNA transcript abundance level as a measure of the level of expression for the corresponding gene; abundance is taken to be proportional to the degree of gene expression
Side note: a methylated gene is silenced (no expression)

9 How to measure mRNA abundance?
Several different approaches with similar themes, differing in the representation of genes on the slide:
Affymetrix GeneChip, Nimblegen array (oligonucleotide arrays): small portion of gene ("oligo")
Two-color cDNA array: larger sequence of gene
More modern, next-generation sequencing (NGS): blank slate

10 General DNA sequencing
Sanger (1970s to today): most reliable, but expensive
Next-generation [high-throughput] (NGS): Genome Sequencer FLX (GS FLX, by 454 Sequencing); Illumina's Solexa Genome Analyzer; Applied Biosystems SOLiD platform; others
Key aspect: sequence (and identify) all sequences present

11 Common features of NGS technologies (1)
Fragment prepared genomic material: a biological system's RNA molecules (RNA-Seq); DNA or RNA interaction regions (ChIP-Seq, HITS-CLIP); others
Sequence these fragments (at least partially); this produces HUGE data files (~10 million fragments sequenced)

12 Common features of NGS technologies (2)
Align sequenced fragments with a reference sequence: usually a known target genome (GIGO: garbage in, garbage out); alignment tools include ELAND, MAQ, SOAP, Bowtie, and others; often done with command-line tools; still a major computational challenge
Count the number of fragments mapping to certain regions (usually genes); these read counts linearly approximate target transcript abundance
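
The counting step can be sketched in a few lines. This is a minimal illustration, not how any particular alignment or counting tool works: the gene coordinates, the read positions, and the function name `count_reads` are all hypothetical, and a read is assigned to a gene simply when its aligned start position falls inside the gene's interval.

```python
# Hypothetical gene annotations: name -> (chromosome, start, end),
# with 1-based inclusive coordinates.
GENES = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
    "geneC": ("chr2", 50, 300),
}

# Hypothetical aligned reads: (chromosome, leftmost aligned position).
READS = [
    ("chr1", 120), ("chr1", 130), ("chr1", 499),
    ("chr1", 650),                 # falls between genes, so it is not counted
    ("chr1", 900),
    ("chr2", 75), ("chr2", 299),
]


def count_reads(genes, reads):
    """Count reads whose start position falls inside each gene's interval."""
    counts = {name: 0 for name in genes}
    for chrom, pos in reads:
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start <= pos <= end:
                counts[name] += 1
    return counts


counts = count_reads(GENES, READS)
print(counts)
```

The resulting table of counts, one per gene per sample, is the raw material for the statistical analysis; real pipelines also have to decide how to treat reads that overlap several genes or map to several locations.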

13 Here, RNA-Seq
Recall the central dogma: DNA → mRNA → protein → action; the goal is to quantify [mRNA] transcript abundance
Isolate RNA from cells, fragment at random positions, and copy into cDNA
Attach adapters to the ends of the cDNA fragments, and bind to a flow cell (Illumina has a glass slide with 8 such lanes, so it can process 8 samples on one slide)
Amplify cDNA fragments in a certain size range (e.g., bases) using PCR, giving clusters of the same fragment
Sequence base-by-base for all clusters in parallel

14 (originally illumina.com download)

15 (originally illumina.com download)

16 (originally illumina.com download)

17 (originally illumina.com download)

18 Then align and map
For the sequence at each cluster, compare to [align with] the reference genome; file format: millions of clusters per lane, approx. 1 GB file size per lane
For regions of interest in the reference genome (genes, here), count the number of clusters mapping there; this requires a well-studied and well-documented genome

19 RNA-Seq Example: 8 patients, 56,621 genes
8 heart tissue samples: 4 control (no heart disease) and 4 cardiomyopathy (heart disease), of which 2 restrictive (contracts okay, relaxes abnormally) and 2 dilated (enlarged left ventricle)
These Naples data were made public Nov 2015 by the Institute of Genetics and Biophysics (Naples, Italy)
(Count table: one row per gene, labeled by Ensembl gene ID (ENSG...); one column per sample: Ctrl_3, RCM_3, Ctrl_4, DCM_4, Ctrl_5, RCM_5, Ctrl_6, DCM_6)

20 Common statistical research objectives
Test each gene (row) for differential expression between conditions: Ctrl vs. non-Ctrl, Dilated vs. Restrictive, Restrictive vs. Ctrl, etc.
Test specific groups of genes (with a known common function) for overall expression differences between conditions: which functions are differentially active between Ctrl and non-Ctrl, for example?
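
As a rough sketch of the first objective, the example below tests each gene (row) separately for a difference between two conditions. It uses an exact permutation test on the difference in group means, chosen here only because it needs no distributional assumptions and no extra libraries; it is not presented as the method used in this course. The expression values, gene names, and the function name `permutation_p_value` are all made up for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling n_a of the pooled samples as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Made-up log2 expression values: 4 control and 4 disease samples per gene.
genes = {
    "gene_1": ([5.1, 5.3, 4.9, 5.2], [7.0, 7.4, 6.8, 7.1]),  # clearly shifted
    "gene_2": ([5.0, 6.0, 5.5, 6.5], [5.2, 6.1, 5.4, 6.6]),  # little difference
}

p_values = {
    name: permutation_p_value(ctrl, disease)
    for name, (ctrl, disease) in genes.items()
}
for name, p in p_values.items():
    print(name, round(p, 4))
```

With 4 samples per group there are only 70 possible relabelings, so the smallest two-sided p-value this test can return is 2/70. That floor is one reason small-sample studies with tens of thousands of genes call for more specialized models.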

21 A short word on bioinformatic technologies
"Never marry a technology, because it will always leave you." Scott Tingey, Director of Genetic Discovery at DuPont (shared in RW Doerge's introductory overview lecture at the 2013 JSM)
In this class, we will discuss only a couple of technologies, emphasizing their recurring statistical issues; these are perpetual (and compounding)

22 A Rough Timeline of Technologies
(1995+) Microarrays: require probes fixed in advance; only set up to detect those
(2005+) Next-Generation Sequencing (NGS): typically involves amplification of genomic material (PCR)
(2010+) Third-Generation Sequencing ("next-next-generation"): Pac Bio, Ion Torrent; no amplification needed, can sequence a single molecule; longer reads possible; still (2013; 2016) showing high errors
(2012+) Nanopore-Based Sequencing [very promising]: Oxford Nanopore, Genia, others; bases identified as the whole molecule slips through a nanoscale hole (like threading a needle); coupled with disposable cartridges; still (2013; 2016) under development
(?+) more
These differ in how sequencing is done; the subsequent post-alignment statistical analysis is basically the same (see the 2016 Goodwin et al. paper on the Canvas course page, in Files)

23 Affymetrix Technology: GeneChip
Each gene is represented by a unique set of probe pairs (usually probe pairs per probe set)
Each spot on the array represents a single probe (with millions of copies)
These probes are fixed to the array
(Image courtesy Affymetrix)

24 Affymetrix Technology: Expression
A tissue sample is prepared so that its mRNA has fluorescent tags; wait for hybridization; scan to light the tags
(Images courtesy Affymetrix)

25 Affymetrix GeneChip (image courtesy Affymetrix)

26 Cartoon Representations (originally from Affymetrix outreach)
Animation 1: GeneChip structure (1 min.)
Animation 2: Measuring gene expression (2.5 min.)

27 Images; Affymetrix data is probe intensity
Full array image; close-up of array image (images courtesy Affymetrix)

28 How to analyze data meaningfully?
Consider (for any technology):
Data quality
Data distribution
Data format & organization
Appropriateness of measurement methods (& variance)
Sources of variability (and their types)
Appropriate models to account for sources of variability and address the question of interest
Meaning of P-values and appropriate tests of significance
Statistical significance vs. biological relevance
Appropriate and useful representation of results
Many useful tools are available from Bioconductor

29 The Bioconductor Project
Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data
Not just for RNA-Seq or microarray data
Like a living family of software packages, changing with needs
Core team mainly at Fred Hutchinson Cancer Research Center, plus many other U.S. and international institutions

30 Main Features of the Bioconductor Project
Use of R; documentation and reproducible research; statistical and graphical methods; annotation; short courses; open source; open development

31 What will we do in this class?
Learn the basics of a few major Bioconductor tools
Focus on statistical issues
Discuss recent developments
Learn to discuss all of this

Functional Genomics Overview RORY STARK PRINCIPAL BIOINFORMATICS ANALYST CRUK CAMBRIDGE INSTITUTE 18 SEPTEMBER 2017

Functional Genomics Overview RORY STARK PRINCIPAL BIOINFORMATICS ANALYST CRUK CAMBRIDGE INSTITUTE 18 SEPTEMBER 2017 Functional Genomics Overview RORY STARK PRINCIPAL BIOINFORMATICS ANALYST CRUK CAMBRIDGE INSTITUTE 18 SEPTEMBER 2017 Agenda What is Functional Genomics? RNA Transcription/Gene Expression Measuring Gene

More information

Gene Expression Technology

Gene Expression Technology Gene Expression Technology Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Gene expression Gene expression is the process by which information from a gene

More information

Lecture #1. Introduction to microarray technology

Lecture #1. Introduction to microarray technology Lecture #1 Introduction to microarray technology Outline General purpose Microarray assay concept Basic microarray experimental process cdna/two channel arrays Oligonucleotide arrays Exon arrays Comparing

More information

Bioinformatics Advice on Experimental Design

Bioinformatics Advice on Experimental Design Bioinformatics Advice on Experimental Design Where do I start? Please refer to the following guide to better plan your experiments for good statistical analysis, best suited for your research needs. Statistics

More information

Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods

Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods Companion lecture to the textbook: Fundamentals of BioMEMS and Medical Microdevices, by Prof., http://saliterman.umn.edu/

More information

Next Gen Sequencing. Expansion of sequencing technology. Contents

Next Gen Sequencing. Expansion of sequencing technology. Contents Next Gen Sequencing Contents 1 Expansion of sequencing technology 2 The Next Generation of Sequencing: High-Throughput Technologies 3 High Throughput Sequencing Applied to Genome Sequencing (TEDed CC BY-NC-ND

More information

Outline. General principles of clonal sequencing Analysis principles Applications CNV analysis Genome architecture

Outline. General principles of clonal sequencing Analysis principles Applications CNV analysis Genome architecture The use of new sequencing technologies for genome analysis Chris Mattocks National Genetics Reference Laboratory (Wessex) NGRL (Wessex) 2008 Outline General principles of clonal sequencing Analysis principles

More information

Recent technology allow production of microarrays composed of 70-mers (essentially a hybrid of the two techniques)

Recent technology allow production of microarrays composed of 70-mers (essentially a hybrid of the two techniques) Microarrays and Transcript Profiling Gene expression patterns are traditionally studied using Northern blots (DNA-RNA hybridization assays). This approach involves separation of total or polya + RNA on

More information

3.1.4 DNA Microarray Technology

3.1.4 DNA Microarray Technology 3.1.4 DNA Microarray Technology Scientists have discovered that one of the differences between healthy and cancer is which genes are turned on in each. Scientists can compare the gene expression patterns

More information

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio

More information

DNA Arrays Affymetrix GeneChip System

DNA Arrays Affymetrix GeneChip System DNA Arrays Affymetrix GeneChip System chip scanner Affymetrix Inc. hybridization Affymetrix Inc. data analysis Affymetrix Inc. mrna 5' 3' TGTGATGGTGGGAATTGGGTCAGAAGGACTGTGGGCGCTGCC... GGAATTGGGTCAGAAGGACTGTGGC

More information

Next Generation Sequencing: An Overview

Next Generation Sequencing: An Overview Next Generation Sequencing: An Overview Cavan Reilly November 13, 2017 Table of contents Next generation sequencing NGS and microarrays Study design Quality assessment Burrows Wheeler transform Next generation

More information

Microarray Technique. Some background. M. Nath

Microarray Technique. Some background. M. Nath Microarray Technique Some background M. Nath Outline Introduction Spotting Array Technique GeneChip Technique Data analysis Applications Conclusion Now Blind Guess? Functional Pathway Microarray Technique

More information

Next-Generation Sequencing. Technologies

Next-Generation Sequencing. Technologies Next-Generation Next-Generation Sequencing Technologies Sequencing Technologies Nicholas E. Navin, Ph.D. MD Anderson Cancer Center Dept. Genetics Dept. Bioinformatics Introduction to Bioinformatics GS011062

More information

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio

More information

EECS730: Introduction to Bioinformatics

EECS730: Introduction to Bioinformatics EECS730: Introduction to Bioinformatics Lecture 14: Microarray Some slides were adapted from Dr. Luke Huan (University of Kansas), Dr. Shaojie Zhang (University of Central Florida), and Dr. Dong Xu and

More information

Welcome to the NGS webinar series

Welcome to the NGS webinar series Welcome to the NGS webinar series Webinar 1 NGS: Introduction to technology, and applications NGS Technology Webinar 2 Targeted NGS for Cancer Research NGS in cancer Webinar 3 NGS: Data analysis for genetic

More information

What we ll do today. Types of stem cells. Do engineered ips and ES cells have. What genes are special in stem cells?

What we ll do today. Types of stem cells. Do engineered ips and ES cells have. What genes are special in stem cells? Do engineered ips and ES cells have similar molecular signatures? What we ll do today Research questions in stem cell biology Comparing expression and epigenetics in stem cells asuring gene expression

More information

Whole Transcriptome Analysis of Illumina RNA- Seq Data. Ryan Peters Field Application Specialist

Whole Transcriptome Analysis of Illumina RNA- Seq Data. Ryan Peters Field Application Specialist Whole Transcriptome Analysis of Illumina RNA- Seq Data Ryan Peters Field Application Specialist Partek GS in your NGS Pipeline Your Start-to-Finish Solution for Analysis of Next Generation Sequencing Data

More information

Introductory Next Gen Workshop

Introductory Next Gen Workshop Introductory Next Gen Workshop http://www.illumina.ucr.edu/ http://www.genomics.ucr.edu/ Workshop Objectives Workshop aimed at those who are new to Illumina sequencing and will provide: - a basic overview

More information

Do engineered ips and ES cells have similar molecular signatures?

Do engineered ips and ES cells have similar molecular signatures? Do engineered ips and ES cells have similar molecular signatures? Comparing expression and epigenetics in stem cells George Bell, Ph.D. Bioinformatics and Research Computing 2012 Spring Lecture Series

More information

Methods of Biomaterials Testing Lesson 3-5. Biochemical Methods - Molecular Biology -

Methods of Biomaterials Testing Lesson 3-5. Biochemical Methods - Molecular Biology - Methods of Biomaterials Testing Lesson 3-5 Biochemical Methods - Molecular Biology - Chromosomes in the Cell Nucleus DNA in the Chromosome Deoxyribonucleic Acid (DNA) DNA has double-helix structure The

More information

Outline. Analysis of Microarray Data. Most important design question. General experimental issues

Outline. Analysis of Microarray Data. Most important design question. General experimental issues Outline Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization Introduction to microarrays Experimental design Data normalization Other data transformation Exercises George Bell,

More information

Gene Regulation Solutions. Microarrays and Next-Generation Sequencing

Gene Regulation Solutions. Microarrays and Next-Generation Sequencing Gene Regulation Solutions Microarrays and Next-Generation Sequencing Gene Regulation Solutions The Microarrays Advantage Microarrays Lead the Industry in: Comprehensive Content SurePrint G3 Human Gene

More information

Research school methods seminar Genomics and Transcriptomics

Research school methods seminar Genomics and Transcriptomics Research school methods seminar Genomics and Transcriptomics Stephan Klee 19.11.2014 2 3 4 5 Genetics, Genomics what are we talking about? Genetics and Genomics Study of genes Role of genes in inheritence

More information

DNA Microarray Data Oligonucleotide Arrays

DNA Microarray Data Oligonucleotide Arrays DNA Microarray Data Oligonucleotide Arrays Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor Short Course 2003 Copyright 2002, all rights reserved Biological question Experimental

More information

Third Generation Sequencing

Third Generation Sequencing Third Generation Sequencing By Mohammad Hasan Samiee Aref Medical Genetics Laboratory of Dr. Zeinali History of DNA sequencing 1953 : Discovery of DNA structure by Watson and Crick 1973 : First sequence

More information

Genome Sequencing. I: Methods. MMG 835, SPRING 2016 Eukaryotic Molecular Genetics. George I. Mias

Genome Sequencing. I: Methods. MMG 835, SPRING 2016 Eukaryotic Molecular Genetics. George I. Mias Genome Sequencing I: Methods MMG 835, SPRING 2016 Eukaryotic Molecular Genetics George I. Mias Department of Biochemistry and Molecular Biology gmias@msu.edu Sequencing Methods Cost of Sequencing Wetterstrand

More information

Introduction to Microarray Data Analysis and Gene Networks. Alvis Brazma European Bioinformatics Institute

Introduction to Microarray Data Analysis and Gene Networks. Alvis Brazma European Bioinformatics Institute Introduction to Microarray Data Analysis and Gene Networks Alvis Brazma European Bioinformatics Institute A brief outline of this course What is gene expression, why it s important Microarrays and how

More information

Technical Review. Real time PCR

Technical Review. Real time PCR Technical Review Real time PCR Normal PCR: Analyze with agarose gel Normal PCR vs Real time PCR Real-time PCR, also known as quantitative PCR (qpcr) or kinetic PCR Key feature: Used to amplify and simultaneously

More information

Analysis of Microarray Data

Analysis of Microarray Data Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Introduction

More information

Goals of pharmacogenomics

Goals of pharmacogenomics Goals of pharmacogenomics Use drugs better and use better drugs! People inherit/exhibit differences in drug: Absorption Metabolism and degradation of the drug Transport of drug to the target molecule Excretion

More information

Human genome sequence

Human genome sequence NGS: the basics Human genome sequence June 26th 2000: official announcement of the completion of the draft of the human genome sequence (truly finished in 2004) Francis Collins Craig Venter HGP: 3 billion

More information

Analysing genomes and transcriptomes using Illumina sequencing

Analysing genomes and transcriptomes using Illumina sequencing Analysing genomes and transcriptomes using Illumina uencing Dr. Heinz Himmelbauer Centre for Genomic Regulation (CRG) Ultrauencing Unit Barcelona The Sequencing Revolution High-Throughput Sequencing 2000

More information

Next Generation Sequencing. Jeroen Van Houdt - Leuven 13/10/2017

Next Generation Sequencing. Jeroen Van Houdt - Leuven 13/10/2017 Next Generation Sequencing Jeroen Van Houdt - Leuven 13/10/2017 Landmarks in DNA sequencing 1953 Discovery of DNA double helix structure 1977 A Maxam and W Gilbert "DNA seq by chemical degradation" F Sanger"DNA

More information

Measuring transcriptomes with RNA-Seq

Measuring transcriptomes with RNA-Seq Measuring transcriptomes with RNA-Seq BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2017 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC

More information

Philippe Hupé 1,2. The R User Conference 2009 Rennes

Philippe Hupé 1,2. The R User Conference 2009 Rennes A suite of R packages for the analysis of DNA copy number microarray experiments Application in cancerology Philippe Hupé 1,2 1 UMR144 Institut Curie, CNRS 2 U900 Institut Curie, INSERM, Mines Paris Tech

More information

CSC Assignment1SequencingReview- 1109_Su N_NEXT_GENERATION_SEQUENCING.docx By Anonymous. Similarity Index

CSC Assignment1SequencingReview- 1109_Su N_NEXT_GENERATION_SEQUENCING.docx By Anonymous. Similarity Index Page 1 of 6 Document Viewer TurnitinUK Originality Report Processed on: 05-Dec-20 10:49 AM GMT ID: 13 Word Count: 1587 Submitted: 1 CSC8313-201 - Assignment1SequencingReview- 1109_Su N_NEXT_GENERATION_SEQUENCING.docx

More information

Sequencing techniques and applications

Sequencing techniques and applications I519 Introduction to Bioinformatics Sequencing techniques and applications Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Contents Sequencing techniques Sanger sequencing Next generation

More information

less sensitive than RNA-seq but more robust analysis pipelines expensive but quantitiatve standard but typically not high throughput

less sensitive than RNA-seq but more robust analysis pipelines expensive but quantitiatve standard but typically not high throughput Chapter 11: Gene Expression The availability of an annotated genome sequence enables massively parallel analysis of gene expression. The expression of all genes in an organism can be measured in one experiment.

More information

An introduction to RNA-Seq. Brian J. Knaus USDA Forest Service Pacific Northwest Research Station

An introduction to RNA-Seq. Brian J. Knaus USDA Forest Service Pacific Northwest Research Station An introduction to RNA-Seq Brian J. Knaus USDA Forest Service Pacific Northwest Research Station 1 Doerge, R.W. 2002. Nature Reviews Genetics 3: 43-53. An introduction to RNA-Seq Tissue collection RNA

More information

Class Information. Introduction to Genome Biology and Microarray Technology. Biostatistics Rafael A. Irizarry. Lecture 1

Class Information. Introduction to Genome Biology and Microarray Technology. Biostatistics Rafael A. Irizarry. Lecture 1 This work is licensed under a Creative Commons ttribution-noncommercial-sharelike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Introductie en Toepassingen van Next-Generation Sequencing in de Klinische Virologie. Sander van Boheemen Medical Microbiology

Introductie en Toepassingen van Next-Generation Sequencing in de Klinische Virologie. Sander van Boheemen Medical Microbiology Introductie en Toepassingen van Next-Generation Sequencing in de Klinische Virologie Sander van Boheemen Medical Microbiology Next-generation sequencing Next-generation sequencing (NGS), also known as

More information

QIAGEN s NGS Solutions for Biomarkers NGS & Bioinformatics team QIAGEN (Suzhou) Translational Medicine Co.,Ltd

QIAGEN s NGS Solutions for Biomarkers NGS & Bioinformatics team QIAGEN (Suzhou) Translational Medicine Co.,Ltd QIAGEN s NGS Solutions for Biomarkers NGS & Bioinformatics team QIAGEN (Suzhou) Translational Medicine Co.,Ltd 1 Our current NGS & Bioinformatics Platform 2 Our NGS workflow and applications 3 QIAGEN s

More information

Computational Biology I LSM5191

Computational Biology I LSM5191 Computational Biology I LSM5191 Lecture 5 Notes: Genetic manipulation & Molecular Biology techniques Broad Overview of: Enzymatic tools in Molecular Biology Gel electrophoresis Restriction mapping DNA

More information

Next Generation Sequencing Lecture Saarbrücken, 19. March Sequencing Platforms

Next Generation Sequencing Lecture Saarbrücken, 19. March Sequencing Platforms Next Generation Sequencing Lecture Saarbrücken, 19. March 2012 Sequencing Platforms Contents Introduction Sequencing Workflow Platforms Roche 454 ABI SOLiD Illumina Genome Anlayzer / HiSeq Problems Quality

More information

Next Generation Sequencing (NGS) Market Size, Growth and Trends ( )

Next Generation Sequencing (NGS) Market Size, Growth and Trends ( ) Next Generation Sequencing (NGS) Market Size, Growth and Trends (2014-2020) July, 2017 4 th edition Information contained in this market report is believed to be reliable at the time of publication. DeciBio

More information

Engineering Genetic Circuits

Engineering Genetic Circuits Engineering Genetic Circuits I use the book and slides of Chris J. Myers Lecture 0: Preface Chris J. Myers (Lecture 0: Preface) Engineering Genetic Circuits 1 / 19 Samuel Florman Engineering is the art

More information

Feature Selection of Gene Expression Data for Cancer Classification: A Review

Feature Selection of Gene Expression Data for Cancer Classification: A Review Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 50 (2015 ) 52 57 2nd International Symposium on Big Data and Cloud Computing (ISBCC 15) Feature Selection of Gene Expression

More information

Typical probes. Slides per pack Aminosilane. Long oligo- Slide AStar None D surface. nucleotides

Typical probes. Slides per pack Aminosilane. Long oligo- Slide AStar None D surface. nucleotides Aminosilane coating Nexterion Slide A+ and Slide AStar Overview Type of coating Immobilization method Typical probes Ordering information Nexterion product Barcode option Item number Slides per pack Aminosilane

More information

Functional Genomics in Plants

Functional Genomics in Plants Functional Genomics in Plants Jeffrey L Bennetzen, Purdue University, West Lafayette, Indiana, USA Functional genomics refers to a suite of genetic technologies that will contribute to a comprehensive

More information

Next Generation Sequencing Technologies. Some slides are modified from Robi Mitra s lecture notes

Next Generation Sequencing Technologies. Some slides are modified from Robi Mitra s lecture notes Next Generation Sequencing Technologies Some slides are modified from Robi Mitra s lecture notes What will you do to understand a disease? What will you do to understand a disease? Genotype Phenotype Hypothesis

More information

Humboldt Universität zu Berlin. Grundlagen der Bioinformatik SS Microarrays. Lecture

Humboldt Universität zu Berlin. Grundlagen der Bioinformatik SS Microarrays. Lecture Humboldt Universität zu Berlin Microarrays Grundlagen der Bioinformatik SS 2017 Lecture 6 09.06.2017 Agenda 1.mRNA: Genomic background 2.Overview: Microarray 3.Data-analysis: Quality control & normalization

More information

NPTEL VIDEO COURSE PROTEOMICS PROF. SANJEEVA SRIVASTAVA

NPTEL VIDEO COURSE PROTEOMICS PROF. SANJEEVA SRIVASTAVA LECTURE-03 GENOMICS AND TRANSCRIPTOMICS: WHY PROTEOMICS? TRANSCRIPT Welcome to the proteomics course. Today, we will talk about Genomics and Transcriptomics and then we will talk about why to study proteomics?

More information

DNA-Sequencing. Technologies & Devices. Matthias Platzer. Genome Analysis Leibniz Institute on Aging - Fritz Lipmann Institute (FLI)

DNA-Sequencing. Technologies & Devices. Matthias Platzer. Genome Analysis Leibniz Institute on Aging - Fritz Lipmann Institute (FLI) DNA-Sequencing Technologies & Devices Matthias Platzer Genome Analysis Leibniz Institute on Aging - Fritz Lipmann Institute (FLI) Genome analysis DNA sequencing platforms ABI 3730xl 4/2004 & 6/2006 1 Mb/day,

More information

Data Sheet. GeneChip Human Genome U133 Arrays

Data Sheet. GeneChip Human Genome U133 Arrays GeneChip Human Genome Arrays AFFYMETRIX PRODUCT FAMILY > ARRAYS > Data Sheet GeneChip Human Genome U133 Arrays The Most Comprehensive Coverage of the Human Genome in Two Flexible Formats: Single-array

More information

Introduction to Molecular Biology

Introduction to Molecular Biology Introduction to Molecular Biology Bioinformatics: Issues and Algorithms CSE 308-408 Fall 2007 Lecture 2-1- Important points to remember We will study: Problems from bioinformatics. Algorithms used to solve

More information

DNA-Sequencing. Technologies & Devices. Matthias Platzer. Genome Analysis Leibniz Institute on Aging - Fritz Lipmann Institute (FLI)

DNA-Sequencing. Technologies & Devices. Matthias Platzer. Genome Analysis Leibniz Institute on Aging - Fritz Lipmann Institute (FLI) DNA-Sequencing Technologies & Devices Matthias Platzer Genome Analysis Leibniz Institute on Aging - Fritz Lipmann Institute (FLI) Genome analysis DNA sequencing platforms ABI 3730xl 4/2004 & 6/2006 1 Mb/day,

More information

Introduction to Bioinformatics. Fabian Hoti 6.10.

Introduction to Bioinformatics. Fabian Hoti 6.10. Introduction to Bioinformatics Fabian Hoti 6.10. Analysis of Microarray Data Introduction Different types of microarrays Experiment Design Data Normalization Feature selection/extraction Clustering Introduction

More information

Chapter 15 Gene Technologies and Human Applications

Chapter 15 Gene Technologies and Human Applications Chapter Outline Chapter 15 Gene Technologies and Human Applications Section 1: The Human Genome KEY IDEAS > Why is the Human Genome Project so important? > How do genomics and gene technologies affect

More information
