Research school methods seminar Genomics and Transcriptomics

Research school methods seminar Genomics and Transcriptomics Stephan Klee 19.11.2014

Genetics, Genomics: what are we talking about?

Genetics:
- Study of genes
- Role of genes in inheritance
- Study of single genes and their effects/resulting disease

Genomics:
- Study of all of a person's genes and the interplay of the genes
- Role of the interaction of genes with each other and with the environment (non-genetic factors)
- Study of complex diseases such as heart and lung diseases, diabetes and cancer
- Offers new options for personalized medicine (influence of risk factors)

Although the two look from different angles, both need to be considered to fully understand the whole picture.

How similar are we to ...?
Humans are 99.5 to 99.9% similar to each other (unrelated individuals, not relatives!)

Genomics - What we will deal with in this presentation
1. Methods of sequencing (from the beginnings to next generation to next next generation sequencing)
2. Applications (what can we do with the sequencing tools we've seen?)
3. (Analyzing your data): ask your bioinformatician of choice :D

Genomics - the 3 generations of sequencing

First generation:
- Chain-termination sequencing (Sanger sequencing)
- Shotgun sequencing

Second generation (next generation sequencing):
- Roche 454 sequencing (GS Junior System / GS FLX+ System)
- Applied Biosystems SOLiD (5500 System / 5500xl System)
- Solexa/Illumina (HiSeq System / Genome Analyzer IIx / MiSeq)
- Pacific Biosciences (PacBio RS)

Third generation (next next generation sequencing):
- Oxford Nanopore Technologies (GridION System / MinION)
- Helicos (Genetic Analysis System)

Genomics methods of the first generation - chain termination sequencing (Sanger sequencing) -
1) Denaturation of the dsDNA to ssDNA
2) Requires an initial primer
3) 4 separate reaction mixes (the only difference is the ddNTP added: either A, C, G or T)
4) Dideoxynucleotides lead to early chain termination
5) Separation on a polyacrylamide gel (each reaction in a different lane)
- Fragments were originally radioactively labeled for detection on the gel
- Feasible technique for read lengths of 100 to 1000bp
- Advances: radioactive labeling was replaced by fluorescently labeled ddNTPs (capillary-based sequencing)
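The logic of reading a Sanger gel can be illustrated with a small sketch. It is a toy model, not a simulation of the chemistry: it assumes every possible termination product is present, and the function names `sanger_lanes` and `read_gel` are made up for this example.

```python
def sanger_lanes(synthesized_strand):
    """Return, for each ddNTP reaction, the lengths of the terminated fragments.

    Toy assumption: in the reaction containing ddA, a fragment ends at every
    position where the newly synthesized strand carries an A, and so on.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence from shortest to longest fragment across the 4 lanes."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


lanes = sanger_lanes("GATTACA")
print(lanes["A"])       # fragment lengths seen in the ddA lane: [2, 5, 7]
print(read_gel(lanes))  # GATTACA
```

Each fragment length occurs in exactly one lane, which is why sorting the bands by size across the four lanes gives the sequence back.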

Genomics methods of the first generation - shotgun sequencing -
- Relies on Sanger sequencing, but is capable of sequencing whole genomes
- High-throughput sequencing technique that can collect a large amount of data at a fast rate
- Works by partially digesting a genome or a big strand of DNA into small overlapping fragments
- These small fragments are sequenced, and fragments that overlap are matched together
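The step "fragments that overlap are matched together" can be sketched as a greedy overlap merge. This is a minimal illustration under strong assumptions (error-free reads, all from the same strand, no repeats that confuse the merge). Real assemblers use more robust methods, and the function names here are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left: the remaining reads stay separate contigs
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


fragments = ["CGTACGTT", "ATGGCGTA", "CGTTAGC"]
print(greedy_assemble(fragments))  # ['ATGGCGTACGTTAGC']
```

If the fragments do not overlap sufficiently, the function returns several contigs instead of one sequence, which mirrors why overlapping fragments are required.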

Genomics methods of the second generation - Roche 454 sequencing -
- Oldest of the NGS technologies
- Current: `GS FLX Titanium` since late 2008
- Technology has been canceled, but it still has a widespread user base and niche applications
- FAST sequencing (<6h per run)
- Read length 300-1000bp (modal length ~700bp)
http://www.youtube.com/watch?v=bfnjxkhp8jc

Genomics methods of the second generation - Roche 454 sequencing -
- Fragmentation of DNA (600-800bp) and adapter ligation (red + green)
- Deposition in microreactors together with a bead carrying adapter sequences

Genomics methods of the second generation - Roche 454 sequencing -
- Binding of the fragment onto the bead
- Replication of fragments in the microreactor (polymerase etc. in solution)
- Replicas bind to free bead adapters
- Lysis of the microreactors and extraction of the fragment-covered beads

Genomics methods of the second generation - Roche 454 sequencing -
- Placement of beads in the PicoTiterPlate
- Filling of the wells with bead-bound reagents, especially the reagents responsible for creating the luminous signal (luciferase)

Genomics methods of the second generation - Roche 454 sequencing -
- Washing of the plate/wells with dNTPs, one at a time
- Recording of the light intensity that results from pyrophosphate release

Genomics methods of the second generation - Roche 454 sequencing -
- Image interpretation
- `Flow chart` (flowgram)
- Conversion to a textual representation of the sequence read per well
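The conversion from recorded light intensities to a read can be sketched as follows. It assumes that the signal in each nucleotide flow is roughly proportional to the number of bases incorporated, and that the flows cycle in the order T, A, C, G. The intensity values and the function name are invented for illustration.

```python
def flowgram_to_sequence(signals, flow_order="TACG"):
    """Convert per-flow light intensities into a base sequence.

    Assumption: an intensity near 1.0 means one incorporated base, near 2.0
    a homopolymer of two, near 0.0 no incorporation in that flow.
    """
    bases = []
    for flow_index, signal in enumerate(signals):
        nucleotide = flow_order[flow_index % len(flow_order)]
        homopolymer_length = int(signal + 0.5)  # round to the nearest integer
        bases.append(nucleotide * homopolymer_length)
    return "".join(bases)


# Hypothetical normalized intensities for eight flows (T, A, C, G, T, A, C, G)
signals = [1.02, 0.05, 2.10, 0.97, 0.03, 1.08, 0.00, 2.95]
print(flowgram_to_sequence(signals))  # TCCGAGGG
```

Because a long homopolymer is encoded in a single intensity value, a small measurement error (for example 5.4 versus 5.6) changes the called length, which is one way to see why homopolymer errors are typical for this method.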

Genomics methods of the second generation - Roche 454 sequencing -

Advantages:
- FAST sequencing (<6h per run)
- Read length 300-600bp (modal length ~500bp)
- Throughput: ~1 million reads, 400-600 Mbases per run (after quality filtering)

Areas of application:
- Whole genome sequencing
- Targeted resequencing
- Sequencing-based transcriptome analysis
- Metagenomics

Disadvantages:
- Homopolymer (poly-N) errors are common and require specific error handling
- Low throughput of 400-600 Mbases per run
- More expensive than competitors

Genomics methods of the second generation - Illumina -

Different platforms:
- HiSeq (1000/1500/2000/2500/X)
- MiSeq
- NextSeq

They differ in capabilities and throughput; the technology is the same.
- NextSeq: up to 150bp paired end (PE) and 120 Gbases / 1.5 days
- MiSeq: up to 300bp PE and 15 Gbases / 3 days
- HiSeq 2000: up to 125bp PE and 600 Gbases / 11 days
- HiSeq 2500: up to 125bp PE and 1 Tbases / 6 days
- HiSeq X: up to 150bp PE and 1.8 Tbases / 3 days

Mardis, Annu. Rev. Genomics Hum. Genet. 2008
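To compare the instruments on one scale, the run figures listed above can be converted into output per day. This is simple arithmetic on the slide's numbers (1 Tbase taken as 1000 Gbases); the dictionary and function names are chosen for this example only.

```python
# (output in Gbases, run time in days), taken from the platform list above
RUNS = {
    "NextSeq": (120, 1.5),
    "MiSeq": (15, 3),
    "HiSeq 2000": (600, 11),
    "HiSeq 2500": (1000, 6),
    "HiSeq X": (1800, 3),
}


def gbases_per_day(runs):
    """Return the average output per day of run time for each platform."""
    return {name: output / days for name, (output, days) in runs.items()}


for name, rate in sorted(gbases_per_day(RUNS).items(), key=lambda item: item[1]):
    print(f"{name:<11} {rate:7.1f} Gbases/day")
```

On these numbers the MiSeq yields 5 Gbases per day and the HiSeq X 600 Gbases per day, which shows how far the platforms are apart even though the chemistry is the same.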

Genomics methods of the second generation - Illumina -
- Fragmentation of sample + ligation of adapters (2 types) + size selection
- Binding of fragments onto the flow cell surface + initial replication
Figure labels: a) Adapter I; b) Adapter II; c) original fragment; d) unbound adapters on surface

Genomics methods of the second generation - Illumina -
- Bridge formation and polymerase activity using unlabeled dNTPs
- Final double-stranded bridge
Figure labels: a) full (surface-bound) adapters; b) incomplete adapters (from aborted polymerase activity: no more space!)

Genomics methods of the second generation - Illumina -
- Denaturing of the double-stranded bridge
- Repetition of bridging, amplification and denaturation until a `forest of fragments` exists
Figure label: a) identical (+/- strand) surface-bound copies of the DNA fragment

Genomics methods of the second generation - Illumina -
- Removal of adapter I bound fragments
- Addition of ddNTP-like labeled bases + primer (adapter I)
- Sequencing of base 1 (laser excitation, recording of fluorescence activity)

Genomics methods of the second generation - Illumina -
- Removal of the sequence elongation terminator
- Addition of ddNTP-like labeled bases
- Sequencing of base 2
- Processing of all recorded images into textual format

Genomics methods of the second generation - Illumina -

Advantages:
- Low error rate
- Lowest cost per base
- Tons of data

Disadvantages:
- Must run at very large scale
- Short read length (50-150bp)
- Runs take multiple days
- High startup costs
- De novo sequencing is difficult

Areas of application:
- DNA sequencing
- Gene regulation analysis
- Sequencing-based transcriptome analysis
- SNP and SV discovery
- Cytogenetic analysis
- ChIP-sequencing
- Small RNA discovery analysis

Genomics methods of the second generation - Pacific Biosciences SMRT sequencing -
- Pacific Biosciences SMRT (Single Molecule Real Time)
- Real-time sequencing with a surface-bound polymerase using labeled dNTPs
- Special labeling: the fluorophore is attached to the terminal phosphate
- Incorporation by the DNA polymerase releases the label, leaving a natural DNA strand behind
- Generates up to 4TB of raw data (per 30 minutes (!!))

Single-molecule sequencing has been developed to circumvent the 2 main biases of PCR-dependent sequencing (like 454, Illumina):
1) PCR introduces an uncontrolled bias in template representation because its efficiency varies as a function of template properties
2) PCR introduces errors (generating false-positive SNPs)

Genomics methods of the second generation - Pacific Biosciences SMRT sequencing -
ZMW: zero-mode waveguide

Genomics methods of the second generation - Pacific Biosciences SMRT sequencing -

Advantages:
- Very fast
- Can deliver really long reads (mean read length is >5500bp, longest reads can reach 30kb)
- 1 run is not really expensive (~400$ per run)

Areas of application:
- De novo assembly
- Targeted sequencing

Disadvantages:
- Only ~30000-50000 reads per SMRT Cell
- Need many runs for higher coverage
- High startup costs

http://www.youtube.com/watch?v=_ B_cUZ8hSYU
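Why many runs are needed for higher coverage follows from the usual estimate coverage = number of reads x mean read length / genome size. The sketch below plugs in the slide's figures (50000 reads per SMRT Cell, 5500bp mean read length). The two genome sizes are assumptions added for illustration (a 5 Mbase bacterial genome and roughly 3.1 Gbases for a human genome), as are the function names.

```python
import math


def coverage(reads, mean_read_length, genome_size):
    """Average depth of coverage: total sequenced bases / genome size."""
    return reads * mean_read_length / genome_size


def cells_needed(target_coverage, reads_per_cell, mean_read_length, genome_size):
    """Number of SMRT Cells required to reach the target average coverage."""
    bases_per_cell = reads_per_cell * mean_read_length
    return math.ceil(target_coverage * genome_size / bases_per_cell)


READS_PER_CELL = 50_000      # upper figure from the slide
MEAN_READ_LENGTH = 5_500     # bp, from the slide
BACTERIAL_GENOME = 5_000_000       # bp, assumed example size
HUMAN_GENOME = 3_100_000_000       # bp, approximate, assumed

print(coverage(READS_PER_CELL, MEAN_READ_LENGTH, BACTERIAL_GENOME))        # 55.0
print(cells_needed(30, READS_PER_CELL, MEAN_READ_LENGTH, HUMAN_GENOME))    # 339
```

Under these assumptions one SMRT Cell covers a small bacterial genome about 55-fold, while 30-fold coverage of a human-sized genome would take several hundred cells.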

Third generation sequencing - Oxford Nanopore MinION & GridION -
- Sequencing without fluorescent labels and without fragmenting the DNA
- Ions and the entire DNA strand are driven through a small nanopore, located in a synthetic polymer membrane, by a voltage difference
- The nanopore is the only way for current to cross the membrane
- Only a small sample volume is required

Third generation sequencing - Oxford Nanopore MinION & GridION -
- The inside of the nanopore is engineered for enhanced sensing
- For each triplet of nucleotides, a characteristic electrical signal caused by the ion flow is detected
- The current change can be measured directly
- The signal for each triplet (overlapping!) is recorded
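The remark that the recorded triplets overlap can be made concrete with a short sketch. It ignores the signal level entirely and assumes each triplet has already been identified without error. It only shows how consecutive overlapping triplets relate to the underlying sequence, and the function names are hypothetical.

```python
def overlapping_triplets(sequence):
    """List the triplets seen as the strand moves through the pore one base at a time."""
    return [sequence[i:i + 3] for i in range(len(sequence) - 2)]


def sequence_from_triplets(triplets):
    """Rebuild the sequence: each new triplet contributes exactly one new base."""
    if not triplets:
        return ""
    sequence = triplets[0]
    for previous, current in zip(triplets, triplets[1:]):
        if previous[1:] != current[:2]:
            raise ValueError(f"{previous} and {current} do not overlap by two bases")
        sequence += current[-1]
    return sequence


triplets = overlapping_triplets("GATTACA")
print(triplets)                          # ['GAT', 'ATT', 'TTA', 'TAC', 'ACA']
print(sequence_from_triplets(triplets))  # GATTACA
```

Since every base is part of up to three consecutive triplets, neighbouring signals carry redundant information, and a mismatch between adjacent triplets (the `ValueError` case here) points to a miscalled signal.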

Third generation sequencing - Oxford Nanopore MinION & GridION -

Advantages:
- Minimal sample preparation
- No requirement for polymerase or ligase
- Potential for very long read lengths
- It might well achieve the goal of $1000 per mammalian genome
- The instrument is inexpensive

Challenges/Disadvantages:
- Slowing down DNA translocation
- Improving the signal/noise ratio
- Potentially high error rate

Applications
DNA/RNA sequencing can be used for a variety of applications, including:
- Genome sequencing
  - De novo sequencing of genomes
  - Resequencing of genomes
- Detection of variants (SNPs) and mutations
- Exome sequencing
- Confirmation of clone constructs
- Detection of methylation events
- Gene expression studies (transcriptomics)
  - Whole transcriptome
  - RNA-seq / small RNA
- ChIP-Seq / PAR-CLIP

Applications - ChIP sequencing -
- ChIP-Seq is short for 'chromatin immunoprecipitation sequencing'
- Used to determine the influence of chromatin-associated proteins and transcription factors on the actual transcription
- DNA is usually wound up as 'chromatin'
- Unwinding is necessary for transcription (accessibility)
- Unwinding is carried out/aided by transcription factors and associated proteins
- How do these work? Where do they bind? => ChIP-Seq tries to deliver the answers.

Applications - ChIP sequencing -
- 'Cross-linking' or 'binding' of proteins to DNA

Applications - ChIP sequencing -
- Lyse the cells containing the cross-linked proteins

Applications - ChIP sequencing -
- Pulldown (magnetic beads)
- Wash off
- Undo the cross-linking
- Sequence the DNA (Illumina, SMRT)
- Result: the rough area (a region of 200-400bp) where the protein bound to / interacted with the DNA
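How sequenced fragments point to the rough binding area can be sketched as a pile-up: reads mapped to the genome are counted per position, and stretches with high read depth are reported. The read coordinates, the depth threshold and the function names are invented for this example. Real peak callers also model background signal, which is not attempted here.

```python
def coverage_profile(reads, region_length):
    """Count how many reads cover each position; reads are half-open (start, end)."""
    depth = [0] * region_length
    for start, end in reads:
        for position in range(max(start, 0), min(end, region_length)):
            depth[position] += 1
    return depth


def enriched_regions(depth, min_depth):
    """Return (start, end) intervals where the depth is at least min_depth."""
    regions = []
    region_start = None
    for position, value in enumerate(depth):
        if value >= min_depth and region_start is None:
            region_start = position
        elif value < min_depth and region_start is not None:
            regions.append((region_start, position))
            region_start = None
    if region_start is not None:
        regions.append((region_start, len(depth)))
    return regions


# Hypothetical mapped reads on a 700bp stretch of the genome
reads = [(0, 50), (100, 300), (150, 350), (200, 400), (250, 450), (600, 650)]
depth = coverage_profile(reads, 700)
print(enriched_regions(depth, min_depth=3))  # [(200, 350)]
```

The isolated reads at the edges do not reach the threshold, while the stack of overlapping reads marks one candidate region.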

Applications - gene expression studies transcriptomics -

Transcriptome:
- Set of all mRNAs present in a certain cell, tissue, organ, ...
- The mRNA level results from the intensity of transcription and from mRNA stability

Transcriptomics:
- Analysis of differences in the expression of gene populations under different conditions (treatment, development, disease)
- Also called expression profiling

Applications - gene expression studies transcriptomics -

Different types of RNA: mRNA (coding RNA), tRNA, rRNA, hnRNA, siRNA, miRNA

Coding RNA:
- Approximately 10,000 to 15,000 genes are active in every single cell
- The abundance of individual mRNAs differs
- Low-abundance mRNAs: about 1 copy per cell
- Highly abundant mRNAs: more than 3,000 copies per cell

Applications - gene expression studies transcriptomics -

Why analyze the transcriptome?
- Allows analysis of expression changes in different cell types and in different conditions of the environment/development
- Allows comparison of healthy vs. diseased state
- Identification of new genes

How to analyze the transcriptome?
Using different methods, depending on the source of the sample and the question to answer:
- Microarrays (Affymetrix, Illumina BeadChip technology)
- RNA-seq (454, Illumina)
- Real-time quantitative PCR
- Serial analysis of gene expression (SAGE)
- Massively parallel signature sequencing (MPSS)

Applications - gene expression studies transcriptomics -
Workflow of a DNA microarray

Applications - gene expression studies transcriptomics -
Result of a microarray: the heatmap
- 1 column = 1 sample
- 1 row = 1 gene
- Color key: indicates the relative expression

Applications - gene expression studies transcriptomics -

Advantages of microarrays:
- Affordable
- Fast technique
- No specific equipment required

Disadvantages of microarrays:
- Cross-hybridisation possible
- Sequences of the targeted genes need to be known
- Material intensive (solutions, RNA)
- Variation among laboratories

Advantages of RNA-seq:
- High resolution
- High-throughput
- High coverage of mRNA compared to microarrays
- New, unidentified exons can be detected
- Can handle all RNA isoforms

Disadvantages of RNA-seq:
- High costs
- Computational complexities
- Sample preparation can potentially induce bias (temperature-increased shredding of mRNA)

Thank you!