Topic: Genetic Engineering. Erwin R. Schmidt, Institut für Molekulargenetik. Lecture #

Topic: Genetic Engineering. Erwin R. Schmidt, Institut für Molekulargenetik. Lecture #10, 01.07.2014

Pyrosequencing

The Pyrosequencing technology is a relatively new DNA sequencing method originally developed here at KTH at the Department of Biotechnology. The technology has been commercialized and is today marketed by Biotage AB. The technique utilizes the cooperativity between four different enzymes and the phenomenon of bioluminescence to monitor the incorporation of nucleotides into the DNA. A short description of the steps in the Pyrosequencing process is given below.

Initial step: The reaction mixture consists of the four enzymes (DNA polymerase, ATP sulfurylase, luciferase and apyrase), the different substrates needed for the reactions, and the single-stranded DNA to be sequenced.

Step 1 - Polymerase: One of the four nucleotides (dNTP: dATP, dCTP, dGTP, dTTP) is added to the reaction mixture. If the added nucleotide is complementary to the base in the DNA strand, it is incorporated and inorganic pyrophosphate (PPi) is released.

Step 2 - ATP sulfurylase: The PPi is converted into ATP by the enzyme ATP sulfurylase.

Step 3 - Luciferase: The luciferase catalyzes a reaction in which ATP is used to generate light. The amount of light is proportional to the amount of ATP, and hence, via the PPi, also proportional to the number of incorporated nucleotides. The light is then detected by a CCD camera.

Step 4 - Apyrase: Remaining dNTP and ATP are degraded by the apyrase before the next nucleotide in the iterative cycle is added to the reaction mixture.

My research is devoted to developing a good mathematical model of the reaction system. This will help us to understand the mechanisms governing the system in detail. Once a satisfactory model has been developed, it can be used to optimize the method with respect to substrate and enzyme concentrations as well as the choice of enzymes (kinetic parameters). As the demand for even better DNA sequencing techniques is steadily increasing and new applications arise, there is a lot to gain by optimization.
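The iterative cycle described above can be illustrated with a small simulation. This is only a minimal sketch under idealised assumptions: every complementary nucleotide is incorporated completely, the light signal is exactly proportional to the number of incorporated nucleotides, and apyrase removes all leftovers between additions. The function names, the cyclic dispensation order "ACGT" and the example templates are illustrative choices and do not come from the lecture.

```python
# Idealised pyrosequencing model: one nucleotide species is dispensed per flow,
# and the signal equals the number of nucleotides incorporated in that flow.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrogram(template, dispensation_order="ACGT", max_flows=100):
    """Return a list of (nucleotide, signal) pairs.

    `template` lists the template bases in the order in which the
    polymerase encounters them (an assumption made for simplicity).
    """
    flows = []
    pos = 0
    flow_index = 0
    while pos < len(template) and flow_index < max_flows:
        nucleotide = dispensation_order[flow_index % len(dispensation_order)]
        incorporated = 0
        # A homopolymer stretch is extended completely within a single flow,
        # so the signal grows with the length of the stretch.
        while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
            incorporated += 1
            pos += 1
        flows.append((nucleotide, incorporated))
        flow_index += 1
    return flows


def read_from_flows(flows):
    """Reconstruct the synthesised strand from the idealised signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)


if __name__ == "__main__":
    flows = pyrogram("GGGA")
    print(flows)
    print(read_from_flows(flows))
```

In this model a run of identical bases produces one proportionally larger signal rather than several separate ones. In a real instrument, distinguishing a signal of 6 from a signal of 7 becomes unreliable, which is why long homopolymer stretches are a known weakness of the method.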

Next Generation Sequencing (NGS). Erwin R. Schmidt, Institut für Molekulargenetik, Johannes Gutenberg Universität Mainz

DNA sequencing:
- A brief historical overview
- The different platforms of NGS
- Benchtop versus high output
- Cost and reliability
- Future technologies
- Summary

NGS: Short history of (nucleotide) sequencing. How many generations do we have?

First generation sequencing: nearest neighbor technology, combined with sequence-specific or base-specific nuclease digestion

The first nucleotide sequence of a complete biomolecule was that of the alanine tRNA of yeast (77 nt), determined by Robert W. Holley et al. in 1964. Nobel Prize in Physiology or Medicine 1968.

Generation 2: The real breakthrough!
- 1975-1977: Sanger sequencing (sequencing by synthesis). Nobel Prize in Chemistry 1980.
- 1977: Maxam and Gilbert sequencing (sequencing by chemical degradation). Nobel Prize in Chemistry 1980.

Development of Sanger sequencing
- 1975: Sanger and Coulson published the +/- method. Sanger F, Coulson AR. A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase. J Mol Biol. 1975 May 25;94(3):441-448.
- 1977: Sanger, Nicklen and Coulson published the chain terminator method. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A. 1977 Dec;74(12):5463-5467.

The Maxam and Gilbert method: based on base-specific chemical degradation of end-labelled DNA restriction fragments. In the same year (1977), but 10 months before Sanger published the chain terminator method, Maxam and Gilbert published their DNA sequencing method based on chemical degradation of end-labelled DNA restriction fragments! Maxam AM, Gilbert W. A new method for sequencing DNA. Proc Natl Acad Sci U S A. 1977 Feb;74(2):560-564.

Which sequencing method was superior? Maxam and Gilbert sequencing:
- A very robust method
- Not sensitive to secondary structures
- Shows base modifications
But:
- Requires work with strong carcinogens and millicuries of radioisotopes
- Cannot be automated
- Very laborious and requires long exposure times
Maxam AM, Gilbert W. A new method for sequencing DNA. Proc Natl Acad Sci U S A. 1977 Feb;74(2):560-564.

Which sequencing method was superior? Sanger sequencing:
- Reading the sequence was easier
- No carcinogenic chemicals involved
- Exposure times were only a few hours
- The sequencing reactions could be done by the technician
But: the natural DNA polymerases are sensitive to secondary structures and to stretches of homopolymeric nucleotides. This changed only when the Sequenases were developed.

Sanger sequencing has won the race. Number of citations (source: Google Scholar):
- Maxam and Gilbert: 7690
- Sanger, Nicklen and Coulson: 62757

Generation 3: on-line sequencing
- a number of different techniques
- all based on fluorescently labelled DNA fragments, which could be detected and transferred automatically to a computer
- automated base calling

Classical on-line sequencing is still in use:
- The demand is still increasing
- Results are robust, with a low error rate (< 1/1000 to 1/10000 bp)
- Up to 1500 nt readable in a row
- Cost per sample ~ 3-5 (0.14 cent per bp, ds)
- Comprehensive service available commercially
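Error rates such as the ones quoted above are often expressed on the Phred quality scale, Q = -10 log10(p), where p is the probability that a base call is wrong. The Phred scale is not mentioned on the slides; the sketch below is added only to show how the quoted rates translate, and the function names are illustrative.

```python
import math


def phred_quality(error_probability):
    """Convert a per-base error probability into a Phred quality score."""
    return -10.0 * math.log10(error_probability)


def error_probability(phred):
    """Convert a Phred quality score back into a per-base error probability."""
    return 10.0 ** (-phred / 10.0)


if __name__ == "__main__":
    # Error rates quoted for classical on-line (Sanger) sequencing.
    for rate in (1 / 1000, 1 / 10000):
        print(f"error rate {rate:.4%} corresponds to about Q{phred_quality(rate):.0f}")
```

On this scale, one error in 1000 bases corresponds to Q30 and one error in 10000 bases to Q40.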

Generation 4: Next generation sequencing (NGS)
- 2007: NGS selected by Nature Methods as the method of the year
- Introduces a new dimension in sequence determination
- Several platforms exist, providing different possibilities

The advent of NGS is reflected by the number of genome projects and database entries. http://www.genomesonline.org/cgi-bin/gold/index.cgi?page_requested=statistics

In particular, bacterial genome projects have boomed since 2008. http://www.genomesonline.org/cgi-bin/gold/index.cgi?page_requested=statistics

Complete genome projects: 12725
- Archaeal: 317
- Bacterial: 12096
- Eukaryal: 312
- Finished: 2876
- Permanent draft: 9849
Last updated 2014-01-24. Source: http://genomesonline.org/cgi-bin/gold/index.cgi

Incomplete genome projects: 27988
- Archaeal: 457
- Bacterial: 19494
- Eukaryal: 6413
Last updated 2014-01-24. Source: GOLD = Genomes Online Database at the DOE Joint Genome Institute, http://www.genomesonline.org/cgi-bin/gold/index.cgi

NGS has revolutionized genome science:
- Reduction of costs
- Reduction of time
- Reduction of labour
- Increase in the bioinformatics challenge

The different platforms

The genome scale:
- 454/Roche Genome Sequencer FLX
- ABI SOLiD Sequencing System
- Illumina/Solexa HiSeq 2000/2500
- Ion Torrent Proton
- Pacific Biosciences
- (Helicos)

The benchtop scale:
- 454 GS Junior/Roche
- Illumina MiSeq
- Illumina NextSeq 500
- Ion Torrent PGM/Life Technologies

454/Roche GS FLX: based on emulsion PCR and pyrosequencing. sstDNA: single-stranded template DNA

The number of sequences depends on the number of wells in the plate!

454/Roche GS FLX: Pyrosequencing. APS = adenosine 5'-phosphosulfate

Pyrosequencing is not suitable for sequencing homopolymer stretches longer than 6-7 nucleotides

GS FLX+ System

Sequencing kit | GS FLX Titanium XL+ (new) | GS FLX Titanium XLR70
Read length | Up to 1,000 bp | Up to 600 bp
Mode read length | 700 bp | 450 bp
Throughput profile | 85% of total bases from reads > 500 bp; 45% of total bases from reads > 700 bp | 85% of total bases from reads > 300 bp; 20% of total bases from reads > 500 bp
Typical throughput | 700 Mb | 450 Mb
Reads per run | ~1,000,000 shotgun | ~1,000,000 shotgun, ~700,000 amplicon
Consensus accuracy* | 99.997% | 99.995%
Run time | 23 hours | 10 hours
Sample input | gDNA or cDNA | gDNA, cDNA, or amplicons (PCR products)

Multiplexing: Multiplex Identifiers (MIDs): 132; gaskets: 2, 4, 8, 16 regions

Data from Roche: http://454.com/products/gs-flx-system/

454/Roche GS FLX Titanium
Advantages:
- Long read length, > 400 nt and up to 1000 nt
- Low error rate, but sensitive to homopolymers!
Disadvantages:
- Data output < 0.7 Gb
- Cost per gigabase is the highest among all systems

Applied Biosystems SOLiD™ Sequencing
SOLiD = Sequencing by Oligonucleotide Ligation and Detection
- Template preparation: emulsion PCR
- Sequencing: hybridization and ligation

Through successive rounds of ligation of labelled oligonucleotides to the template, each base in the template is determined twice

Process of SOLiD Sequencing Figure from Clinical Chemistry April 2009 vol. 55 no. 4 641-658

Each base is sequenced twice!

Applied Biosystems SOLiD™ Sequencing
Advantages:
- Very good data quality, since every base is sequenced twice (99.99% correct)
- High data output: SOLiD™ 4hq, ~300 Gb per run in 14 days
- High degree of multiplexing possible (up to 1,536 samples per run)
- Cost effective: 2000 /human genome
Disadvantages:
- Maximum read length is 75 bases
- 14 days run time for 2x75 bases
Data from: http://www3.appliedbiosystems.com/cms/groups/mcb_marketing/documents/generaldocuments/cms_061241.pdf

Illumina/Solexa™ Sequencing
- Sequencing by synthesis
- Modified chain-terminating method
- Bridge amplification
- Paired-end and mate-pair libraries possible

Illumina/Solexa™ Sequencing: clustering and sequencing

Illumina/Solexa™ Sequencing (HiSeq™ 2000/2500)
Advantages:
- Very high data output: > 400 million paired-end reads per lane; ~600 gigabases per run
- Read length PE 2x150 bases (increasing)
- Cost per Gb ~ <50, or 1500 /human genome
Disadvantages:
- Hardware investment is high (~600,000 plus periphery)
- Medium-high error rate (~0.5%, increasing with read length)
- High maintenance costs (service contract > 80,000 /year)

Life Technologies/Ion Torrent: Sequencing by pH monitoring
- Based on sequencing by synthesis
- Available since 2010
- Emulsion PCR for library construction
- Beads with amplified molecules are primed with an adapter
- Beads are loaded onto an Ion Chip that is sensitive to H+ ions
- Incorporation of a nucleotide releases an H+ ion, which is measured by the chip

Ion Torrent: NGS by measuring the pH change on a semiconductor chip. [Figure: sequential nucleotide flows G, A, T, C. Figure modified by E. R. Schmidt; source: Annual Reviews]

Life Technologies/Ion Torrent: Sequencing by pH monitoring
Advantages:
- Very cost efficient (human genome < 1000 )
- Read length 200 bases (increasing)
- Very short running times (~2-4 hrs)
- Hardware investment is low (~US $80,000)
Disadvantages:
- High error rate (> 1.0%, increasing with read length)
- Especially sensitive to homopolymer stretches, leading to a high rate of deletions
- Medium data output (depending on the chip, e.g. Proton PII = 32 Gb)

Pacific Biosciences/Single molecule real time (SMRT) sequencing
- Based on sequencing by synthesis on single molecules
- Available since 2010
- Special library construction leading to circular molecules (enables multiple sequencing of the same molecule)
- An engineered DNA polymerase is bound in a zero-mode waveguide manufactured on a silicon wafer (SMRT™ cell)
- Fluorescently labelled dNTPs are measured in real time during incorporation

Zero-mode waveguide

Pacific Biosciences/Single molecule polymerase active site monitoring
Advantages:
- Read length up to 10,000 bases (average > 1000 bases)
- Very short running times (~2 hrs)
- Low running cost; according to the company, a human genome equivalent costs a few hundred dollars
Disadvantages:
- High error rate (> 10-15% for single-pass sequencing; repeated sequencing lowers the error rate to 2-3%)
- Significant investment in hardware (> 600 k )

Helicos™ Sequencing (16 November 2012: Chapter 11 bankruptcy protection)
- Sequencing by synthesis with single molecules as templates
- Modified chain-terminating method
- Bridge amplification

Benchtop NGS sequencing:
- Illumina MiSeq™
- Roche 454 Junior™
- Ion Torrent PGM™

Costs and performance of benchtop NGS*

Table 1: Price comparison of benchtop instruments and sequencing runs

Platform | List price | Approximate cost per run | Minimum throughput (read length) | Run time | Cost/Mb | Mb/h
454 GS Junior | $108,000 | $1,100 | 35 Mb (400 bases) | 8 h | $31 | 4.4
Ion Torrent PGM (314 chip) | $80,490 | $225 | 10 Mb (100 bases) | 3 h | $22.5 | 3.3
Ion Torrent PGM (316 chip) | | $425 | 100 Mb (100 bases) | 3 h | $4.25 | 33.3
Ion Torrent PGM (318 chip) | | $625 | 1,000 Mb (100 bases) | 3 h | $0.63 | 333.3
MiSeq | $125,000 | $750 | 1,500 Mb (2 x 150 bases) | 27 h | $0.5 | 55.5

* From: Nicholas J Loman, Raju V Misra, Timothy J Dallman, Chrystala Constantinidou, Saheer Gharbia, John Wain & Mark J Pallen. Performance comparison of benchtop high-throughput sequencing platforms. Nature Biotechnology 30, 434-439 (2012). doi:10.1038/nbt.2198

Sebastian Jünemann, Fritz Joachim Sedlazeck, Karola Prior, Andreas Albersmeier, Uwe John, Jörn Kalinowski, Alexander Mellmann, Alexander Goesmann, Arndt von Haeseler, Jens Stoye & Dag Harmsen. Updating benchtop sequencing performance comparison. Nature Biotechnology 31, 294-296 (2013). doi:10.1038/nbt.2522. Published online 05 April 2013
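The Cost/Mb and Mb/h columns of Table 1 follow directly from the cost per run, the minimum throughput and the run time. The short sketch below recomputes them from the values quoted in the table. The variable and function names are illustrative, and the published figures agree with the computed ones only up to rounding.

```python
# (platform, approximate cost per run in US $, minimum throughput in Mb, run time in h)
RUNS = [
    ("454 GS Junior", 1100, 35, 8),
    ("Ion Torrent PGM (314 chip)", 225, 10, 3),
    ("Ion Torrent PGM (316 chip)", 425, 100, 3),
    ("Ion Torrent PGM (318 chip)", 625, 1000, 3),
    ("MiSeq", 750, 1500, 27),
]


def derived_metrics(cost_per_run, throughput_mb, run_time_h):
    """Return (cost per Mb in US $, throughput in Mb per hour)."""
    return cost_per_run / throughput_mb, throughput_mb / run_time_h


if __name__ == "__main__":
    for platform, cost, megabases, hours in RUNS:
        cost_per_mb, mb_per_hour = derived_metrics(cost, megabases, hours)
        print(f"{platform:28s} ${cost_per_mb:8.2f}/Mb {mb_per_hour:8.1f} Mb/h")
```

The comparison shows that the instrument with the lowest cost per megabase (MiSeq) is not the one with the highest hourly throughput (PGM with the 318 chip), so the ranking depends on which metric matters for a given project.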

http://www.genome.gov/images/content/cost_per_megabase.jpg

Generation 5: Future Technology: Nanopore Sequencing

From: https://www.nanoporetech.com/home

In Oxford Nanopore's 'strand sequencing' method, a single-stranded DNA polymer is passed through a protein nanopore, and individual DNA bases on the strand are identified in sequence as the DNA molecule passes through. When a DNA polymer passes through a nanopore, a number of individual DNA bases occupy the aperture of the nanopore at any time. A successful method of DNA sequencing must identify the sequence of individual bases within this strand. Oxford Nanopore has engineered bespoke nanopores, and data analysis algorithms are used to translate the characteristic electronic signals into DNA sequence data.

A method of controlled translocation of the strand through the nanopore is needed. Oxford Nanopore uses proprietary, highly processive enzymes to ratchet DNA through the nanopore.

Oxford Nanopore has not disclosed the proprietary nanopore and enzyme machinery used in its GridION and MinION system. Oxford Nanopore has not signed a commercialisation agreement for strand sequencing and intends to commercialise strand sequencing products independently.

Applications of NGS:
- Genome resequencing and SNP detection
- Genome de novo sequencing
- Transcriptome sequencing
- ChIP sequencing; histone methylation
- Bisulfite sequencing for methylation analysis
- Exome enrichment sequencing
- Small RNA sequencing
- Genotyping by sequencing (GBS; RAD); reduced complexity sequencing
- Ribosome profiling

No summary table! Equipment and technologies are too diverse, so the best advice is: discuss your project with people who have experience with one or the other platform. NGS is a fantastic novel technology which provides completely new possibilities! Projects that were unthinkable a few years ago are now easy to carry out! Thank you very much for your attention!

Steffen Rapp, NGS Unit manager; Benjamin Rieger, bioinformatician; Nicole Naumann, technical assistant; Rudolf Baader, technical assistant