THE SEQUENCING TECHNOLOGY (R)EVOLUTION


THE SEQUENCING TECHNOLOGY (R)EVOLUTION
TIM STAKENBORG, IMEC
MB&C meeting, May 16, 2013
IMEC 2013

HISTORY OF SEQUENCING
384-322 BC
- Aristotle told his students that all inheritance comes from the father
1977 (2 independent methods published in PNAS)
- Maxam & Gilbert: chemical degradation method
- Sanger: ddNTP-mediated chain termination!!
1995 (Fleischmann et al., Science 269: 496)
- Haemophilus influenzae (first fully sequenced bacterial genome)
2001 (Science/Nature)
- First human genome (13 years, 300 million USD)
May 2005 (454 technology)
- 6 months, >30 million USD

HISTORY OF SEQUENCING
[Figure: transistor count (Moore's law) vs. sequenced kilobase pairs/day, on a logarithmic axis (1E+00 to 1E+09) against date of introduction (1970-2015). Sequencing platforms shown: manual slab gel, ABI 373, ABI 377, ABI 3700, ABI 3730XL, Roche 454 Life Sciences, first Solid, 454 Titanium, ABI Solid3, Illumina HiSeq, Pacific Biosciences SMRT*. Platform generations: 1st generation (slab gels, capillary electrophoresis), 2nd generation (sequence by synthesis), 3rd generation.]
Note: human genome = ~3 x 10^9 bases

HISTORY OF SEQUENCING
Still many challenges in post-processing of data
- Data handling
- Computational algorithms

THE FIRST GENERATION

FIRST GENERATION (SANGER)
Cycle sequencing (amplification) reaction
- PCR products of different length
- Last base is fluorescent (different color per base)
- Separation by size
Pros and Cons
- Extensive sample prep (-)
- High cost (-)
- Low throughput (-)
- Long read lengths (+)

DRAFT GENOME
- 1990: Human Genome Project started
- 2001: first draft, over 10 years and $3 billion later
- 2003 (published 2004): finished human genome sequence
[Figure labels: February 2001, April 2011]


THE NUMBER OF GENES
Human genome: ~3 Gbase (3,000,000 kbases)
Average gene size: ~3 kbases, but sizes vary greatly (largest is dystrophin: 2.4 Mbases)
GENE SWEEP (Cold Spring Harbor Lab 2000-2003)
- Rules: a bet cost $1 in 2000, $5 in 2001 and $20 in 2002
- 165 bets
- Mean: 61710
- Lowest: 25947 (Lee Rowen)
- Highest: 153478

THE HUMAN GENOME
- ~3 Gbase, 24 chromosomes: 1-22, X, Y
- 21,500-24,000 genes
- Only 2% of the genome encodes genes
- About 46% of the genome is repetitive sequence
=> THERE IS A LOT OF GENOMIC DARK MATTER (or non-coding RNA)
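As a rough consistency check on these figures (an illustrative sketch, not part of the original slides), the quoted gene counts can be multiplied by the ~3 kbase average gene size given on the gene-number slide. The function name `gene_fraction` and the constants are assumptions made for this example only.

```python
# Rough check: what share of a ~3 Gbase genome do 21,500-24,000 genes
# of ~3 kbases each occupy? All values are the approximate slide figures.
GENOME_BASES = 3e9      # ~3 Gbase
AVG_GENE_BASES = 3e3    # ~3 kbases average gene size (slide figure)


def gene_fraction(n_genes, avg_gene_bases=AVG_GENE_BASES, genome_bases=GENOME_BASES):
    """Fraction of the genome covered by n_genes genes of average size."""
    return n_genes * avg_gene_bases / genome_bases


for n_genes in (21_500, 24_000):
    print(f"{n_genes} genes -> {gene_fraction(n_genes):.2%} of the genome")
```

Under these assumptions the result lands slightly above 2%, the same order as the share quoted on the slide.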


THE HUMAN GENOME
Almost all (99.9%) nucleotide bases are exactly the same in all people (a 0.1% difference, i.e. 1 difference per 1,000 base pairs)
- Humans (0.08-0.1%)
- Chimpanzees (0.12-0.17%)
- Drosophila simulans (2%)
- E. coli (5%)
- HIV-1 (30%)
SNPs (a single base change in more than 1% of humans)
- Harmless (e.g. change in phenotype)
- Harmful (e.g. diabetes, cancer, heart disease, Huntington's)
- Latent (e.g. susceptibility to lung cancer)
Photos from UN photo gallery www.un.org/av/photo
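To put the percentages in absolute terms, the following minimal sketch converts a sequence difference fraction into an expected number of differing bases, assuming the ~3 Gbase genome size quoted earlier in the talk. The helper name `expected_differences` is hypothetical.

```python
# Expected number of differing bases between two genomes, assuming a
# genome of ~3 Gbase and a uniform difference fraction (a simplification).
GENOME_BASES = 3_000_000_000


def expected_differences(fraction, genome_bases=GENOME_BASES):
    """Number of bases expected to differ for a given difference fraction."""
    return fraction * genome_bases


for label, fraction in (("0.08%", 0.0008), ("0.1%", 0.001)):
    print(f"{label} difference -> about {expected_differences(fraction):,.0f} bases")
```

For the human range of 0.08-0.1% this gives on the order of 2.4 to 3 million differing positions between two people.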

THE SECOND GENERATION (NEXT GEN)

SECOND GENERATION
Sequence by synthesis
- Step-wise base addition & read-out
- Washing steps between each step
Pros and Cons
- Extensive sample prep (-)
- Relatively costly reagents/run (-)
- Massively parallel sequencing (+)
- Relatively short fragments (-)
Examples: Roche 454 GS-FLX, Illumina HiSeq, ABI Solid, IonTorrent, etc.

2ND GENERATION: SAMPLE PREP
Extensive sample prep
- Library generation (generation of fragments with adapters)
- Clonal amplification
  - e.g. emPCR (e.g. Roche GS-FLX, ABI Solid, etc.)
  - e.g. bridge PCR (e.g. Solexa from Illumina)

FLUORESCENT READ-OUT
e.g. Illumina, ABI Solid (or Helicos on single molecule level)

BASE CALLING: NOISE FACTORS
Phasing noise
- Leading / Lagging
Fading noise
- Exponential decay in fluorescent signal
Cycle-dependent change in fluorophore cross-talk
Erlich et al. Nature Methods 5: 679-682 (2008); http://www.cs.utoronto.ca/~brudno/csc2431w10/altacyclic_pres.pdf
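The first two noise factors can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not the base-calling method of Erlich et al.: each strand is assumed to lag or lead with a fixed probability per cycle, and the fluorescent signal is assumed to decay exponentially with cycle number. All function names and parameter values (`p_lag`, `p_lead`, `decay_rate`) are hypothetical.

```python
import math


def in_phase_fraction(cycle, p_lag, p_lead):
    """Fraction of strands in a cluster still in step after `cycle` cycles,
    assuming independent lag/lead events with fixed per-cycle probabilities."""
    return (1.0 - p_lag - p_lead) ** cycle


def faded_intensity(cycle, initial=1.0, decay_rate=0.01):
    """Fluorescent intensity after `cycle` cycles under exponential decay."""
    return initial * math.exp(-decay_rate * cycle)


def usable_signal(cycle, p_lag=0.005, p_lead=0.005, decay_rate=0.01):
    """Relative in-phase signal: phasing loss combined with fading."""
    return in_phase_fraction(cycle, p_lag, p_lead) * faded_intensity(
        cycle, decay_rate=decay_rate
    )


for cycle in (1, 50, 100, 150):
    print(f"cycle {cycle:3d}: relative usable signal {usable_signal(cycle):.3f}")
```

Even with small per-cycle error rates the usable signal shrinks steadily with cycle number, which is one way to see why read length is limited on these platforms.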

PYROSEQUENCING (OPTICAL)
e.g. Roche GS FLX 454
Figure from OMICS Journals (doi:10.4172/jcsb.1000019) and Nature Biotechnology (doi:10.1038/nbt1485)

PYROSEQUENCING (ELECTRICAL)
e.g. IonTorrent (Life Technologies)

PYROSEQUENCING (ELECTRICAL)
Making small sequencing tests available (e.g. DNA Electronics/Roche)

THE THIRD GENERATION (NEXT NEXT-GEN)

THIRD GENERATION
Sequencing (by synthesis)
- Single molecule sensitivity
- Read-out during copying
Pros and Cons
- Potentially long fragments (+)
- Large cost reduction per run (+)
- Easier sample prep (+)
- Enzyme necessary: speed limited (1-3 bases/second/pore)
Examples: Pacific Biosciences, Oxford Nanopore, Visigen (now Life Tech), etc.

REAL-TIME SEQUENCING
Zero-mode waveguides (Pacific Biosciences)
Single Molecule Real-Time (SMRT) sequencing
[Figure label: 70 nm]
- Polymerase is immobilized in 20 zl sized zero-mode waveguides (ZMW)
- Polymerase cleaves off the fluorescent tags
- Fluorescent read-out
- Diffusion time: microseconds
- Incorporation time: milliseconds
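To give a feel for how small a 20 zl (zeptoliter) observation volume is, the sketch below computes the concentration at which such a volume contains one molecule on average. This is an illustrative calculation added here, not a figure from the slides; the function name is hypothetical and the only physical constant used is Avogadro's number.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def single_molecule_concentration(volume_litres):
    """Molar concentration at which `volume_litres` holds one molecule on average."""
    return 1.0 / (AVOGADRO * volume_litres)


zmw_volume_litres = 20e-21  # 20 zl, as quoted for the zero-mode waveguide
concentration = single_molecule_concentration(zmw_volume_litres)
print(f"One molecule per 20 zl corresponds to about {concentration * 1e6:.0f} micromolar")
```

The result is roughly 83 micromolar, so labeled nucleotides can be present at micromolar concentrations while only about one diffuses through the observation volume at a time. Combined with the microsecond diffusion versus millisecond incorporation times on the slide, this is what separates an incorporation event from background.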

MINIATURIZING DNA SEQUENCING
IMEC 2013 - Molecular Biology and Cytometry Course - SCK CEN


COMPARISON OF COMMERCIAL PRODUCTS
- Illumina: HiSeq 2000, began shipping in the third quarter of 2012. The instrument produces 2x150-base paired-end reads, which will increase to 2x250. That gives around 300 gigabases in approximately 60 hours.
- Roche: GS FLX+ system, coupled with its newest software, produces reads of up to 1,000 bp and beyond.
- Life Technologies (Ion Torrent): Ion Proton can sequence a human exome in a few hours. Proton II is basically a 50x improvement on their first chip (120 Gb), but with a somewhat higher error rate than Illumina.
- Pacific Biosciences: PacBio RS. The company's new XL chemistry produces reads averaging 5,000 bases apiece, and about 5% of those exceed 10,000 bases.
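The Illumina figures can be turned into an hourly throughput and a raw human-genome coverage with simple arithmetic. The sketch below assumes the ~3 Gbase genome size quoted earlier in the talk and ignores read quality, duplicates and mapping losses; the variable names are illustrative only.

```python
# Back-of-the-envelope throughput from the quoted run: ~300 Gb in ~60 hours.
RUN_OUTPUT_BASES = 300e9   # ~300 gigabases per run
RUN_HOURS = 60             # approximate run time
GENOME_BASES = 3e9         # ~3 Gbase human genome

throughput_per_hour = RUN_OUTPUT_BASES / RUN_HOURS
raw_coverage = RUN_OUTPUT_BASES / GENOME_BASES

print(f"Throughput: {throughput_per_hour / 1e9:.0f} Gb per hour")
print(f"Raw coverage of one human genome: {raw_coverage:.0f}x")
```

Under these assumptions a single run corresponds to about 5 Gb per hour, or roughly 100-fold raw coverage of one human genome.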

FUTURE FOURTH GENERATION

NANOPORE BASED SEQUENCING
e.g. Oxford Nanopore

NANOPORE BASED SEQUENCING
Hybridization assisted sequencing
e.g. Nabsys
- Short fragments are hybridized to DNA
- Their distance is measured
- In parallel for many fragments
e.g. Noblegen
- Replace bases by barcode
- Hybridize molecular beacons
- Unzip DNA fragments in pore
- Read fluorescent signals

FOURTH GENERATION
Direct read-out of DNA
- Nanopore based sequencing
- Electron microscopy
Pros and Cons
- In principle, simple sample prep
- Limited or no reagent costs
- Long read lengths
- No enzymatic reaction needed
- Ability to read RNA, DNA modifications, etc.
Examples: imec, IBM, Halcyon

NANOPORE BASED SEQUENCING
Figures from Hao Liu (Biodesign Institute) and http://www.mcb.harvard.edu/branton/index.htm

NANOPORE/NANOSLIT COMBINATION
Controlled translocation through a solid-state nanopore
- Electrically induced translocation
- Mechanical confinement of a single DNA strand
SERS in a plasmonic nanoslit
- Vibrational fingerprinting
- Molecular information in the pore

MOLECULAR SPECTROSCOPY BY SERS
The normal Raman effect
- Inelastic scattering of light by molecules through the excitation of molecular vibrations
- Spectroscopy
- Weak process!
Surface Enhanced Raman Scattering
- Hot spots near metal nanostructures (excitation of plasmons)
- Enhancement with E^4
- Single molecule resolution

SERS NANOSLIT
[Figure labels: λ=785 nm, Au, H2O, hot spot]
Generating a hot spot using top-down designed plasmonic nanocavities
- Large and highly localized field enhancement
- Raman enhancement: 10^5-10^10 x (to single molecule levels)
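Assuming the commonly used approximation that the Raman enhancement scales with the fourth power of the local field enhancement (the E^4 scaling mentioned for SERS), the quoted enhancement range can be converted into the field enhancement a hot spot has to provide. This is an illustrative sketch; the function name is hypothetical and the fourth-power relation is the stated assumption.

```python
def field_enhancement(raman_enhancement):
    """Local field enhancement |E/E0| needed for a given Raman enhancement,
    assuming the enhancement scales as |E/E0|**4."""
    return raman_enhancement ** 0.25


for raman in (1e5, 1e10):
    print(f"Raman enhancement {raman:.0e} -> field enhancement ~{field_enhancement(raman):.0f}x")
```

Under this assumption a Raman enhancement of 10^5 to 10^10 corresponds to a local field that is roughly 18 to 316 times stronger than the incident field.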

NEXT-GENERATION SEQUENCING
Property | 1st generation | 2nd generation | 3rd generation | 4th generation
Basic principle | Sanger sequencing with size separation of amplified fragments | Site-selective amplification followed by iterative base-incorporation, read and wash steps | Enzymatic reaction to continuously integrate and read-out bases. True single molecule analyses | Direct read-out of bases (without copying). True single molecule analysis
Sample preparation | Extensive | Extensive | Moderate | Almost none
Speed/base/site | Very low (<<1/sec) | Low (<1/sec) | Moderate (3/sec) | Very fast (~1 ms)
Throughput | Low | Very high | Very high | Very high
Accuracy | High | Low | Low (~80%) | NA
Read length | Long (~1000) | Short (~15-400) | Moderate (~450) | Very long (>1000)
De novo sequencing | Possible | Not possible | Difficult | Easy
Repeat regions | Limited | Highly limited | Limited | No intrinsic limit
DNA/protein derivatives | Indirectly | Indirectly | Indirectly | Yes
Reagent cost | Very high | Very high | High | None
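The per-site speeds in the comparison can be translated into the time a single read-out site would need for one pass over a human genome. The sketch below assumes ~3 x 10^9 bases, takes "~1 ms" to mean about 1,000 bases per second, and ignores coverage, errors and read-length limits; the names are illustrative.

```python
GENOME_BASES = 3e9        # ~3 x 10^9 bases
SECONDS_PER_DAY = 86_400


def days_for_genome(bases_per_second, genome_bases=GENOME_BASES):
    """Days one read-out site needs for a single pass over the genome."""
    return genome_bases / bases_per_second / SECONDS_PER_DAY


rates = {
    "3rd generation (3 bases/s)": 3.0,
    "4th generation (~1 ms per base)": 1000.0,
}
for label, rate in rates.items():
    days = days_for_genome(rate)
    print(f"{label}: {days:,.0f} days (~{days / 365:.1f} years)")
```

At 3 bases per second a single site would need about 31.7 years, and at 1 ms per base about 35 days, so in either case practical throughput depends on running very many sites in parallel.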

Technology | Generation | On market | Single molecule | Nanopore (NP) / Enzymatic (E) based | Principle | Website
Illumina HiSeq * | 2 | Yes | No | E | Fluorescence, sequence by synthesis | www.illumina.com
Roche (FLX Titanium) | 2 | Yes | No | E | Light, sequence by synthesis | www.454.com
Polonator | 2 | Yes | ? | E | Fluorescence/Polony | www.polonator.org
Complete Genomics | 2 | Yes | No | E | Fluorescence, sequence by synthesis | www.completegenomics.com
Helicos (TSMS) | 2 | Yes | Yes | E | Fluorescent, sequence by synthesis | www.helicosbio.com
Life Tech (ABI Solid4) | 2 | Yes | No | E | Fluorescence, sequence by synthesis | www.appliedbiosystems.com
Life Tech (IonTorrent) | 2 (3) | Yes | No | E | Electrical, sequence by synthesis | www.iontorrent.com
Intelligent Bio | 2 | No | ? | E | Fluorescence, sequence by synthesis | www.intelligentbiosystems.com
GE Global | 2 | No | Yes | E | Fluorescence, sequence by synthesis | http://ge.geglobalresearch.com/blog/sequencing-a-human-sized-genomein-less-than-a-day/
GnuBio | 2 | No | No | E | Microdroplets, sequence by ligation | www.gnubio.com
Genizon Bioscience | 2? | No | No | E | http://www.genizon.com/images/pdfs/pihlak_linnarsson_nbt2008.pdf | www.geniozon.com
Light Speed | 2? | No | ? | ? | Light interference, patent: US 2009/0061526 | www.lsgen.com
Mobious Biosystems (Nexus) | 2? | No | No | E | ?? | www.mobious.com
Pacific Biosciences (tsmrt) | 3 | Yes | Yes | E | Fluorescence (SMRT), sequence by synthesis | www.pacificbiosciences.com
Oxford Nanopore | 3 | No | Yes | NP/E | Electrical, enzymatic cutting of DNA | www.nanoporetech.com
Visigen | 3 | No | ? | E | FRET measurement using TIRF | www.visigenbio.com
Cracker | 3 | No | Yes | E | SMRT, read-out on chip | www.crackerbio.com
IBM/Roche nanopore | 4 | No | Yes | NP/- | Electrical, tunneling using NPs | http://www-03.ibm.com/press/us/en/pressrelease/32037.wss
Nabsys | 4 | No | Yes | NP/- | Electrical, hybridization assisted NP sequencing | www.nabsys.com
NobleGen Biosciences | 4 | No | Yes | NP/- | Electrical, fluorescent after hybridization (Meller) | www.noblegenbio.com
Electronic Bio | 3 (4?) | No | Yes | NP/- | Electrical using biological NPs | www.electronicbio.com
Reveo | 4 | No | Yes | -/- | Electrical, tunneling using nano-knives | www.reveo.com
Base4 Innovation | 4 | No | Yes | ?? | Nanopore + optical? | www.base4innovation.co.uk
ZS Genetics | 4? | No | Yes | -/- | Electron microscopy | www.zsgenetics.com
Halcyon Molecular | 4? | No | Yes | -/- | Electron microscopy | www.halcyonmolecular.com

MANY APPLICATIONS & PUBLICATIONS

APPLICATIONS OF NEXT-GEN SEQUENCING
- Whole-genome sequencing
- Comparative genomics
- Genome re-sequencing
- Structural variation analysis
- Polymorphism discovery
- Meta-genomics
- Environmental sequencing
- Gene expression profiling
- Genotyping
- Population genetics
- Migration studies
- Ancestry inference
- Relationship inference
- Genetic screening
- Drug targeting
- Forensics

DOES NGS HAVE PROGNOSTIC VALUE?

... and for personal health?


... and for personal health?
- 2 virus infections during the test period (common cold and sinus infection)
- Diabetes developed during / after the 2nd infection (genetic risk had already been identified from whole genome sequencing)


DOES NGS HAVE PROGNOSTIC VALUE?
- Sequencing has gone through a revolution and has become affordable for some applications (e.g. exome sequencing)
- Personal genome sequencing is already possible, but the medical interpretation is still difficult
- Genome sequencing can predict disease risks
- Genome sequencing should be combined with other omics to monitor disease risk
- Integrated analyses are possible, but still need further improvement and understanding
- Regulatory information needs to be considered
- Every person is unique, and longitudinal follow-up will provide further insight
- Longitudinal follow-up: case studies have proven value, but no good biomarkers yet

THANK YOU FOR YOUR ATTENTION