Next Generation Sequencing
Simon Rasmussen, Assistant Professor
Center for Biological Sequence Analysis, Technical University of Denmark
DNA sequencing
- Reading the order of bases in DNA fragments
Why NGS?
- Transforming how we do biological science (and bioinformatics)
1st generation to NGS
[Figure: sequencing output in kilobases per day per machine (log scale, 10 to 1,000,000,000) against year (1980 to 2010 and future). Gel-based systems: manual slab gel, automated slab gel. Capillary sequencing: first-generation capillary, second-generation capillary sequencer. Massively parallel sequencing: microwell pyrosequencing, short-read sequencers (454, Illumina, Solid, Ion Torrent). Single molecule? (Pac Bio)]
- 1977: Sanger chain-termination method
Stratton et al., Nature 2009
Important feature: Output
1st generation:
- Sanger: >800 nt, very low output
2nd generation:
- Illumina: 35-100 nt, 25 Gb/day
- Roche 454: 200-600 nt, 1 Gb/day
- ABI Solid: 35, 60, 75 nt, 10-15 Gb/day
3rd generation:
- Ion Torrent: 100-200 nt, 10 Mb-1 Gb/chip
- Pacific Biosciences: 1300 nt, 45 Mb/chip
1 machine per day: 1 X 10 X
BGI, based in China, is the world's largest genomics research institute, with 167 DNA sequencers producing the equivalent of 2,000 human genomes a day.
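To put the daily output figures in perspective, they can be converted to fold coverage of a human genome. The sketch below is rough arithmetic only: the 3 Gb genome size is the approximate figure used later in the slides, the lower bound of the Solid range is used, and the 30x target and the function names are illustrative assumptions.

```python
# Rough arithmetic on the per-machine output figures quoted on the slide.
HUMAN_GENOME_GB = 3.0  # approximate human genome size in gigabases

# Daily output per machine in gigabases, as quoted on the slide.
# For ABI Solid the slide gives 10-15 Gb/day; the lower bound is used here.
DAILY_OUTPUT_GB = {
    "Illumina": 25.0,
    "Roche 454": 1.0,
    "ABI Solid": 10.0,
}


def genome_equivalents_per_day(output_gb, genome_gb=HUMAN_GENOME_GB):
    """Fold coverage of the genome produced by one machine in one day."""
    return output_gb / genome_gb


def days_for_coverage(output_gb, target_coverage, genome_gb=HUMAN_GENOME_GB):
    """Machine-days needed to reach the target fold coverage."""
    return target_coverage * genome_gb / output_gb


if __name__ == "__main__":
    for platform, gb in DAILY_OUTPUT_GB.items():
        per_day = genome_equivalents_per_day(gb)
        days = days_for_coverage(gb, 30)  # 30x is an assumed target, not from the slide
        print(f"{platform}: {per_day:.1f}x per day, {days:.1f} machine-days for 30x")
```

Raw output is not the same as usable coverage, since some reads fail quality filters or do not map, so these numbers are upper bounds.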
Important reason: Cost
- Drop in costs is faster than Moore's Law (computer power doubles every 2 years)
Expensive storage
- Highest cost is (almost) not the sequencing but storage and analysis
- A standard human whole-genome sequencing experiment would create 200 Gb of data
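A quick calculation shows why storage dominates. This sketch assumes the 200 Gb figure means gigabytes of raw data per genome and uses decimal units (1 TB = 1000 GB); the function name is made up for illustration.

```python
DATA_PER_GENOME_GB = 200  # from the slide; interpreted here as gigabytes (assumption)


def storage_tb(n_genomes, gb_per_genome=DATA_PER_GENOME_GB):
    """Raw storage in terabytes (1 TB = 1000 GB) for n_genomes experiments."""
    return n_genomes * gb_per_genome / 1000


print(storage_tb(1))     # 0.2
print(storage_tb(2000))  # 400.0, using the 2,000 genomes a day quoted for BGI
```

At that rate a single day of production would need about 400 TB before any backups or analysis output are counted.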
Human sequencing
- First draft genome of human in 2001, final 2004
- Estimated costs $3 billion, time 13 years
- Today: Illumina: 1 week, $4000
- Exome: 6 weeks*, $998
* Real-time, not machine-time
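The comparison with Moore's Law can be made concrete with the two prices on the slide. The sketch below is a back-of-the-envelope estimate: the 10-year interval is a hypothetical value (the slides do not date "today"), and the $3 billion covered a whole multi-year project rather than a single run.

```python
import math


def moore_fold(years, doubling_time=2.0):
    """Fold improvement after `years` if capacity doubles every `doubling_time` years."""
    return 2 ** (years / doubling_time)


def implied_halving_time(cost_start, cost_end, years):
    """Years per halving, assuming the cost fell exponentially between the two prices."""
    return years / math.log2(cost_start / cost_end)


ELAPSED_YEARS = 10  # hypothetical interval, not stated on the slides

cost_fold = 3e9 / 4000  # fold drop in price: 750000.0
print(f"Cost drop: {cost_fold:,.0f}-fold")
print(f"Moore's Law over {ELAPSED_YEARS} years: {moore_fold(ELAPSED_YEARS):.0f}-fold")
print(f"Implied halving time: {implied_halving_time(3e9, 4000, ELAPSED_YEARS):.2f} years")
```

Under these assumptions the price halves roughly every six months, against a two-year doubling time for Moore's Law.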
How it works
First generation: Sanger (dye)
- Fragment DNA
- Clone into plasmid and amplify
- Sequence using dNTPs + labelled ddNTPs (stops reaction)
- Run capillary electrophoresis and read DNA code
- Low output, long reads (~300-1000 nt), high quality
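The chain-termination idea can be illustrated with a toy model: every position of the new strand yields a fragment that ends in a labelled ddNTP, and sorting the fragments by length reads out the sequence. This is a simplified sketch, not a simulation of real chemistry. It assumes the template is written in the direction the polymerase reads it, so the new strand is simply the complement from left to right, and all function names are made up.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def chain_termination_fragments(template):
    """Return one (length, terminal_base) fragment per template position.

    Each fragment models a copy of the new strand whose extension was
    stopped by a labelled ddNTP at that position.
    """
    new_strand = [COMPLEMENT[base] for base in template]
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_by_electrophoresis(fragments):
    """Order fragments by length (shortest migrates fastest) and read the labels."""
    return "".join(base for _, base in sorted(fragments))


template = "TACGGA"
fragments = chain_termination_fragments(template)
random.Random(0).shuffle(fragments)  # fragments are mixed together in the reaction tube
print(read_by_electrophoresis(fragments))  # ATGCCT
```

The read is the complement of the template, because the labels sit on the newly synthesised strand.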
Next generation (2nd)
- Fragment DNA
- Add adaptors (where primers can bind) and amplify: DNA amplification (emPCR, bridge PCR)
- Immobilize single-strand DNA to surface
- Perform sequencing by polymerase synthesis of the 2nd strand
- Detect nucleotides incorporated using fluorescence
2: Amplification and immobilization
- Emulsion PCR (454, Solid): water, oil, beads, one DNA template/droplet
- Bridge PCR (Illumina): one DNA template/cluster, primers on surface, grow by bridging primers
Metzker, Nat Rev Genet 2010
2: Fluorescence detection
- Illumina: cyclic reversible termination. Add all dNTPs, each labelled with a different dye; create four-colour image; cleave dye and repeat next cycle
- 454: pyrosequencing. Load template beads into wells; flow one dNTP across wells; polymerase incorporates nucleotide; release of PPi leads to light; imaging, next dNTP
[Figure: panels a and c of Figure 2 in Metzker. a, Illumina/Solexa reversible terminators: incorporate all four nucleotides, each labelled with a different dye; wash, four-colour imaging; cleave dye and terminating groups, wash; repeat cycles. c, Helicos BioSciences reversible terminators: incorporate single, dye-labelled nucleotides; wash, one-colour imaging; cleave dye and inhibiting groups, cap, wash; repeat cycles.]
Metzker, Nat Rev Genet 2010
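The pyrosequencing steps can be sketched as an idealised flowgram: one dNTP is flowed at a time, and the light signal is proportional to how many bases were incorporated in that flow. The flow order "TACG", the function names and the noisy value of 2.4 are assumptions chosen for illustration, not instrument parameters.

```python
def flowgram(new_strand, flow_order="TACG", max_flows=400):
    """Ideal signal per flow: number of bases incorporated when one dNTP is flowed."""
    signals = []
    pos = 0
    flow = 0
    while pos < len(new_strand) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        incorporated = 0
        # A run of identical bases is incorporated within a single flow.
        while pos < len(new_strand) and new_strand[pos] == base:
            incorporated += 1
            pos += 1
        signals.append((base, incorporated))
        flow += 1
    return signals


def call_bases(signals):
    """Convert signals back to bases by rounding each signal to a whole number."""
    return "".join(base * round(signal) for base, signal in signals)


ideal = flowgram("TTAGGGC")
print(ideal)              # [('T', 2), ('A', 1), ('C', 0), ('G', 3), ('T', 0), ('A', 0), ('C', 1)]
print(call_bases(ideal))  # TTAGGGC

# If the light from the GGG run is measured slightly too low, one G is lost.
noisy = [(base, 2.4 if signal == 3 else signal) for base, signal in ideal]
print(call_bases(noisy))  # TTAGGC
```

Because a run of identical bases produces a single signal whose strength must be rounded to a count, small measurement errors in long runs show up as insertions or deletions rather than substitutions.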
2: Imaging handout
Illumina 1:
Illumina 2:
454:
[Figure 2b,d of Metzker: four-colour images, Top: CATCGT, Bottom: CCCCCC; one-colour images, Top: CTAGTG, Bottom: CAGCTA]
Figure 2 | Four-colour and one-colour cyclic reversible termination methods. a | The four-colour cyclic reversible termination (CRT) method uses Illumina/Solexa's 3'-O-azidomethyl reversible terminator chemistry [23,101] (B… solid-phase-amplified template clusters (Fig. 1b, shown as single templates for illustrative purposes). Following imaging, a cleavage step removes the fluorescent dyes and regenerates the 3'-OH group using the reducing agent tris(2-carboxyethyl)phosphine (TCEP) [23]. b | The four-colour images highlight the sequencing data from two … amplified templates. c | Unlike Illumina/Solexa's terminators, the Helicos Virtual Terminators [33] are labelled with the same dye and dispensed individually in a predetermined order, analogous to a single-nucleotide addition … Following total internal reflection fluorescence imaging, a cleavage step removes the fluorescent dye and … groups using TCEP to permit the addition of the next Cy5-2'-deoxyribonucleoside triphosphate (dNTP) an… free sulphhydryl groups are then capped with iodoacetamide before the next nucleotide addition [33] (step … d | The one-colour images highlight the sequencing data from two single-molecule templates.
Metzker, Nat Rev Genet 2010
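Reading a four-colour image series by eye amounts to picking the brightest colour for a cluster in each cycle. The sketch below does the same on made-up intensity values; the channel order "ACGT" and the numbers are hypothetical, and real base callers also correct for cross-talk between dyes and for phasing.

```python
# Hypothetical channel order; real instruments define their own dye-to-base mapping.
CHANNELS = "ACGT"


def call_cluster(intensities):
    """Call one base per cycle by picking the brightest of the four channels."""
    calls = []
    for cycle in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        calls.append(CHANNELS[brightest])
    return "".join(calls)


# Made-up intensities for one cluster over six cycles, in A, C, G, T channel order.
cluster = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.8, 0.1, 0.0),
    (0.0, 0.1, 0.7, 0.2),
    (0.1, 0.0, 0.2, 0.9),
    (0.0, 0.1, 0.1, 0.8),
    (0.2, 0.1, 0.6, 0.1),
]
print(call_cluster(cluster))  # ACGTTG
```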
2: Imaging handout - answers
[Figure 2b,d of Metzker: four-colour images, Top: CATCGT, Bottom: CCCCCC; one-colour images, Top: CTAGTG, Bottom: CAGCTA]
- Quality of base call deteriorates after many cycles (e.g. 75 cycles)
- Homopolymer runs are problematic, giving rise to indels
Metzker, Nat Rev Genet 2010
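Base-call quality is commonly expressed as a Phred score, Q = -10 log10(p), where p is the estimated probability that the call is wrong. The sketch below converts between the two and trims a read where quality drops. The threshold of 20 and the per-cycle scores are made-up values for illustration, and the simple cut-at-first-low-base rule is only one of several trimming strategies.

```python
import math


def phred(p_error):
    """Phred quality score for a given error probability."""
    return -10 * math.log10(p_error)


def error_probability(q):
    """Error probability for a given Phred quality score."""
    return 10 ** (-q / 10)


def trim_3prime(read, quals, min_q=20):
    """Cut the read at the first cycle whose quality falls below min_q."""
    for i, q in enumerate(quals):
        if q < min_q:
            return read[:i]
    return read


read = "ACGTACGTAC"
quals = [38, 37, 36, 35, 33, 30, 26, 21, 17, 12]  # made-up scores, falling with cycle number
print(trim_3prime(read, quals))  # ACGTACGT
```

A score of 20 corresponds to a 1 in 100 chance of error and a score of 30 to 1 in 1000.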
3rd generation
- Single molecule: no initial PCR amplification
- This is nice because the PCR step can/will introduce errors
- No underrepresentation of AT/GC-rich templates
- Currently: Pacific Biosciences, Ion Torrent (still PCR), Helicos(?)
- Future: Oxford Nanopore, ...
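Whether a template is AT-rich or GC-rich is usually judged from its GC content, the fraction of bases that are G or C. A minimal sketch follows; the 30% and 70% cut-offs are arbitrary illustrative thresholds, not values from the slides.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_extreme(seq, low=0.30, high=0.70):
    """Label a template as AT-rich, GC-rich or balanced using assumed cut-offs."""
    gc = gc_content(seq)
    if gc < low:
        return "AT-rich"
    if gc > high:
        return "GC-rich"
    return "balanced"


print(gc_content("ATATATGC"))    # 0.25
print(flag_extreme("ATATATGC"))  # AT-rich
print(flag_extreme("GCGCGCGA"))  # GC-rich
```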
Target enrichment
- Use DNA microarrays to enrich for wanted DNA/RNA
- E.g. the human exome is ~50 Mb compared to 3 Gb for the whole genome
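The benefit of enrichment follows from simple arithmetic: the same sequencing output spread over a smaller target gives proportionally deeper coverage. The sketch below assumes an idealised capture in which every read falls on target, which real experiments do not achieve, and uses 25 Gb as an example output.

```python
GENOME_BP = 3_000_000_000  # ~3 Gb human genome
EXOME_BP = 50_000_000      # ~50 Mb human exome


def mean_coverage(total_bases, target_bp):
    """Mean fold coverage if all sequenced bases fall evenly on the target."""
    return total_bases / target_bp


output_bases = 25e9  # example: 25 Gb of sequence

print(f"Exome is {EXOME_BP / GENOME_BP:.1%} of the genome")
print(f"Whole genome: {mean_coverage(output_bases, GENOME_BP):.1f}x")
print(f"Exome only:   {mean_coverage(output_bases, EXOME_BP):.0f}x")
```

With these numbers the exome gets 60 times the depth of the whole genome for the same amount of sequencing.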
Ion Torrent
- The chip is the machine
- Based on semiconductors, i.e. no fluorescence
- The release of a hydrogen ion when a nucleotide is incorporated is measured by a pH meter
- Small machine, low price per run
Applications of NGS
Some applications of NGS
- Whole genome re-sequencing
- Ancient genomes
- Metagenomics
- Cancer genomics
- Exome sequencing (targeted)
- RNA sequencing
- ChIP-seq
- Epidemiology
- ...
Anything with DNA