Illumina's Suite of Targeted Resequencing Solutions

1 Illumina's Suite of Targeted Resequencing Solutions
Colin Baron, Sr. Product Manager, Sequencing Applications
© 2011 Illumina, Inc. All rights reserved. Illumina, illuminaDx, Solexa, Making Sense Out of Life, Oligator, Sentrix, GoldenGate, GoldenGate Indexing, DASL, BeadArray, Array of Arrays, Infinium, BeadXpress, VeraCode, IntelliHyb, iSelect, CSPro, GenomeStudio, Genetic Energy, HiSeq, HiScan, TruSeq, Eco, MiSeq and Nextera are registered trademarks or trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.

2 TruSeq Sample Prep Solutions
Integrated workflow from sample to analyzed data
TruSeq DNA: Simple, scalable, and cost-effective
TruSeq RNA: Optimized, gel-free, low input
TruSeq Custom & Exome Enrichment: Lowest cost and most scalable targeted resequencing
TruSeq Small RNA: High-throughput miRNA discovery & profiling
Nextera: Low input, fast
TruSeq Chemistry: Clustering & Sequencing

3 TruSeq Targeted Resequencing
The simplest and most scalable targeted resequencing solutions
TruSeq Exome Enrichment: Targets = 100,000s
TruSeq Custom Enrichment: Targets = 1000s
Multiplexed Amplicons: Targets = 100s
PCR Amplicons: Targets = 10s

4 TruSeq Targeted Resequencing
A broad suite of tools for discovery or validation experiments
Option | Amount of sequence | Best for | Availability
TruSeq Exome Enrichment | ~62 Mb | Mendelian disease: case-control exome studies, rarer variants, causal variants, exome-wide linkage analysis | Now!
TruSeq Custom Enrichment | ~1 to ~10 Mb | GWAS follow-up: validation of variants, variant discovery, pathways | Mid-2011
TruSeq Custom Amplicon | Sub-500 Kb | Amplicon sequencing: high-throughput CE experiments, ultra-deep sequencing, variant discovery, screening | 2H 2011
Nextera + PCR Amplicons | 100s of bp targets | Amplicon sequencing: ultra-deep sequencing, validation, screening, CE replacement | Now!

5 Exome Sequencing
Science Magazine's Top Breakthroughs of 2010
Quantum motion machine
Synthetic Biology
Neanderthal Genome
HIV Prophylaxis
Exome Sequencing/Rare Disease Genes
Molecular Dynamics Simulations
Quantum Simulator
Next-Generation Genomics
RNA Reprogramming
The Return of the Rat
Published on December 16,

6 Exome Approach Success Evident by the Number of Publications
Dramatic increase in number of exome publications over the past 3 years
Major focus has been on study of Mendelian disease
2008: 1 publication
2009: 6 publications
2010: 66 publications
Huge increase in number of variants

7 TruSeq Exome Enrichment
Pre-enrichment pooling and comprehensive coverage for the most cost-effective exome
Most comprehensive exome solution
High coverage uniformity
Lowest DNA input
Plate-based processing for up to 96 samples
Simple & scalable workflow
Launch of TruSeq Exome
Pre-enrichment pooling of up to 6 samples
Reduced hands-on time
Decreased costs
Gel-free protocol
Integrated with TruSeq DNA Sample Prep Kits
Optimized workflow
Internal QC controls

8 TruSeq Exome Enrichment Kit
Most up-to-date and comprehensive exome available
Only empirically tested probes are included

9 TruSeq Exome Enrichment Workflow
Three-day assay with <4 hours hands-on time
TruSeq DNA Sample Prep (1 µg starting input)
PCR, cluster generation & sequencing
2 successive rounds of enrichment

10 Internal Quality Controls for TruSeq Exome Enrichment
Internal controls for sample prep and exome enrichment with full software support
Library prep controls: enzymatic activity of End Repair, A-Tailing, and Ligation reactions
CTO (Custom Target Oligo): set of specialized probes (150) in capture pool targeting nonpolymorphic regions across high, medium, and low GC classes; also target known homo- & heterozygous SNPs
CTE1: Control Target End-repair 1
CTE2: Control Target End-repair 2
CTA: Control Target A-tailing
CTL: Control Target Ligation
CTO: Custom Target Oligo

11 Highest Coverage Uniformity
Coverage uniformity for 6-plex sample pooling*
>80% of targeted bases covered at 0.2x of the mean coverage
[Figure: coverage uniformity plot; y-axis: % bases covered]
*HiSeq 2000 run data

12 High On-Target Enrichment
Enrichment rates* for 6-plex sample pooling
*Percentage of reads mapping to target from total reads; HiSeq 2000 run data.
+/- 150bp includes flanking up- & downstream regions of target

13 Probe-fragment-library design
Fully optimized with the most efficient design
Strategically designed to be larger than other commonly used methods
Reduced cost: fewer probes; savings passed on to customer
Better coverage uniformity; can tolerate more variance in fragment size
Fewer issues with problematic regions, e.g., GC content
The summed length of ~340K x 95-mer probes is only ~32 Mb, yet the enrichment actually targets 62Mb of the human genome, or 117.5Mb if the 150 bases up- and downstream are taken into account
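The probe arithmetic on this slide can be reproduced in a few lines. This is only an illustrative sketch using the rounded figures quoted above; the function and constant names are hypothetical and not part of any Illumina software.

```python
# Probe footprint versus targeted sequence, using the rounded slide figures.
NUM_PROBES = 340_000            # "~340K" probes
PROBE_LENGTH_BP = 95            # 95-mer probes
TARGET_MB = 62.0                # exome sequence targeted by the kit
TARGET_WITH_FLANKS_MB = 117.5   # including 150 bases up- and downstream


def probe_footprint_mb(num_probes: int, probe_length_bp: int) -> float:
    """Summed probe length in megabases (1 Mb = 1,000,000 bases)."""
    return num_probes * probe_length_bp / 1_000_000


footprint = probe_footprint_mb(NUM_PROBES, PROBE_LENGTH_BP)
bases_per_probe_base = TARGET_MB / footprint

print(f"Summed probe length: {footprint:.1f} Mb")
print(f"Targeted bases per probe base: {bases_per_probe_base:.2f}")
print(f"With flanks: {TARGET_WITH_FLANKS_MB / footprint:.2f}")
```

The ratio above 1 reflects the slide's point that captured fragments extend beyond the probes themselves, so fewer probe bases are needed than targeted bases.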

14 Multiplexed sample enrichment
Huge time and cost savings
Developed and optimized for 6 samples per enrichment reaction
Huge increase in throughput
Massively reduced FTE time
No impact on ability to call variants

15 Normalized coverage plots
Determine amount of sequencing needed to achieve desired coverage, e.g. 90% of bases covered at 10x
0.2 = Desired Coverage / Mean Coverage
[Figure: normalized coverage plot; x-axis: Norm. Coverage]

16 Calculating sequence amount needed
(Sum Regions Size x Coverage) / (Enrichment % x Alignment rate) = PF Gb
E.g. 50x TruSeq Exome: (62Mb x 50 / 0.65) / 0.90 = 5.3Gb
2.4 exomes / lane at 200Gb = [...]
[...] exomes / lane at 600Gb = 115
Breakdown of TruSeq Exome Costs*
Item | 200Gb | 600Gb
Library Prep | $54 | $54
Enrichment | $300 | $300
Combo total | $354 | $354
Cluster Gen | $259 | $87
Sequencing (2 x 100 bp) | $359 | $120
Total Per Exome | $972 | $
*50x coverage per sample, processed on HiSeq at 2 x 100bp; list price.
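The output formula and the cost breakdown on this slide can be checked with a short Python sketch. It is not an Illumina tool: the function and variable names are hypothetical, and the 600Gb total it prints is simply the sum of the line items listed on the slide.

```python
def required_pf_gb(target_mb, mean_coverage, enrichment, alignment_rate):
    """Pass-filter (PF) output in Gb needed for one sample.

    Implements the slide's formula:
    (sum of region sizes x coverage) / (enrichment % x alignment rate).
    """
    bases_needed = target_mb * 1e6 * mean_coverage
    return bases_needed / (enrichment * alignment_rate) / 1e9


# Figures quoted on the slide: 62 Mb exome, 50x coverage,
# 65% of reads on target, 90% alignment rate.
pf_gb = required_pf_gb(62, 50, 0.65, 0.90)
print(f"PF output per 50x exome: {pf_gb:.1f} Gb")

# Per-exome list-price line items (USD) for the two run outputs on the slide.
costs = {
    "200Gb": {"library_prep": 54, "enrichment": 300,
              "cluster_gen": 259, "sequencing": 359},
    "600Gb": {"library_prep": 54, "enrichment": 300,
              "cluster_gen": 87, "sequencing": 120},
}
totals = {run: sum(items.values()) for run, items in costs.items()}
for run, total in totals.items():
    print(f"{run}: ${total} per exome (sum of listed items)")
```

Library prep and enrichment are fixed per sample, so the drop in per-exome cost at higher run output comes entirely from the cluster generation and sequencing items.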

17 TruSeq Exome Data Analysis
Alignment and variant calling with CASAVA
Target statistics, graphs: enrichment %, read distribution, coverage, normalized coverage plots, controls
GenomeStudio integration for visualization

18 Coming Soon! TruSeq Custom Enrichment Kits
Same proven technology as in the TruSeq Exome Enrichment Kits
Target 1 to 10 Mb of DNA per sample
Highest enrichment efficiency and coverage uniformity
Leverages Illumina's expertise in oligo production
Interactive online design software
High coverage of targeted regions
Pre-enrichment sample pooling
Up to 12 samples per enrichment reaction
Reduced hands-on time; increased throughput
Integrated with TruSeq DNA Sample Prep Kits
Fully optimized workflow
Most cost-effective solution available
Early access program has begun! First custom order expected summer 2011!

19 Coming Soon! TruSeq Custom Amplicon Sequencing
Highly multiplexed, targeted amplicon resequencing
Fully customized target probes and capture
Based upon GoldenGate technology
Interactive probe design and ordering
Streamlined user interface
Rapid probe turnaround
Rapid & economical amplicon sequencing
Up to 384 amplicons per sample
Plate-based processing; 96 samples per plate
Assay time < 8 hours
No additional hardware requirements
Up to 10x more cost-effective than CE*
*Based on example study of 96 samples and 384 targets

20 TruSeq Custom Amplicon Assay Time
96 samples & 384 targets: from DNA to called variants in ~2 days
Hybridization Setup: Oligos, universal reagents
Assay Biochemistry: Extension & ligation, PCR
Library Normalization: Create pooled library, normalize
Cluster Gen & Sequencing: Pre-kitted sequencing reagents
Real-time Analysis: Alignments, variant calling
Timeline markers: 8am Day 1, 2pm Day 1, 5pm Day 2
<8 hr assay with <3 hr hands-on time
No fragmentation required
No gel purification steps
No additional hardware

21 Simplest workflow and most convenient TRS offering
One-stop shop for entire targeted resequencing workflow
Fully integrated, end-to-end solution including probe design, sample prep, enrichment, sequencing, and data analysis
Multiplexed sample enrichment (up to 12)
Master-mixed formulations and plate-based processing for up to 96 samples
Internal quality controls for each assay step from library prep through enrichment with full software support
[Diagram labels: Support, Sales, Training, Analysis, Sample Prep, Seq, Cluster gen, Enrichment]

22 Epicentre Nextera Technology for Library Prep
Single Tube, Rapid Library Prep
SIMPLE, FAST LIBRARY PREP IN LESS THAN 2 HOURS
Transposon-mediated library preparation
Closed-tube DNA fragmentation
Ultra-low input requirements (50 ng)
ENABLES A RANGE OF CE AND NGS APPLICATIONS
VALIDATED BY LEADING RESEARCHERS

23 TruSeq Sample Prep Kits for RNA & DNA
Master-mixed formulations & gel-free RNA protocol
Universal adapter design with embedded index
Plate-based processing up to 96 samples; volumes optimized for liquid handling
Low price, all-inclusive kit
Simple workflow with minimal pipetting and clean-up steps
Flexible design: one kit for single, paired-end, and mate-paired reads
Robust indexing solution
High-throughput and automation friendly
Convenient, one-stop shop
Economical large-scale studies
Internal quality controls
Sample prep success monitoring with software support

24 A Sequencer for Every Need. Every Budget. Every Lab.
Redefining the trajectory of sequencing.
Powerful. Flexible. Scalable.
Two proven technologies. One powerful platform.
The most widely cited platform, now at half the price.
My Samples. My Study. MiSeq
Instruments: HiSeq 2000, HiSeq 1000, HiScanSQ, GA IIx, MiSeq

25 Questions?

26 Using coverage uniformity curves to calculate output needed
Inputs:
Size of exome (no. of targeted bases) = 62Mb
Enrichment efficiency (fraction of reads on target) = 65%
Uniformity = 80% of targeted bases at 0.2x mean coverage
Desired minimum % of bases at specified coverage = 80% at 10x
Normalized mean coverage = ???
Mean sequencing coverage can be calculated from normalized coverage plots:
(normalized mean coverage) x (mean sequencing coverage) = 10x
(0.2) x (mean sequencing coverage) = 10x; therefore, mean sequencing coverage = 10x / 0.2 = 50x
The total amount of sequence can be calculated as follows:
Total amount of target sequence = 62Mb
Mean sequencing coverage = 50x
Enrichment efficiency (fraction of reads on target) = 65%
(62Mb) x (50x) / (0.65) = 4.8Gb
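The two-step calculation on this slide can be written out as a small Python sketch. The function names are hypothetical and chosen for illustration; the inputs are the values stated on the slide, and no alignment-rate term is applied because this slide does not use one.

```python
def mean_coverage_needed(desired_coverage, normalized_coverage):
    """Mean coverage such that normalized_coverage x mean = desired coverage."""
    return desired_coverage / normalized_coverage


def total_sequence_gb(target_mb, mean_coverage, on_target_fraction):
    """Total sequence in Gb: target size x mean coverage / on-target fraction."""
    return target_mb * 1e6 * mean_coverage / on_target_fraction / 1e9


# Slide inputs: 10x desired at a normalized coverage of 0.2,
# 62 Mb of targeted bases, 65% of reads on target.
mean_cov = mean_coverage_needed(desired_coverage=10, normalized_coverage=0.2)
total_gb = total_sequence_gb(target_mb=62, mean_coverage=mean_cov,
                             on_target_fraction=0.65)

print(f"Mean sequencing coverage: {mean_cov:.0f}x")
print(f"Total sequence needed: {total_gb:.1f} Gb")
```

Changing `desired_coverage` or `on_target_fraction` shows how sensitive the required output is to the uniformity and enrichment figures.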

27 Optimizing Coverage for Targeted Resequencing
Determining the optimal amount of sequencing to achieve a desired coverage level
TruSeq exome product specification: >80% of bases covered at 0.2x mean coverage
If average/mean coverage is 100x, then >80% of bases are covered at 20x (20 reads per base)
If average/mean coverage is 50x, then >80% of bases are covered at 10x (10 reads per base)
Steps:
Determine number of target bases
Calculate fraction of reads on target
Determine mean sequencing coverage with normalized coverage plots
Calculate required amount of sequencing data
Optimize coverage by leveraging:
TruSeq Exome Scripts
Mean normalized coverage curves
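The product specification translates a mean coverage into a per-base coverage floor by simple multiplication. The sketch below illustrates that relationship with the two examples from the slide; the constant and function names are hypothetical.

```python
# Product specification quoted on the slide:
# more than 80% of bases are covered at 0.2x the mean coverage.
SPEC_NORMALIZED_COVERAGE = 0.2
SPEC_FRACTION_OF_BASES = 0.80


def coverage_floor(mean_coverage, normalized_coverage=SPEC_NORMALIZED_COVERAGE):
    """Coverage depth reached by the specified fraction of targeted bases."""
    return mean_coverage * normalized_coverage


for mean_cov in (50, 100):
    floor = coverage_floor(mean_cov)
    print(f"Mean {mean_cov}x: >{SPEC_FRACTION_OF_BASES:.0%} of bases "
          f"covered at {floor:.0f}x")
```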

28 TruSeq Exome Enrichment Kits Pricing
Catalog Number | Product | Reactions per kit | Samples per kit (6-plex) | Kit Price | Price per sample
FC | TruSeq Exome Enrichment Kit 8 | 8 | 48 | $14,400 | $
FC | TruSeq Exome Enrichment Kit 24 | | | $39,600 | $
FC | TruSeq Exome Enrichment Kit 48 | | | $72,000 | $
FC | TruSeq Exome Enrichment Kit 96 | | | $129,600 | $
FC | TruSeq Exome Enrichment Kit | | | $230,400 | $
FC | TruSeq Exome Enrichment Kit | | | $504,000 | $
FC | TruSeq Exome Enrichment Kit 960 | | | $864,000 | $150
Orders from Nov 3, 2010; shipping from Nov 22
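The per-sample price follows from the kit price, the number of reactions, and 6-plex pooling. The sketch below uses only the two kits whose figures are legible in this transcript, and it assumes the number in the product name ("Kit 960") is the reaction count, as it is for the 8-reaction kit. Function and variable names are hypothetical.

```python
SAMPLES_PER_REACTION = 6  # 6-plex pre-enrichment pooling


def price_per_sample(kit_price_usd, reactions_per_kit,
                     plex=SAMPLES_PER_REACTION):
    """Kit price divided by the number of samples the kit can process."""
    samples_per_kit = reactions_per_kit * plex
    return kit_price_usd / samples_per_kit


# reactions per kit -> kit list price (USD); smallest and largest kits only.
# Assumption: "Kit 960" means 960 reactions per kit.
kits = {8: 14_400, 960: 864_000}

per_sample = {reactions: price_per_sample(price, reactions)
              for reactions, price in kits.items()}
for reactions, price in per_sample.items():
    print(f"{reactions}-reaction kit: ${price:.0f} per sample")
```

The 960-reaction result can be checked against the $150 per sample stated on the slide.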

29 TruSeq Exome Data Analysis
Overview of outputs from TruSeq Exome Scripts: data visualization & QC
GC control probes: Probes selected to have low, medium, or high GC content; independent of the probes targeting the exome
Read distribution: Shows the number of reads around the center of the targeted regions
Coverage & mean coverage levels: Shows the fraction of targeted bases that are covered at a given coverage level; compare data on the same scale from runs that have different mean coverage levels
Control SNPs: Targeted at known SNPs that are not in any targeted region

30 TruSeq Exome Data Analysis
Leverage GenomeStudio software for simple reporting on exome data
BED file seamlessly defines targeted regions
Regions table allows for easy selection of targets
Navigate regions and view annotation data in GenomeStudio's viewer
View multiple samples in single table or IGV window