Sample to Insight. Dr. Bhagyashree S. Birla NGS Field Application Scientist


1 Dr. Bhagyashree S. Birla NGS Field Application Scientist

2 NGS spans a broad range of applications
DNA applications
  Human ID
  Liquid biopsy
  Biomarker discovery
  Inherited and somatic SNP
  Microsatellite instability
  Tumor mutation burden
  Methylomics (gene regulation)
  Single cell
RNA applications
  Gene expression
  Pathway analysis
  Biomarker discovery
  Allele specific expression
  Fusion genes in cancer
  Tumor heterogeneity
  Immune repertoire
  Single cell

3 QIAGEN NGS Portfolio
QIAseq DNA RNA WGS WES Metagenome Targeted WTS miRNA/small RNA Targeted Methylome Single cell SNP in/del TMB MSS Methyl mRNA/lncRNA Single cell Gene fusions T cell receptor

4 The problem with patient samples
Low purity: cancer cells may be only a minor fraction of the total sample.
Heterogeneity: multiple sub-clones of cancer may be present in one tumor sample.
Rare targets: 1000x coverage is required to get >90% sensitivity to detect a ~5% mutation frequency.
Whole genome / exome sequencing is expensive and will not yield sufficient coverage.
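The link between coverage and sensitivity can be explored with a simple binomial model. This is a minimal sketch, not the calculation behind the slide's figures: it assumes every read independently carries the variant with probability equal to the mutation frequency, ignores sequencing error, and uses a hypothetical caller threshold `min_alt_reads`.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf).

    Assumption: reads are independent and error-free; min_alt_reads is a
    hypothetical caller threshold, not a value taken from the slides.
    """
    # Probability of seeing fewer variant reads than the threshold.
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    for depth in (30, 100, 1000):
        p = detection_probability(depth, vaf=0.05, min_alt_reads=5)
        print(f"depth {depth:>4}x: P(detect 5% variant) = {p:.3f}")
```

Raising the threshold to guard against artifacts pushes the required depth up, which is why rare variants call for deep, targeted coverage.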

5 Targeted sequencing
What is targeted sequencing?
  Sequencing a region or subset of the genome or transcriptome
Why targeted sequencing?
  Not all regions of the genome or transcripts are relevant to a specific study
  Exome sequencing: most of the coding regions of the genome (exome). The protein-coding region constitutes less than 2% of the entire genome
  Focused panel / hot spot: focused on the genes or regions of interest, e.g. clinical relevance: tumor suppressor genes, inherited mutations
What are the advantages of targeted sequencing?
  More coverage per sample, more sensitive detection
    1 gene copy ~ 3 pg, so ~3000 copies in 10 ng
    Heterogeneous sample: 1% tumor cells = ~30 copies in 10 ng
    Every base is not covered equally in a typical NGS experiment
  More samples per run, lower cost per sample
  Compatible with less-than-ideal sample quality: biofluids, FFPE
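The copy-number arithmetic on this slide can be reproduced in a few lines. The sketch below takes the slide's approximation of about 3 pg per gene copy as its only input; the function names are illustrative.

```python
PG_PER_COPY = 3.0  # slide's approximation: 1 gene copy ~ 3 pg


def gene_copies(input_ng: float, pg_per_copy: float = PG_PER_COPY) -> float:
    """Approximate number of gene copies in a DNA input given in nanograms."""
    return input_ng * 1000.0 / pg_per_copy  # 1 ng = 1000 pg


def tumor_copies(input_ng: float, tumor_fraction: float,
                 pg_per_copy: float = PG_PER_COPY) -> float:
    """Copies contributed by tumor cells when they make up tumor_fraction."""
    return gene_copies(input_ng, pg_per_copy) * tumor_fraction


if __name__ == "__main__":
    print(round(gene_copies(10)))         # copies in 10 ng
    print(round(tumor_copies(10, 0.01)))  # copies from a 1% tumor fraction
```

With 3 pg per copy, 10 ng works out to about 3,333 copies and a 1% tumor fraction to about 33, which the slide rounds to 3000 and 30.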

6 QIAGEN Targeted NGS Portfolio
QIAseq DNA RNA WGS WES Metagenomic Targeted WTS miRNA/small RNA Targeted Methylome Single cell QIAseq DNA Methyl QIAseq RNA QIAseq UPX QIAseq RNAscan T cell receptor

7 Benefits of targeted sequencing
Targeted DNA sequencing delivers accurate information required for precision medicine

Attribute/parameter          Whole genome   Whole exome   Targeted DNA   Benefits of targeted DNA
Information level            3 x 10^9 bp    5 x 10^7 bp   6 x 10^4 bp    More relevant data
Cost per sample              $5000          $2000         $200           More cost effective
Coverage achieved            30x            100x          1000x          Detect low-frequency mutations
DNA input                    1 µg           ng            10 ng          Lower DNA requirements
No. of samples multiplexed                                               Higher multiplexing capabilities

Targeted sequencing reduces cost per sample and required DNA input
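One way to read the table is to multiply target size by coverage to get the number of bases that must be sequenced per sample. The sketch below uses the table's own figures and assumes perfectly uniform, fully on-target coverage, which real runs do not achieve; the names are illustrative.

```python
# (target size in bp, coverage) taken from the table above
APPROACHES = {
    "whole genome": (3_000_000_000, 30),
    "whole exome": (50_000_000, 100),
    "targeted DNA": (60_000, 1000),
}


def bases_required(target_bp: int, coverage: int) -> int:
    """Bases to sequence per sample, assuming uniform on-target coverage."""
    return target_bp * coverage


if __name__ == "__main__":
    for name, (target_bp, coverage) in APPROACHES.items():
        print(f"{name:>12}: {bases_required(target_bp, coverage):.1e} bases")
```

Under these assumptions the whole genome at 30x needs 9 x 10^10 bases while the targeted panel at 1000x needs 6 x 10^7, a 1,500-fold difference that is available for multiplexing more samples per run.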

8 Typical challenges in targeted DNA sequencing
Inability to detect low-frequency mutations
  PCR and sequencing errors
  Limits sensitivity and accuracy of calling low-frequency variants
  o Doesn't let you confidently call variants down to 1% variant allele frequency (VAF)
Inefficient enrichment and sequencing of GC-rich regions
  Suboptimal, GC-rich region-incompatible PCR chemistry
  Limits comprehensiveness of panel coverage
  o Doesn't let you efficiently sequence clinically relevant genes such as CEBPA or CCND1, or clinically relevant regions such as the TERT promoter
Suboptimal uniformity of enrichment and sequencing
  Conventional PCR protocols and two-primer amplicon design
  Increases variability in coverage across targeted genomic regions
  o Causes you to over-sequence to accommodate the under-sequenced regions
  o Doesn't let you call variants in low-depth regions
Mainly due to intrinsic limitations of PCR amplification approaches.

9 Boosting NGS sensitivity with error correction
UMI: unique molecular index
  A tag (barcode) to identify unique DNA molecules
  NNNNNNNNNNNN (12 nucleotides long)
  Incorporate this random barcode (signature) into the original DNA or RNA molecules before amplification to preserve their uniqueness
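A 12-nucleotide random barcode gives a large space of possible tags. The sketch below counts that space and draws a random UMI; it assumes all four bases are equally likely at every position, and the function names are illustrative.

```python
import random

UMI_LENGTH = 12  # slide: 12 random nucleotides


def umi_space(length: int = UMI_LENGTH) -> int:
    """Number of distinct UMIs of the given length (4 bases per position)."""
    return 4 ** length


def random_umi(length: int = UMI_LENGTH, rng=None) -> str:
    """Draw one UMI uniformly at random (assumption: unbiased synthesis)."""
    if rng is None:
        rng = random.Random()
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    print(umi_space())   # 16777216
    print(random_umi())
```

With 4^12 = 16,777,216 possible tags, two different molecules at the same locus are unlikely to receive the same UMI by chance.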

10 Boosting NGS sensitivity with error correction
PCR and sequencing errors (artifacts) limit variant calling accuracy
Conventional targeted DNA sequencing, EGFR exon 21
  A mutation is seen in 1 out of 100 reads that map to EGFR exon 21. Cannot accurately tell whether the mutation is:
  1. A PCR or sequencing error (artifact) / false positive, or
  2. A true low-frequency mutation of 1%
Variant calling based on non-unique reads does not reflect the mutational status of the original DNA molecules

11 Implementing molecular barcoding in NGS
Count and analyze single original molecules (not total reads) = digital sequencing
Conventional targeted DNA sequencing, EGFR exon 21
  Five reads or library fragments that look exactly the same. Cannot tell whether they represent:
  1. Five unique DNA molecules, or
  2. Five reads of the same DNA molecule (PCR duplicates)
Digital sequencing with UMIs (UMIs attached before any amplification)
  Five different UMIs: five unique DNA molecules
  One shared UMI: quintuplets of the same DNA molecule (PCR duplicates)
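Counting molecules instead of reads amounts to grouping reads by their UMI. A minimal sketch follows, assuming each read is reduced to a hypothetical `(umi, start_position)` tuple and that UMIs are read without error; real pipelines also have to tolerate errors inside the UMI itself.

```python
from collections import Counter


def count_molecules(reads):
    """Group reads into UMI families.

    reads: iterable of (umi, start_position) tuples (illustrative format).
    Returns (number of unique molecules, Counter of reads per family).
    """
    families = Counter(reads)
    return len(families), families


if __name__ == "__main__":
    # Five identical-looking reads, each with its own UMI: five molecules.
    unique = [("AAAC", 100), ("CCGT", 100), ("GGTA", 100),
              ("TTAG", 100), ("ACGT", 100)]
    # Five reads sharing one UMI: PCR duplicates of a single molecule.
    duplicates = [("AAAC", 100)] * 5

    print(count_molecules(unique)[0])      # 5
    print(count_molecules(duplicates)[0])  # 1
```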

12 Achieve accurate variant calling with molecular barcodes
Count and analyze single original molecules (not reads) = digital sequencing
Conventional targeted DNA sequencing, EGFR exon 21
  A mutation is seen in 1 out of 5 reads that map to EGFR exon 21. Cannot accurately tell whether the mutation is:
  1. A PCR or sequencing error (artifact) / false positive, or
  2. A true low-frequency mutation
Digital sequencing with UMIs (UMIs attached before any amplification)
  False variant: present in only some fragments carrying the same UMI
  True variant: present in all fragments carrying the same UMI
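The rule on this slide (a true variant appears in every read of a UMI family, an artifact in only some) can be sketched as a strict-unanimity filter. This is an illustration under stated assumptions, not the QIAseq analysis pipeline: reads are hypothetical `(umi, base)` pairs at a single position, and `min_family_size` is an assumed parameter that drops families read only once.

```python
from collections import defaultdict


def variant_molecules(reads, alt_base, min_family_size=2):
    """Count molecules supporting alt_base at one position.

    reads: iterable of (umi, base) pairs (illustrative format).
    A molecule supports the variant only if every read in its UMI family
    carries alt_base. Families smaller than min_family_size are ignored.
    Returns (supporting molecules, usable molecules).
    """
    families = defaultdict(list)
    for umi, base in reads:
        families[umi].append(base)

    usable = {u: b for u, b in families.items() if len(b) >= min_family_size}
    supporting = sum(
        all(base == alt_base for base in bases) for bases in usable.values()
    )
    return supporting, len(usable)


if __name__ == "__main__":
    reads = [
        ("AAAC", "T"), ("AAAC", "T"), ("AAAC", "T"),                 # true variant
        ("CCGT", "G"), ("CCGT", "G"), ("CCGT", "T"), ("CCGT", "G"),  # artifact
        ("GGTA", "G"), ("GGTA", "G"),                                # reference
        ("TTAG", "T"),                                               # singleton
    ]
    print(variant_molecules(reads, alt_base="T"))  # (1, 3)
```

The singleton family is excluded because a UMI read only once offers nothing to compare against.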

13 UMIs in the QIAGEN NGS Portfolio
The use of UMIs increases sensitivity in targeted DNAseq and quantification accuracy in targeted RNAseq
QIAseq DNA RNA WGS WES Metagenomics Targeted WTS miRNA/small RNA Targeted Whole Methylome DNA Methyl mRNA/lncRNA Gene fusions Immune
WGS, WES, WTS and metagenomics are usually too shallow to benefit from UMIs
UMIs need to be read more than once for error correction to be effective

14 SPE technology: DNA variant analysis
Enzymatic fragmentation
SPE:
  More tolerant of fragmentation than PCR
  Random fragmentation yields unique molecules
  More tolerant of large primer pools than PCR
  Easier to optimize
  Unmatched uniformity

15 Specifications of QIAseq targeted DNA panels
Somatic and germline SNP, in/del, CNV, MSI, TMB, mitochondrial genome
DNA input: As little as 10 ng DNA
Primer multiplexing level: Up to 20,000 primers per single reaction
Enrichment technology: Single primer extension (SPE) with UMI
Amplicon size: Average 150 bp (tunable enzymatic fragmentation)
Sample multiplexing: 384 (Illumina), 96 (Ion Torrent)
Total workflow (DNA -> library): 8 hours
Variant allele or fusion frequency called: 0.5% across entire panel

16 QIAseq DNA panel performance
Panel                          Panel size (bases)   Uniformity (0.2x mean base %)
Pharmacogenomics Panel         3,
Actionable Solid Tumor Panel   15,
BRCA1 And BRCA2 Panel          16,
Mitochondria Panel             16,
BRCA1 And BRCA2 Plus Panel     25,
Colorectal Cancer Panel        215,
Lung Cancer Panel              318,
Breast Cancer Panel            370,
Myeloid Neoplasms Panel        436,
Comprehensive Cancer Panel     836,
Inherited Diseases Panel       838,
TMB Panel                      1,500,
High uniformity is critical for low level mutation calling

17 Content for a wide range of applications
Cancer
  Breast cancer
  Colorectal cancer
  Myeloid Neoplasms
  Lung cancer
  BRCA1 & BRCA2
  BRCA1 & BRCA2 plus
  Actionable solid tumor
  Comprehensive cancer
Pharmacogenomics
Mitochondria
Inherited disease
Custom
  Your own: genes, exons, hotspots/SNPs, intronic regions
Extendable
  Easily increase panel content with SPE primers
QIAseq targeted DNA panels; Raed N. Samara, PhD

18 QIAseq Targeted RNA Sequencing
RNA NGS logistics
  Dynamic range: 1 to 100,000 copies of any one RNA in a cell
  With a 10^5 range, the read budget has to be high enough to capture targets of interest
  Targeted RNA NGS allows the rational design of a panel to maximize value and throughput
Samples for RNAseq: cells/tissue, biofluids, FFPE, biopsy, single cell, CTC
Further complexities
  Even within a pure cell population there can be significant cell-to-cell variability in gene expression.
  Bulk analysis of RNA expression represents a steady-state average of a complex, dynamic sample.
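The read-budget point can be made concrete with a proportional model. The sketch below assumes reads are distributed in proportion to transcript copy number, with no length or amplification bias; the transcript names and copy numbers are hypothetical.

```python
import math


def expected_reads(copies: dict, total_reads: int) -> dict:
    """Expected reads per transcript if reads are proportional to copies."""
    total_copies = sum(copies.values())
    return {name: total_reads * c / total_copies for name, c in copies.items()}


def reads_needed(copies: dict, target: str, min_reads: int) -> int:
    """Total reads needed so the target transcript expects min_reads reads."""
    total_copies = sum(copies.values())
    return math.ceil(min_reads * total_copies / copies[target])


if __name__ == "__main__":
    # Hypothetical sample spanning the 10^5 dynamic range from the slide.
    sample = {"abundant_transcript": 100_000, "rare_transcript": 1}
    print(reads_needed(sample, "rare_transcript", min_reads=10))  # 1000010
```

In this toy case about a million reads are spent on the abundant transcript to collect ten reads of the rare one. A targeted panel that leaves out uninformative abundant transcripts frees that budget for the targets of interest.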

19 Targeted gene expression: Why NGS?
HEK293T cells treated with 90 different small molecule inhibitors.
One day from RNA to sequence-ready libraries for 96 samples.
Overnight run on NextSeq.
Workflow: treated cells -> RNA -> indexed libraries -> normalized, pooled libraries

20 QIAseq RNAscan: fusion gene detection and classification
A fusion gene is a hybrid gene formed from two previously separate genes. It can occur as a result of translocation, interstitial deletion, or chromosomal inversion.

21 RNAscan workflow: cDNA synthesis, then SPE

22 QIAseq targeted NGS: a complete solution
Sample isolation -> Library construction and target enrichment -> NGS run -> Data analysis -> Interpretation
RNA panels: Biomarker development and pathway analysis
DNA panels: Mutations, indels, copy number variants, Human ID
RNAscan: Fusions in disease
QIAseq UPX: Single cell high throughput analysis

23 Thank You!