Insights from the first RT-qPCR based human transcriptome profiling based on wet lab validated assays


1 Slide 1 of 38
Insights from the first RT-qPCR based human transcriptome profiling based on wet lab validated assays
Jan Hellemans, PhD, CEO Biogazelle
qPCR & NGS 2013, Freising, Germany, March 19, 2013

2 Biogazelle Slide 2 of 38

3 The Biogazelle team and collaborators Slide 3 of 38
Biogazelle
- Barbara D'haene
- Pieter Mestdagh
- Gaëlle Van Severen
- Nele Nijs
- Anthony Van Driessche
- Manuel Luypaert
- Shana Robbrecht
- Ariane Deganck
- Jo Vandesompele
Ghent University
- Steve Lefever
VIB Nucleomics Core
Bio-Rad

4 Introduction Slide 4 of 38
qPCR: reference technology for nucleic acid quantification
- sensitivity and specificity
- wide dynamic range
- speed
- relatively low cost
- conceptual and practical simplicity
Easy to perform is not the same as easy to do right
- many steps involved
- all need to be right

5 Prepare cycle report Slide 5 of 38
[Workflow diagram. Labels: experiment design, samples, assays, PCR (prepare, cycle, report), quality control, relative quantification, statistical analysis]

6 Prepare cycle report Slide 6 of 38
[Same workflow diagram as slide 5]

7 Assays & MIQE Slide 7 of 38
design
- amplicon length
- primer positions (exonic or intron-spanning)
- transcript coverage
in-silico
- specificity prediction (retropseudogenes and other homologues)
- secondary structure analysis
wet lab
- specificity assessment (gel, melt, sequence)
- Cq of NTC (for SYBR assays)
- amplification efficiency determination (slope, E, SE(E), r²)

8 Dealing with MIQE Slide 8 of 38
DIY, experts in qPCR
- spend a lot of effort in doing it right
DIY, new to qPCR
- adhering to the MIQE guidelines is a challenge
users of commercial assays
- if they sell it, it must be good

9 Dealing with MIQE Slide 9 of 38
DIY, experts in qPCR
- spend a lot of effort in doing it right → save time
DIY, new to qPCR
- adhering to the MIQE guidelines is a challenge → focus on biological question rather than technical qPCR challenges
users of commercial assays
- if they sell it, it must be good → have proof that it is good

10 The perfect assay Slide 10 of 38
Properties of the perfect assay
- specific for the gene of interest (no off-target amplification)
- detection of all transcript variants
- detection not affected by polymorphisms (no allelic bias or dropout)
- amplification efficiency ~100%
- no gDNA co-amplification
- no primer dimer formation

11 The perfect assay Slide 11 of 38
Some genes cannot have a perfect assay
- no unique sequences (homology with other genes or pseudogenes)
- not a single part of the gene occurs in all transcripts
- regions are excluded because of repeats, secondary structures, SNPs, homology, ...
Make the best possible compromise and report any potential issues
Design → in-silico quality control → lab validation

12 Assay designs Slide 12 of 38
primerXL (UGent)
- database of genomic information
- tools for target region selection

13 Gene sequence fragmentation for target region selection Slide 13 of 38
1 gene, 3 transcripts, 6 fragments (coverage frequency 1 to 3)
[Diagram. Labels: gene, transcript 1, transcript 2, transcript 3]
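
The fragmentation idea on this slide can be sketched in a few lines of Python. This is an illustration only, not the primerXL implementation: the function name `fragment_gene`, the half-open coordinate convention, and the exon coordinates are all assumptions chosen so that the toy gene yields 6 fragments with coverage frequencies between 1 and 3.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), hypothetical coordinates


def fragment_gene(transcripts: Dict[str, List[Interval]]) -> List[Tuple[int, int, int]]:
    """Cut the gene at every exon boundary and count, for each resulting
    fragment, how many transcripts contain it (its coverage frequency)."""
    # Every exon start or end in any transcript is a potential cut point.
    breakpoints = sorted(
        {pos for exons in transcripts.values() for exon in exons for pos in exon}
    )
    fragments = []
    for start, end in zip(breakpoints, breakpoints[1:]):
        # A transcript covers the fragment if one of its exons spans it fully.
        coverage = sum(
            any(s <= start and end <= e for s, e in exons)
            for exons in transcripts.values()
        )
        if coverage > 0:  # drop purely intronic segments
            fragments.append((start, end, coverage))
    return fragments


# Toy gene with three transcripts (illustrative coordinates, not real data).
toy_gene = {
    "transcript 1": [(0, 100), (200, 300), (400, 500)],
    "transcript 2": [(0, 100), (200, 260), (400, 500)],
    "transcript 3": [(50, 100), (400, 450)],
}

for start, end, coverage in fragment_gene(toy_gene):
    print(f"{start:>4}-{end:<4} covered by {coverage} transcript(s)")
```

Fragments with the highest coverage frequency are the natural first candidates for an assay that should detect all transcript variants.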

14 Assay designs Slide 14 of 38
primerXL (UGent)
- database of genomic information
- tools for target region selection
- Primer3-based primer design
- analysis of secondary structures and SNPs in primer binding regions
- specificity prediction (BiSearch)
- relaxation cascade

15 BiSearch specificity prediction Slide 15 of 38
[Figure. Labels: BiSearch loose, BiSearch strict, only the gene of interest]

16 BiSearch specificity prediction Slide 16 of 38
[Figure. Labels: BiSearch loose, BiSearch strict, only the gene of interest (FFAR2)]
reads | seq | gene_list | official_symbol | location
2843 | CATGGCAGTCACCATCTTCTGCTACTGGCGTTTTGTGTGGATCATGCTCTCCCAGCCCCTTGTGGGGGCCCAGAGGCGGCGCCGAGCCGTGGGGCTGGCTGTGGTGACGCTGCTCAATTTCCTGGTGTGCTTCGGACCTTACAGATCGGAA | ENSG | FFAR2 | 19:
1897 | GTAAGGTCCGAAGCACACCAGGAAATTGAGCAGCGTCACCACAGCCAGCCCCACGGCTCGGCGCCGCCTCTGGGCCCCCACAAGGGGCTGGGAGAGCATGATCCACACAAAACGCCAGTAGCAGAAGATGGTGACTGCCATGAGATCGGAA | ENSG | FFAR2 | 19:
1535 | GTAAGGTCCGAAGCACACCGAGAGCTGGGAGCAGGAGCTACACAGTCTGCTGGCCTCACTGCACACCCTGCTGGGGGCCCTGTACGAGGGAGCAGAGACTGCTCCTGTGCAGAATGAAGGCCCTGGGGTGGAGATGCTGCTGTCCTCAGAA | ENSG | AC | :
1097 | CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGG | ENSG | AC | :
1091 | CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGGT | ENSG | AC | :

17 Gene homology prevents perfect designs Slide 17 of 38
[Histogram. x-axis: distances (ClustalW) between all genes without perfect design; y-axis: 0% to 50%]
75% of the 2043 genes without a perfect design have homologous genes that differ by less than 12.5% (2 variations per 16 bases)

18 Wet lab validation Slide 18 of 38
PCR composition
- total volume: 5 µl
- instrument: CFX-384 (with automation)
- mastermix: SsoAdvanced SYBR
- primer concentration: 250 nM each
PCR program
- default cycling protocol for SsoAdvanced SYBR (Ta = 60 °C)
Samples
- cDNA: 25 ng (total RNA equivalents, Agilent Universal Human Reference RNA)
- gDNA: 2.5 ng (Roche)
- NTC: water + carrier (5 ng/µl yeast transfer RNA)
- synthetic template (pooled 60-mers in concentration range: 2E7 to 2E1 copies)

19 Some numbers Slide 19 of 38
lab validation of assays (human and mouse)
reactions
PCR plates (384-well)
equivalent to PCR plates (96-well)
172m

20 Two generations of external oligonucleotide standards Slide 20 of 38
Vermeulen et al., Nucleic Acids Research, 2009
- 55-mer
- standard desalted
- 3′ blocked to prevent elongation
- 5-point dilution series: molecules > 15 molecules
[Diagram. Labels: FP, stuffer, RCRP]
New approach: easier + cheaper + as good
- 60-mer
- first (5′) and last (3′) 30 nucleotides of amplicon sequence
- standard desalted
- no 3′ blocking
- 7-point dilution series: > 20 molecules
[Diagram. Labels: 30 nt 5′, 30 nt 3′]
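
The construction of the newer 60-mer standard (first and last 30 nucleotides of the amplicon) is simple enough to sketch in code. The function name `synthetic_template` and the placeholder amplicon are hypothetical, and the sketch assumes the amplicon is given 5′ to 3′ on the forward strand.

```python
def synthetic_template(amplicon: str, flank: int = 30) -> str:
    """Return a short standard made of the first and last `flank`
    nucleotides of the amplicon, dropping the internal sequence."""
    amplicon = amplicon.upper()
    if len(amplicon) < 2 * flank:
        raise ValueError("amplicon is shorter than the two flanks combined")
    return amplicon[:flank] + amplicon[-flank:]


# Placeholder 80-nt amplicon (not a real assay sequence).
amplicon = "ACGT" * 20
template = synthetic_template(amplicon)
print(len(template), template)
```

Because only the two ends of the amplicon are kept, primers binding within those ends still amplify the oligo, while the product is shorter than the natural amplicon.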

21 Synthetic templates are equivalent to natural templates Slide 21 of 38
comparison between short ss synthetic template and full length ds template
- > 300 assays
[Table. Columns: ds template, ss oligo. Rows: r² <, median E, average E, count E <> [ ] (1, 3), paired t-test p-value]

22 Efficiency evaluation Slide 22 of 38
amplification efficiency
- 6 orders of magnitude
- 20 to 20M copies
- linear over entire range
- LOD (LOQ) 20 molecules
- E in % range
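
Efficiency over a dilution series like this one is commonly obtained by regressing Cq on log10(copies) and converting the slope with E = 10^(-1/slope) - 1. The sketch below is an illustration, not the pipeline used for these assays: the function name `standard_curve` is hypothetical, the Cq values are invented, and SE(E) is propagated from the slope's standard error with a first-order (delta method) approximation, which is an assumption.

```python
import math
from typing import Dict, Sequence


def standard_curve(copies: Sequence[float], cq: Sequence[float]) -> Dict[str, float]:
    """Least-squares fit of Cq against log10(copies); returns slope,
    intercept, r2, efficiency E (1.0 means 100%) and their standard errors."""
    if len(copies) != len(cq) or len(copies) < 3:
        raise ValueError("need at least three paired points")
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in cq)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, cq))
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    base = 10 ** (-1 / slope)  # amplification factor per cycle
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1 - ss_res / syy,
        "E": base - 1,
        # Delta method (assumption): |dE/dslope| * SE(slope)
        "SE_E": base * math.log(10) / slope ** 2 * se_slope,
        "SE_slope": se_slope,
    }


# 7-point tenfold series from 2E7 down to 2E1 copies; Cq values are invented.
copies = [2e7, 2e6, 2e5, 2e4, 2e3, 2e2, 2e1]
cq = [13.1, 16.5, 19.8, 23.2, 26.4, 29.9, 33.1]
for name, value in standard_curve(copies, cq).items():
    print(f"{name}: {value:.4f}")
```

A slope of about -3.32 corresponds to E = 1 (100%), so steeper slopes indicate lower efficiency.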

23 Efficiency distribution (n = ) Slide 23 of 38
[Histogram. Label: 89%]

24 Efficiency distribution (n = ) Slide 24 of 38
[Histogram. Labels: redesign, 89%, redesign]

25 NGS as preferred method for specificity assessment Slide 25 of 38
amplicon sizing (+ melt analysis for SYBR assays)
- limited sensitivity for detecting low level non-specific coamplification
- failure to observe non-specific amplification of sequences with similar size and/or Tm, e.g. expressed pseudogenes or homologous genes
Next level of specificity assessment
- in-silico specificity predictions by BiSearch
- massively parallel sequencing of pooled PCR products
- average coverage > 1000-fold → lab specificity > 99.9%
- times more sensitive than size analysis and Sanger sequencing
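
One way to read the link between 1000-fold coverage and the 99.9% figure: with 1000 reads per amplicon, a single off-target read corresponds to 0.1% of the product, so specificity can be resolved down to that level. The sketch below illustrates this under that assumption. The function names and read counts are hypothetical and do not come from the study.

```python
from typing import Dict


def on_target_fraction(read_counts: Dict[str, int], target: str) -> float:
    """Fraction of mapped reads assigned to the intended target gene."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads for this assay")
    return read_counts.get(target, 0) / total


def smallest_detectable_fraction(coverage: int) -> float:
    """Smallest off-target fraction that shows up as one full read."""
    return 1 / coverage


# Hypothetical read counts per gene for one assay (placeholder gene names).
counts = {"GENE_A": 990, "GENE_B": 10}
print(f"on-target: {on_target_fraction(counts, 'GENE_A'):.1%}")
print(f"resolution at 1000x: {smallest_detectable_fraction(1000):.1%}")
```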

26 Most assays are 100% on-target Slide 26 of 38

27 2/3 of non-specific assays may go unnoticed without NGS Slide 27 of 38
[Bar chart. Axes: % on-target (0% to 100%); assays with off-target reads (0% to 60%), binned by on-target fraction x]

28 MIQE compliant PrimePCR assay validation data sheet Slide 28 of 38

29 MAQC Slide 29 of 38
Micro-array quality control study
- A = universal RNA
- B = brain RNA
- C = ¾ A + ¼ B
- D = ¼ A + ¾ B
- ~ 100,000 PCRs
- reproducibility
- titration response
- accuracy
- dynamic range
[Diagram. Labels: A, C, D, B]

30 Reproducibility (n = 3 678) Slide 30 of 38
Expression A = expression B → expression C = expression D

31 Titration response (n = ) Slide 31 of 38
Expression A > expression B → A > C > D > B
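
The expected ordering follows directly from the MAQC mixing ratios (C = ¾ A + ¼ B, D = ¼ A + ¾ B). The sketch below shows the check on linear-scale expression values. The function names and the example values are hypothetical.

```python
from typing import Tuple


def expected_mixtures(a: float, b: float) -> Tuple[float, float]:
    """Expected linear-scale expression in samples C and D given A and B."""
    c = 0.75 * a + 0.25 * b
    d = 0.25 * a + 0.75 * b
    return c, d


def has_titration_response(a: float, c: float, d: float, b: float) -> bool:
    """True when measured expression follows A > C > D > B."""
    return a > c > d > b


# Invented expression values (arbitrary linear units) for a gene with A > B.
a, b = 8.0, 2.0
c, d = expected_mixtures(a, b)
print(c, d, has_titration_response(a, c, d, b))
```

Measured values that break this order for a gene with A > B point to a problem with either the assay or the measurement.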

32 Accuracy (n = 785) Slide 32 of 38
No expression in A or B → C/D = 3 or 1/3 → dCq = 1.58
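
The 1.58 value is log2(3): when a gene is expressed in only one of A or B, samples C and D contain it in a 3:1 (or 1:3) ratio, and at 100% efficiency each cycle corresponds to a twofold change. A short sketch of that arithmetic follows. The function name is hypothetical and the efficiency parameter is an added assumption.

```python
import math


def dcq_from_ratio(ratio: float, efficiency: float = 1.0) -> float:
    """Absolute Cq difference expected for a given expression ratio,
    assuming an amplification factor of (1 + efficiency) per cycle."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return abs(math.log(ratio, 1 + efficiency))


# Gene expressed in A only: C holds 3/4 and D holds 1/4 of the A level.
print(round(dcq_from_ratio(0.75 / 0.25), 2))
# Gene expressed in B only: the ratio inverts, the magnitude stays the same.
print(round(dcq_from_ratio(0.25 / 0.75), 2))
```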

33 Dynamic range Slide 33 of 38
[Histogram, human and mouse. Axes: gene count, copies per cell. Annotation: > fold]

34 qPCR detects more brain genes than micro-arrays (n = 12 853) Slide 34 of 38
[Figure. Labels: array, qPCR]

35 SEQC Slide 35 of 38
Comparison of our qPCR data to SEQC results
- correlation for common genes

36 qPCR vs sequencing depth Slide 36 of 38
[Figure. Labels: qPCR, NGS]
Sum of detected copies of measured protein coding genes: ~500 million
high throughput gene expression profiling: 125 indexed samples (two concurrent HiSeq flow cells at about 25 million reads per sample)
high resolution transcriptome analysis: more than 50 samples at 100 million reads per sample

37 Human vs mouse Slide 37 of 38
[Histogram. Axes: gene count, dCq (human-mouse)]

38 Conclusions Slide 38 of 38
Assay design and in-silico validation
- qPCR assays for protein coding genes in human and mouse
- Transcript coverage
- SNPs and secondary structures
- Specificity prediction
Lab validation
- E in % range
- Stringent specificity analysis by NGS
qPCR based transcriptome profiling
- High sensitivity and dynamic range
- MAQC screening allows for cross-platform comparison