Whole Genome Amplification (WGA): What to Do When You Don't Have Enough Genomic DNA

1 Whole Genome Amplification (WGA): What to Do When You Don't Have Enough Genomic DNA
Rob Brazas, Ph.D.
Senior Product Manager, Lucigen
January,

2 Agenda: Improving Whole Genome Amplified DNA Quality
- What is whole genome amplification?
- Experimental challenges of inaccurate or incomplete WGA
- PCR- and MDA-based methods of WGA and their strengths and weaknesses
- Variation on standard MDA: Sygnis TruePrime WGA Kit technology
  o Overview of TruePrime WGA Kits
  o Comparison of sequencing results of WGA gDNA from the TruePrime Single Cell WGA Kit and other WGA kits/methods
- Summary
- Final thoughts

3 Whole Genome Amplification: Production of µg of gDNA from ng or Less
[Figure: perfect whole genome amplification of homologous chromosomes and their alleles]
From what and why?
- gDNA amplification from:
  o Single cells
  o Limiting amounts of purified gDNA samples (biopsies, metagenomic samples, etc.)
- Produce enough material for your experiments and archiving

4 Whole Genome Amplification: Real-Life Amplification is Not Perfect
[Figure: homologous chromosomes and alleles after perfect versus real-life whole genome amplification]

5 Whole Genome Amplification: Multiple Types of Amplification Errors Occur
Perfect WGA
- Uniform amplification of entire chromosomes
- Accurate representation of each set of alleles
  o AB, AA, BB
- High-fidelity amplification with no errors (no SNV or INDEL creation)
- Amplified gDNA = starting gDNA, except there's more of it
Real-Life WGA
- Uneven amplification across chromosomes (missing areas, uneven amplification)
- Loss of heterozygosity
  o AB → AA or AB → BB
- Error introduction: false SNVs
- Introduction of contaminants
- Creation of chimeras
- Amplified gDNA ≠ starting gDNA

6 Inaccurate Whole Genome Amplification Creates Multiple Experimental Challenges
- Incomplete whole genome sequencing results
- Difficulty assembling whole genomes due to contaminating sequences
- Changes in species abundance (representation) within a population sample
- Inaccuracy or difficulty in identifying:
  o SNV (Single Nucleotide Variants)
  o CNV (Copy Number Variation)
  o Structural variation

7 PCR-based WGA Methods Based on Various Primer Designs
From: Blainey 2013 FEMS Microb. Rev
LA-PCR (Linker-Adaptor or Ligation-Anchored PCR)
1. Fragment
2. Ligate adaptors with embedded PCR primer sites
3. Amplify by PCR
IRS-PCR (Interspersed Repetitive Sequence PCR)
1. Primers designed to known repetitive elements
2. Amplify by PCR
PEP-PCR (Primer Extension Preamplification PCR)
1. Random 15-mer PCR primers
2. Amplify by PCR under permissive priming conditions

8 PCR-based WGA Methods Based on Various Primer Designs
From: Blainey 2013 FEMS Microb. Rev
DOP-PCR (Degenerate Oligonucleotide Primed PCR)
1. Primers with degenerate 3′ ends (~6 bp constant at 3′ end) and constant 5′ ends
2. Primer extension at random sites, low-temperature annealing
3. PCR amplification at higher temperatures
D-DOP-PCR (Displacement Degenerate Oligonucleotide Primed PCR)
1. Primers with degenerate 3′ ends and constant 5′ ends
2. Primer extension at random sites, low-temperature annealing, strand displacement of newly synthesized strands by others
3. PCR amplification at higher temperatures with added 5′-end-specific primers

9 Multiple Displacement Amplification WGA Methods Based on DNA Pols with Strand Displacement Activity
From: Blainey 2013 FEMS Microb. Rev
MDA (Multiple Displacement Amplification)
1. Random hexamer primers
2. Extended by a DNA polymerase with strong strand displacement activity
3. Isothermal reaction temperature
Variant of MDA available: pWGA, based on a reconstituted T7 replication system (Li, Y. et al. Nuc. Acids Res. E79 (2008))

10 Multiple Displacement Amplification WGA Methods Based on DNA Pols with Strand Displacement Activity
From: Blainey 2013 FEMS Microb. Rev
SPIA (Single Primer Isothermal Amplification)
1. Primers with specific RNA sequence fused to partially degenerate DNA primer sequence
2. Primer extension at set temperature
3. Degradation of RNA portion of primer with RNase H
4. Reinitiation with new RNA/DNA primer and strand displacement extension

11 Multiple Displacement Amplification WGA Methods Based on DNA Pols with Strand Displacement Activity
From: Blainey 2013 FEMS Microb. Rev
MALBAC (Multiple Annealing and Looping-based Amplification Cycles)
1. Primers with degenerate 3′ ends and constant 5′ ends
2. Primer extension (quasi-linear amplification) at random sites with thermocycling
3. Products with primer sequences at both ends loop due to sequences within the DNA portion of the primers
4. Conventional PCR amplification

12 Strengths and Weaknesses (Perceived and Real) of PCR and MDA WGA Systems
Characteristic | MDA-based | PCR-based
Amplified Fragment Lengths | kb | ~1-2 kb
Nucleotide Error Rate | |
Chimera Formation | Higher (?) | Lower (?)
Completeness of Genome Coverage | High | 10-70%
Variability of Amplification | High | Low
CNV Detection | Poor | Good
Duplicate Formation | Lower | Higher
Allelic Dropout (ADO) (AB → AA or BB) | 5-50% (?) | ?
SNV Detection | OK | +/-
Protocol | Simple | Often multi-step

13 Focus On MDA Due to Completeness of Genome Coverage
Kits/Methods Used
- REPLI-g Single-Cell Kit (Qiagen)
  o Commercially available MDA kit utilizing random hexamers as primers
- TruePrime WGA Kits (Sygnis; Single Cell and general purified gDNA kits)
  o Primase enzyme synthesizes primers in place of random primers
- Generic MDA WGA Kit
  o TruePrime components with random hexamers substituted for the primase enzyme
- MALBAC Single Cell WGA Kit (Yikon Genomics)
Majority of data shown are published in Picher, AJ et al. Nat. Comm. 7:13296 (2016) or were provided to Lucigen by Sygnis

14 Sygnis TruePrime Kit Methodology: Primase Enzyme Synthesizes Initial Primers
1. TthPrimPol (primase) binds denatured DNA at random sites
2. Primase synthesizes short DNA primers
3. Phi29 DNA pol displaces primase and begins polymerization
4. Phi29 DNA pol performs strand displacement
5. Primase binds to newly formed DNA and synthesizes new DNA primers
6. Phi29 DNA pol displaces primase, binds DNA primers and begins polymerization

15 Protocols for Sygnis TruePrime Kits: Simple Isothermal Amplification Reactions
[Figure: TruePrime Single Cell WGA Kit v2 protocol and TruePrime WGA Kit protocol]

16 Size of Products Amplified from a Single HEK293 Cell: MALBAC Produced Small Fragments
[Figure: 0.8% agarose gel and TapeStation plots. Fragment size labels: MALBAC kb; TruePrime kb; REPLI-g 9-19 kb]
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

17 Yield of Amplified DNA with Primase vs. Random Primers (RPs): 100X Greater Sensitivity with TruePrime Kit (Primase)
[Figure panels: Generic WGA (TruePrime with random primers); TruePrime WGA Kit]
- Human gDNA (Promega) input at indicated amounts
- TruePrime WGA Kit and protocol (or with random primers substituted for primase)
- 3 hr incubation at 30 °C
- DNA quantitation using Quant-iT PicoGreen dsDNA Assay Kit (ThermoFisher)
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

18 Decreased Creation/Amplification of Random Primer Artefacts with TruePrime WGA Kit
[Figure panels: TruePrime WGA Kit; Generic WGA (TruePrime with random primers)]
- 1 pg human gDNA (Promega)
- TruePrime WGA Kit and protocol (or with random primers substituted for primase)
- Subjected to next-gen sequencing and mapped back to known genomes

19 At 1 fg of Input, 95% of TruePrime WGA Kit Amplified gDNA is Target Derived
- Varied inputs of human gDNA (Promega)
- Amplified with TruePrime WGA Kit for 6 hr
- Subjected to next-gen sequencing and mapped back to known genomes
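The "target derived" percentage on this slide is a fraction of mapped reads. A minimal sketch of that arithmetic, assuming each read has already been labeled with the genome it mapped to; the function name, labels, and read counts are hypothetical and are not data from the slides:

```python
from collections import Counter


def target_fraction(read_assignments, target="human"):
    """Return the fraction of reads assigned to the target genome."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return counts[target] / total


# Hypothetical read labels (illustrative only, not data from the slides).
reads = ["human"] * 95 + ["unmapped"] * 3 + ["yeast"] * 2
print(f"{target_fraction(reads):.0%} of reads are target derived")
```

In practice the labels would come from aligning reads against the target and likely contaminant genomes; that step is outside this sketch.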

20 WGA with Random Primers is More Sensitive to Contaminating DNA than TruePrime WGA Kit
[Figure panels: TruePrime WGA Kit; Generic WGA (TruePrime with random primers)]
- 1 pg denatured human gDNA (Promega) + 1 ng non-denatured yeast gDNA
- TruePrime WGA Kit and protocol (or with random primers substituted for primase)
- Subjected to next-gen sequencing and mapped back to known genomes

21 Sequencing Analysis: WGA Followed by Illumina Sequencing
Single HEK293 cells were amplified by WGA using various kits/methods:
- TruePrime Single Cell WGA Kit
- Generic MDA WGA Kit (TruePrime Kit with random primers in place of primase)
- REPLI-g Single Cell WGA Kit (Qiagen)
- MALBAC Single Cell WGA Kit (Yikon Genomics)
Libraries were made and sequenced by:
- Shearing using a Covaris Focused-Ultrasonicator
- Constructing libraries using the NEBNext DNA Library Prep Kit (NEB), which includes PCR
- Deep sequencing on a HiSeq 2500, 2 x 125 bp, v4 chemistry
- Sampling and analysis of a specific number of reads based on experimental goals
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

22 Genomic Coverage: TruePrime Kit Coverage is Closest to Non-Amplified
[Figure: genome-wide coverage by chromosome (through X and Y) for Non-amplified, TruePrime SC WGA Kit, REPLI-g SC WGA Kit, MALBAC SC WGA Kit and Generic MDA WGA Kit]
- Analyzed 12 million read pairs per sample
- 50 kb bin size
- Averaged coverage at each 50 kb interval
Picher, AJ et al. Nat. Comm. 7:13296 (2016)
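Averaging coverage over fixed-size intervals, as described on this slide, can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used for the figure: `depths` is assumed to be a per-base depth list for one chromosome, and the function name is hypothetical.

```python
def binned_mean_coverage(depths, bin_size=50_000):
    """Average per-base depth over consecutive bins of bin_size bases."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    means = []
    for start in range(0, len(depths), bin_size):
        window = depths[start:start + bin_size]
        # The last bin may be shorter than bin_size; average what is there.
        means.append(sum(window) / len(window))
    return means


# Toy example with a 4-base bin so the arithmetic is easy to follow.
toy_depths = [10, 10, 12, 8, 0, 0, 2, 2, 30]
print(binned_mean_coverage(toy_depths, bin_size=4))
```

Plotting one value per bin rather than per base is what makes whole-chromosome profiles from different kits comparable at a glance.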

23 Genomic Coverage of Chromosome 3: TruePrime Kit Coverage is Closest to Non-Amplified
[Figure: coverage (0-50X) versus chromosome 3 position for Non-amplified, TruePrime SC WGA Kit, REPLI-g SC WGA Kit, MALBAC SC WGA Kit and Generic MDA WGA Kit]
- Analyzed 12 million read pairs per sample
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

24 Zoomed Genomic Coverage of Chromosome 3: TruePrime Continues to Match Non-Amplified
[Figure: coverage (0-50X) over a ~25M bp window for Non-amplified, TruePrime SC WGA Kit, REPLI-g SC WGA Kit, MALBAC SC WGA Kit and Generic MDA WGA Kit]
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

25 Zoomed Genomic Coverage of Chromosome 3: TruePrime Continues to Match Non-Amplified
[Figure: coverage (0-50X) over a ~500K bp window for Non-amplified, TruePrime SC WGA Kit, REPLI-g SC WGA Kit, MALBAC SC WGA Kit and Generic MDA WGA Kit]
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

26 Average Breadth (%) of Coverage: Similar TruePrime Coverage to Non-Amplified (NA) Except Chr19 and Chr22
[Figure: breadth of coverage (%) by chromosome for Non-amplified, TruePrime SC WGA Kit, REPLI-g SC WGA Kit and MALBAC SC WGA Kit]
- Analyzed 12 million read pairs per sample
- Drop in TruePrime coverage for chr19 and chr22
  o Seen with other MDA approaches as well
Picher, AJ et al. Nat. Comm. 7:13296 (2016)
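Breadth of coverage is the percentage of positions covered by at least one read. A minimal sketch, assuming per-base depth lists keyed by chromosome; the function name and the toy depths are hypothetical and chosen only to make the arithmetic visible:

```python
def breadth_of_coverage(depths, min_depth=1):
    """Percent of positions covered by at least min_depth reads."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Hypothetical per-base depths (illustrative only, not data from the slides).
toy = {
    "chr1": [3, 5, 0, 2, 1, 4, 0, 6, 2, 1],
    "chr19": [0, 0, 1, 0, 2, 0, 0, 3, 0, 0],
}
for chrom, depths in toy.items():
    print(chrom, breadth_of_coverage(depths))
```

Raising `min_depth` gives a stricter breadth figure, which is one reason breadth values are only comparable when the depth threshold and read count are held constant across samples.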

27 TruePrime Kit Coverage is Highly Reproducible and Parallels Non-amplified Well
[Figure: genome-wide coverage by chromosome (through X and Y) for Non-amplified and TruePrime SC WGA Kit (Replicates 1-4)]
- Analyzed 5 million read pairs per sample
Picher, AJ et al. Nat. Comm. 7:13296 (2016)
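One way to put a number on how closely a replicate's coverage profile parallels the non-amplified profile is a Pearson correlation over binned coverage. The slides do not state that this metric was used; the sketch below is an assumption for illustration, with hypothetical function and variable names and made-up values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length coverage profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical binned coverage (illustrative only, not data from the slides).
non_amplified = [10, 12, 11, 9, 10]
replicate = [20, 24, 22, 18, 20]  # same shape at twice the depth
print(round(pearson_r(non_amplified, replicate), 6))
```

Because correlation ignores overall scale, a replicate sequenced deeper but with the same shape still scores near 1, which matches the notion of "paralleling" the non-amplified sample.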

28 TruePrime Kit Chr4 Coverage is Highly Reproducible and Parallels Non-amplified
[Figure: coverage (0-20X) versus chromosome 4 position for Non-amplified and TruePrime SC WGA Kit Replicates 1-4]
- Analyzed 5 million read pairs per sample
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

29 Even Zoomed-in, TruePrime Chr6 Coverage is Highly Reproducible and Parallels Non-amplified
[Figure: coverage (0-30X and 0-20X panels) versus chromosome 6 position for Non-amplified and TruePrime SC WGA Kit Replicates 1-4]
- Analyzed 5 million read pairs per sample
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

30 One Last Look at Coverage Using Deep Sequencing
[Figure: Non-amplified, TruePrime SC WGA Kit, REPLI-g SC WGA Kit and MALBAC SC WGA Kit]
- No significant differences in errors
Picher, AJ et al. Nat. Comm. 7:13296 (2016)

31 Making CNV Calls with WGA Amplified Material: CNV Calling is Possible with TruePrime Results
[Figure: copy number profiles for Non-amplified, TruePrime SC WGA Kit, REPLI-g SC WGA Kit, MALBAC SC WGA Kit and Generic MDA WGA Kit]
- Used deep sequencing results
- Calculated number of reads per 500 kb bin
- Deduced ploidy level using Ginkgo analysis software and reads per bin
  o More reads = greater copy number, and vice versa
- Highly varied coverage from REPLI-g, MALBAC and Generic MDA makes CNV calling difficult
Picher, AJ et al. Nat. Comm. 7:13296 (2016)
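The "more reads = greater copy number" idea can be sketched by scaling each bin's read count against the median bin. This is a deliberately simplified stand-in and not the analysis software's actual method; the function name, the baseline ploidy of 2, and the toy counts are all assumptions for illustration.

```python
from statistics import median


def copy_number_from_bins(reads_per_bin, baseline_ploidy=2):
    """Estimate integer copy number per bin from read counts.

    Assumes the median bin sits at baseline_ploidy copies.
    """
    med = median(reads_per_bin)
    if med <= 0:
        raise ValueError("median bin count must be positive")
    return [round(baseline_ploidy * n / med) for n in reads_per_bin]


# Hypothetical reads per 500 kb bin (illustrative only, not slide data).
toy_counts = [100, 104, 98, 151, 149, 52, 101]
print(copy_number_from_bins(toy_counts))
```

The sketch also shows why uneven amplification hurts CNV calling: any bin that is over- or under-amplified by the WGA step shifts its read count and is indistinguishable here from a true gain or loss.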

32 Making SNV Calls with WGA Amplified Material: Better SNV Overlap of TruePrime Samples with Non-Amplified (NA)
- Used deep sequencing results
- Used 4 different SNV callers to identify/analyze SNVs and Het>Hom conversion
  o ISAAC, Samtools/Bcftools, Varscan2, CLC Low Frequency Caller
- Median results are shown
- Numbers varied considerably depending on the analysis program used
Sample | SNV # | Median # Overlapping NA | Median % Overlapping NA | Het>Hom SNV Conversion (ADO)
Non-amplified | 3.02M | 3.02M | 100% | 0%
TruePrime™ SC WGA Kit | 2.72M | 2.42M | 80% | 5.95%
REPLI-g SC WGA Kit | 1.65M | 1.37M | 45% | 29.65%
MALBAC™ SC WGA Kit | 2.55M | 0.82M | 30% | 31.05%
Better SNV overlap and lower Het>Hom conversion rates with TruePrime amplified samples when compared to non-amplified sample results.
Picher, AJ et al. Nat. Comm. 7:13296 (2016)
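The two metrics on this slide, overlap with non-amplified SNV calls and Het>Hom conversion, can be sketched as set comparisons. The definitions below are assumptions for illustration: overlap is taken relative to all non-amplified calls, and Het>Hom conversion is taken over sites that are heterozygous in the non-amplified sample and also called in the amplified sample. The function name and toy calls are hypothetical.

```python
def snv_overlap_and_ado(reference_calls, amplified_calls):
    """Compare SNV calls keyed by (chrom, pos); values are 'het' or 'hom'.

    Returns (percent of reference SNVs also called in the amplified sample,
             percent of shared reference-het sites called hom after WGA).
    """
    if not reference_calls:
        raise ValueError("reference_calls must not be empty")
    shared = reference_calls.keys() & amplified_calls.keys()
    overlap_pct = 100.0 * len(shared) / len(reference_calls)
    ref_het = [site for site in shared if reference_calls[site] == "het"]
    converted = [site for site in ref_het if amplified_calls[site] == "hom"]
    het_to_hom_pct = 100.0 * len(converted) / len(ref_het) if ref_het else 0.0
    return overlap_pct, het_to_hom_pct


# Hypothetical calls (illustrative only, not data from the slides).
non_amplified_calls = {
    ("chr3", 100): "het", ("chr3", 200): "het", ("chr3", 300): "hom",
    ("chr4", 50): "het", ("chr4", 90): "hom",
}
wga_calls = {
    ("chr3", 100): "het", ("chr3", 200): "hom", ("chr3", 300): "hom",
    ("chr4", 50): "het", ("chr5", 10): "het",
}
overlap, het_to_hom = snv_overlap_and_ado(non_amplified_calls, wga_calls)
print(f"overlap {overlap:.1f}%, Het>Hom {het_to_hom:.1f}%")
```

In the toy data, the site at chr5:10 appears only after amplification; a call like that counts toward the amplified sample's SNV total without contributing to overlap, which is how a sample can report many SNVs yet a low overlap percentage.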

33 Summary: TruePrime WGA Kits
More Uniform Amplification Improves NGS Results
- High sequencing breadth of coverage, nearly equal to non-amplified samples
  o 91.27% at ~19X depth vs % at ~19X depth for non-amplified
  o Significantly better than REPLI-g and MALBAC results (85.6% and 58.9%, respectively)
- More uniform sequencing depth that parallels non-amplified sequencing the best
- High-quality SNV calling is possible with TruePrime amplified samples
  o 80.6% overlap with SNVs called in the non-amplified samples
  o Decreased heterozygous-to-homozygous SNV conversions with TruePrime amplified samples

34 Sygnis TruePrime WGA Kits are Available from Lucigen in the U.S.*
Products: Sygnis TruePrime™ WGA Kits; Sygnis TruePrime™ Single Cell WGA Kit version 2.0
Columns: Lucigen Cat. No., Size (rxn), U.S. List Price
Entries as transcribed: SYG $248; SYG $675; SYG $560; SYG $1890
Visit the TruePrime WGA Kits webpages
TruePrime WGA Kit:
TruePrime Single Cell WGA Kit:
*Lucigen is a Sygnis distributor for the United States. For those outside the U.S., please contact Lucigen Customer Service (custserv@lucigen.com) and we will connect you with Sygnis/Expedeon

35 One Last Thought: PCR Amplified NGS Libraries Were Used
- PCR introduces its own bias within the library
- Could using PCR-free library prep improve the results even more?
- WGA produces enough material for PCR-free library prep
- The Lucigen NxSeq AmpFREE Low DNA Library Kit requires only 75 ng sheared gDNA and produces the most efficient PCR-free libraries
Learn more about the NxSeq AmpFREE Low DNA Library Kit

36 Questions?
Lucigen Tech Support: 1 (608) am 5 pm central time
Contact me: Rob Brazas, Ph.D., Sr. Product Manager, rbrazas@lucigen.com
Thank You and Our Friends at Sygnis!