Measuring Rates of mtDNA Heteroplasmy Using a NextGen Sequencing Approach


1 Measuring Rates of mtDNA Heteroplasmy Using a NextGen Sequencing Approach. Mitchell M. Holland, Ph.D., Former Director, Forensic Science Program; Associate Professor, Biochem & MolBio, Penn State University, University Park, PA. NC State University, 15 Sep

2 Types of DNA in the Cell. Nucleus: nuclear DNA. Mitochondria: membrane-enclosed organelles distributed through the cytosol of most eukaryotic cells. Mitochondrial DNA: high copy number genome.

4 Identification of Nicholas Romanov: Tsar Nicholas II; family reference 5 generations removed.

5 Identification of Nicholas Romanov: Tsar Nicholas II and Georgij Romanov. LR = 150; LR = 375,000 when the heteroplasmy is considered.

6 Substitution Rate of mtDNA: We compared DNA sequences of the two control region (CR) hypervariable segments from close maternal relatives, from 134 independent mtDNA lineages spanning 327 generational events. Germline bottleneck.

7 DGGE to Identify Heteroplasmy: Family reference 5 generations removed. DGGE analysis was used to identify the heteroplasmic sequences, including those from the distant maternal relative.

8 Substitution Rate of mtDNA: Ten substitutions were observed, resulting in an empirical rate of 1/33 generations, or 2.5/site/Myr. This is roughly twenty-fold higher than estimates derived from phylogenetic studies; / /site/Myr. Using our empirical rate to calibrate the mtDNA molecular clock would result in an age for the mtDNA MRCA of only ~6,500 y.a., clearly incompatible with the known age of modern humans.
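
The arithmetic behind the 1/33 figure can be checked directly. The sketch below also converts the count into a per-site, per-million-year rate. The 610 analyzed sites and the 20-year generation time are assumptions that do not appear on the slide; they are used here only because they reproduce the quoted value of about 2.5/site/Myr. The function names are illustrative.

```python
def per_generation_rate(substitutions: int, generations: int) -> float:
    """Observed substitutions per generational event."""
    return substitutions / generations


def per_site_per_myr(substitutions: int, generations: int,
                     sites: int, years_per_generation: float) -> float:
    """Substitutions per site per million years."""
    total_site_years = generations * sites * years_per_generation
    return substitutions / total_site_years * 1_000_000


SUBSTITUTIONS = 10          # from the slide
GENERATIONS = 327           # from the slide
SITES = 610                 # ASSUMPTION: number of control region sites compared
YEARS_PER_GENERATION = 20   # ASSUMPTION: generation time in years

rate = per_generation_rate(SUBSTITUTIONS, GENERATIONS)
site_rate = per_site_per_myr(SUBSTITUTIONS, GENERATIONS,
                             SITES, YEARS_PER_GENERATION)

print(f"1 substitution per {1 / rate:.1f} generations")
print(f"{site_rate:.2f} substitutions/site/Myr")
```

Changing either assumed value changes the per-site rate proportionally, so the second figure should be read as a consistency check and not as a re-derivation of the published estimate.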

9 Genetic Bottlenecks & Empirical Mutation Rates: The germline mutation rate is 0.13 mutations/site/Myr (compared to phylogenetic rate estimates of 0.118). The number of mtDNA molecules transmitted to the next generation is (human germline bottleneck). Non-synonymous mutations showed signs of purifying selection. Using an NGS approach: Proceedings of the National Academy of Sciences (2014).

10 Here we are in 2015, and... Problem: Forensic labs still don't have a suitable method for detecting and reporting mtDNA heteroplasmy. Even high levels of heteroplasmy go unreported, lowering the discrimination potential of the typing system. Hypothesis: An NGS approach will allow for the routine detection and reporting of mtDNA heteroplasmy, including low-level variants. Differences in heteroplasmic profiles may allow for the differentiation of maternal relatives.

11 Our Initial Work (under Mitch Holland's Research Page): Croat Med J (2011), 52, pp. Using the 454 Life Sciences GS Junior instrument & chemistry.

12 Evaluated 30 individuals from 25 different mtDNA lineages (Table 3, Holland et al., CMJ 2011). Columns: Sample; Sanger mtDNA Profile; Percent of Minor Heteroplasmy & Site; 454 GS Junior mtDNA Profile; Percent of Minor Heteroplasmy & Site; Concordance. F2 F T, 16093C, 16126C, 16261T, 16274A, 16355T 16069T, 16126C, 16145A, 16172C, 16261T % C 16069T, 16093C, 16126C, 16261T, 16274A, 16355T 16069T, 16126C, 16145A, 16172C, 16261T % T % C % C F4 No polymorphisms No polymorphisms F5 F7, F12-13, M A, 16172C, 16223T, 16311C 16192T, 16256T, 16270T 16129A, 16172C, 16223T, 16311C 16192T, 16256T, 16270T % G % T % C F T, 16362C 16223T, 16362C % C F C 16356C F C 16298C % T F C, 16239T, 16294T, 16296T, 16304C 16126C, 16239T, 16294T, 16296T, 16304C F G 16343G F C 16093C F C, 16278T 16172C, 16278T M T 16355T M T 16111T % C 0.33% or 1/300

13 Sanger versus NGS Heteroplasmy Detection (Figure 2, Holland et al., CMJ 2011). Panels: Sanger and NGS. Examples shown: 3.71% C/T heteroplasmy; 1.29% T/C heteroplasmy; 20.14% C/T heteroplasmy.

14 Other Examples (Concordance): 64 mtGenomes; <0.02% differences from Sanger data; most differences in homopolymeric stretches; PGM capable of producing quality, reliable mtDNA sequence data.

15 Reproducibility: Is low-level heteroplasmy reproducible? Columns: Sample; Sanger mtDNA Profile; Percent of Minor Heteroplasmy & Site; 454 GS Junior mtDNA Profile; Percent of Minor Heteroplasmy & Site. M5 M A, 16129A, 16192T, 16213A, 16223T, 16278T, 16355T, 16362C 16129A, 16223T, 16264T 16114A, 16129A, 16192T, 16213A, 16223T, 16278T, 16355T, 16362C 16129A, 16223T, 16264T % C M C, 16311C 16224C, 16311C M9 16301T, 16343G, 16356C 16301T, 16343G, 16356C M C 16304C % C % T % T M A, 16223T 16129A, 16223T M T, 16126C 16069T, 16126C % T M C, 16224C, 16311C 16093C, 16224C, 16311C % T M C, 16294T, 16296T 16126C, 16294T, 16296T M T, 16304C, 16311C 16278T, 16304C, 16311C % T % C % G % T M19, F T, 16126C, 16222T 16069T, 16126C, 16222T

16 Reproducibility. Columns: Sample; Sanger mtDNA Profile; Percent of Minor Heteroplasmy & Site; 454 GS Junior mtDNA Profile; Percent of Minor Heteroplasmy & Site. M C 16304C % C % T % T M10 Replicate # % C % T % T M10 Replicate # % C % T % T

17 Rate of Heteroplasmy. Data set = 109 individual lineages (50 pairs of maternal relatives)

                  % Heteroplasmy   >1% Heteroplasmy   >10% Heteroplasmy
Coding Region     69%              50%                14%
Control Region    50%              26%                8.6%*

*Consistent with previous reports: for example, Irwin et al., J Mol Evol 2009

18 Things to Consider: If we agree that NGS should be employed in forensic cases, then we need to better understand: rates of heteroplasmy (per sample & per nucleotide); transmission and drift of heteroplasmic variants; where to set reporting thresholds; how DNA damage will impact thresholds; statistical approaches when reporting heteroplasmy.
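
One way to make the reporting-threshold question concrete is to compute the minor allele frequency (MAF) at a site from NGS base counts and report it only when both coverage and MAF clear chosen cutoffs. This is a minimal sketch and not the method used in the study: the function name, the 1% reporting threshold, the minimum coverage of 1,000 reads, and the base counts in the usage lines are all illustrative assumptions.

```python
from typing import Dict, Optional, Tuple


def call_heteroplasmy(base_counts: Dict[str, int],
                      min_maf: float = 0.01,       # ASSUMPTION: 1% reporting threshold
                      min_coverage: int = 1000     # ASSUMPTION: minimum read depth
                      ) -> Optional[Tuple[str, str, float]]:
    """Return (major base, minor base, MAF) if the site is reportable, else None."""
    depth = sum(base_counts.values())
    if depth < min_coverage or len(base_counts) < 2:
        return None
    # Rank bases by read count, most frequent first.
    ranked = sorted(base_counts.items(), key=lambda kv: kv[1], reverse=True)
    (major, _), (minor, minor_reads) = ranked[0], ranked[1]
    maf = minor_reads / depth
    if maf < min_maf:
        return None
    return major, minor, maf


# Hypothetical counts at one site: 371 of 10,000 reads carry the minor base.
reportable = call_heteroplasmy({"A": 0, "C": 9629, "G": 0, "T": 371})
below_threshold = call_heteroplasmy({"A": 0, "C": 9950, "G": 0, "T": 50})
low_coverage = call_heteroplasmy({"A": 0, "C": 96, "G": 0, "T": 4})

print(reportable)
print(below_threshold)
print(low_coverage)
```

Raising `min_maf` guards against reporting damage or sequencing error as heteroplasmy, at the cost of missing true low-level variants; that trade-off is the open question the slide raises.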

19 Rate Study, mtDNA Control Region: Buccal swabs from 550 unrelated individuals of European descent; three age groups (18-30, 31-50, >50 yoa); MiSeq/Nextera XT. Initial findings: haplotypes/heteroplasmy. Image: Quigley's Cartoons blog, June 5, swabbing-cheek. NIJ 2014-DN-BX-K022

20 Haplotypes: ~72%; ~63%. 265 samples analyzed thus far; 222 different haplotypes in the dataset (84%); 196/265 unique haplotypes (74%). Consistent with previous analyses, but with higher percentages due to the sequence range analyzed.

21 Shared Haplotypes: Most common haplotype = 16519C, 263G, 315.1C (3%), shared by 8/265 individuals.
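
The haplotype summaries on slides 20 and 21 (different haplotypes, unique haplotypes, most common haplotype) are simple counts over the per-sample haplotype list. A minimal sketch follows; the function name and the toy haplotype strings are hypothetical and are not study data.

```python
from collections import Counter
from typing import Dict, List, Union


def haplotype_summary(haplotypes: List[str]) -> Dict[str, Union[int, float, str]]:
    """Count different, unique (seen once), and most common haplotypes."""
    counts = Counter(haplotypes)
    n_samples = len(haplotypes)
    most_common, most_common_n = counts.most_common(1)[0]
    return {
        "samples": n_samples,
        "different": len(counts),
        "unique": sum(1 for c in counts.values() if c == 1),
        "most_common": most_common,
        "most_common_fraction": most_common_n / n_samples,
    }


# Toy data (hypothetical): one haplotype seen three times, three seen once.
toy = (["16519C 263G 315.1C"] * 3
       + ["16093C 263G", "16126C 16294T", "263G 315.1C"])
summary = haplotype_summary(toy)
print(summary)
```

Applied to the slide's own counts, the same ratios give 222/265 ≈ 84% different haplotypes, 196/265 ≈ 74% unique haplotypes, and 8/265 ≈ 3% for the most common haplotype.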

22 Haplogroups: I, M, N, V, W, X: 20; K: 23; H: 93; T: 25; J: 26; R: 27; U: 42; Native American (C): n=3; African (L): n=6.

23 Heteroplasmy: Observations of Heteroplasmy. 60% individuals; 13% individuals. Charts: observations of heteroplasmy by number of sites (no heteroplasmy; one to five sites) and cumulatively (at least one to at least five sites), split into 1-10% MAF and >10% MAF. NOTE: >10% means at least one site above this value.
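
The categories on this slide can be tallied from a per-individual list of minor allele frequencies (MAFs). The sketch below is illustrative only: the function name, the 1% lower reporting bound, and the toy MAF values are assumptions, not study data.

```python
from collections import Counter
from typing import Dict, List


def summarize_heteroplasmy(individual_mafs: List[List[float]],
                           low: float = 0.01,    # ASSUMPTION: lower reporting bound
                           high: float = 0.10    # the slide's >10% MAF category
                           ) -> Dict[str, object]:
    """Tally heteroplasmic sites per individual and the share above the high bound."""
    sites_per_person: Counter = Counter()
    with_any = 0
    with_high = 0
    for mafs in individual_mafs:
        reported = [m for m in mafs if m >= low]
        sites_per_person[len(reported)] += 1
        if reported:
            with_any += 1
        # Per the slide's note: >10% means at least one site above this value.
        if any(m > high for m in reported):
            with_high += 1
    n = len(individual_mafs)
    return {
        "sites_per_person": dict(sites_per_person),
        "fraction_with_any": with_any / n,
        "fraction_with_high": with_high / n,
    }


# Toy data (hypothetical): one list of site MAFs per individual.
toy_mafs = [[], [0.03], [0.12, 0.02], [0.005]]
tally = summarize_heteroplasmy(toy_mafs)
print(tally)
```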

24 Heteroplasmy v. Age: 45 samples with no heteroplasmy ( %, 24%, 26%), by age group (18-30, 31-50, >50 years of age), normalized for sample set size.

25 Heteroplasmy v. Age: 18-30: 26% of individuals; 31-50: 27% of individuals; >50: 43% of individuals. Chart: number of samples with one, two, three, four, or five or more heteroplasmic sites, by age group.

26 Heteroplasmy v. Site: 65% in HV1, 21% in HV2, 14% outside HV1/HV2. Hot spots; cold spots.

27 Likelihood Ratio: $LR = \dfrac{p(E_1 \mid R) \times p(E_2 \mid R)}{p(E_1 \mid R') \times p(E_2 \mid R')}$, where $p(E_1 \mid R)$ = the probability of the evidence (match between Georgij and Nicholas) given the hypothesis that the remains are those of Nicholas Romanov; $E_2$ = the co-occurrence of heteroplasmy; $R'$ = the hypothesis that the remains are unrelated. Increase discrimination potential: LR = 375,000.
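
To make the structure of the likelihood ratio concrete, the sketch below multiplies the ratio for the haplotype match by the ratio for the shared heteroplasmy. Only the two overall values, 150 and 375,000, come from the slides; the factor of 2,500 is obtained from them by division. The function name, the variable names, and the individual probabilities in the usage lines are hypothetical, chosen only so that the two ratios reproduce the quoted values.

```python
def likelihood_ratio(p_e1_r: float, p_e2_r: float,
                     p_e1_not_r: float, p_e2_not_r: float) -> float:
    """LR = [p(E1|R) * p(E2|R)] / [p(E1|R') * p(E2|R')]."""
    return (p_e1_r * p_e2_r) / (p_e1_not_r * p_e2_not_r)


LR_MATCH_ONLY = 150             # from the slides: haplotype match alone
LR_WITH_HETEROPLASMY = 375_000  # from the slides: heteroplasmy considered

# Contribution of the heteroplasmy term implied by the two quoted values.
implied_heteroplasmy_ratio = LR_WITH_HETEROPLASMY / LR_MATCH_ONLY
print(implied_heteroplasmy_ratio)

# HYPOTHETICAL probabilities that reproduce the two ratios above.
combined = likelihood_ratio(
    p_e1_r=1.0, p_e2_r=1.0,
    p_e1_not_r=1 / 150, p_e2_not_r=1 / 2500,
)
print(round(combined))
```

Treating the two pieces of evidence as a simple product assumes they are independent under each hypothesis, which is how the slide's formula is written.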

28 Differentiate Between Maternal Relatives: samples #1098 and #1100. Primary Haplotype A263G Heteroplasmy Positions 200 A/G (3.0%) Primary Haplotype A263G Heteroplasmy Positions T16093C 16093T/C (12.6%) C16261T C16291C T16311C T16362C T16519C T16093C 16093C/T (3.4%) C16261T C16291C T16311C T16362C T16519C

29 Issues Still to Address (Forensic Context): reporting mechanism for heteroplasmy; weight of a heteroplasmic match; impact of maternal transmission of heteroplasmic variants; impact of drift in heteroplasmic variants at the tissue level.

30 Thanks!! Illumina Cydne Holt, Kathy Stephens, Joe Valaro, Carey Davis, Dan Gheba, etc SoftGenetics NextGENe John Fosnacht, Teresa Snyder-Leiby, etc Penn State Kateryna Makova, Anton Nekratenko Mitotyping Technologies Bob Bever, et al Battelle Memorial Institute National Institute of Justice (NIJ 2014-DN-BX-K022) Eberly College of Science, Forensic Science Program

31 Walther Parson & Ann Gross. Jen McElhoe, Research Associate (NIJ). Master's Students: Molly Rathbun (damage), Laura Wilson (D-loop val), Elena Zavala (bone extr), Jamie Gallimore (drift). UG Students: Alyssa Duffy, Jillian Baker, Erica Pack. Current Research Group.

32 Thanks for your hospitality!!