MAT 394 (ASU) Human Genetic Variation Spring 2012

Mutations during meiosis and germ line division lead to genetic variation between individuals
Types of mutations:
- point mutations
- indels (insertion/deletion)
- copy number variation
- structural rearrangements
Li et al. (Science, 2008)

Point mutations change individual nucleotides
- Mutations occur through replication errors by DNA polymerase and through damage to DNA caused by radiation and chemical mutagens.
- Most mutations are corrected by repair proteins.
- Point mutation rates are approximately 1 × 10^-8 per site per generation.
- Transitions occur twice as frequently as transversions.
Carr (2010)

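To make the transition/transversion distinction concrete, here is a minimal sketch in Python. A transition swaps two purines (A, G) or two pyrimidines (C, T); a transversion swaps a purine for a pyrimidine. Each base has one possible transition and two possible transversions, so a 2:1 excess of transitions reflects a real mutational bias. The function name `classify_substitution` is illustrative and does not come from the slides.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a point mutation as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` is a transition, while `classify_substitution("A", "C")` is a transversion.
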
Nucleotide diversity is low in the human species
- Nucleotide diversity (π) is defined as the probability that two individuals sampled at random will differ at a given nucleotide.
- π differs across loci and populations.
- Global, genome-wide nucleotide diversity is π ≈ 1 × 10^-3, i.e., one difference for every 1000 bp.
- Diversity is lower on the sex chromosomes.
- Diversity is also lower in coding regions due to purifying selection.
Durbin et al. (2010)

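The definition of π can be turned into a small estimator: average, over all pairs of sampled sequences, the proportion of sites at which the pair differs. The sketch below assumes the sequences are already aligned and of equal length, and it treats a gap character as just another symbol, which is a simplification. The function name is illustrative.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)
```

With the toy sample `["ACGT", "ACGA", "ACGT"]` there are three pairs and two single-site differences in total, so the estimate is 2 / (3 × 4) = 1/6.
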
Single nucleotide polymorphisms (SNPs) account for 90% of human genetic variation
- Most variants are extremely rare.
- SNPs have a minor allele frequency ≥ 1%.
- Occur every 100-300 bases.
- Less common in coding regions and on sex chromosomes.
- Usually just two nucleotides segregating.
Figure: partial alignment of human mtDNA D-loop sequences (CLUSTAL 2.1 multiple sequence alignment, 12 sequences, alignment positions approximately 840-870). An asterisk marks a column that is identical in all sequences.

    AATTTCCACCAAACCCCCCCCCTCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCC--TCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCCC-TCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCCCCTCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCCCCTCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCC--TCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCC--TCCCCCCACTTCTGGCCA
    AATTTCCACCAAACCCCCCC--TCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCCC-TCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCC--TCCCCCCGCTTCTGGCCA
    AATTTTCACCAAACCCCCCCC-TCCCCCCGCTTCTGGCCA
    AATTTCCACCAAACCCCCCC--TCCCCCCGCTTCTGGCCA
    ***** **************  ******* **********

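As a rough illustration of how the minor allele frequency criterion could be applied to an alignment, the sketch below scans each column, ignores gaps, and reports columns whose second most common base reaches a chosen frequency threshold. The function names, the 0-based column indexing, and the toy data are assumptions made for this example; real SNP calling also has to handle sequencing error and sample size.

```python
from collections import Counter


def minor_allele_frequency(column):
    """Frequency of the second most common base in one alignment column."""
    counts = Counter(base for base in column if base in "ACGT")  # skip gaps
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # monomorphic (or empty) column
    return counts.most_common()[1][1] / total


def find_snps(seqs, min_maf=0.01):
    """Return (0-based column index, MAF) for columns with MAF >= min_maf."""
    snps = []
    for index, column in enumerate(zip(*seqs)):
        maf = minor_allele_frequency(column)
        if maf >= min_maf:
            snps.append((index, maf))
    return snps
```

For the toy alignment `["AATTC", "AATTC", "AATTT", "AA-TC"]`, only the last column is polymorphic once the gap is ignored, with a minor allele frequency of 0.25.
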
Adjacent SNPs tend to be inherited together as haplotype blocks
- Haplotype blocks arise because of genetic linkage.
- Block sizes range from 1 to 200 kb.
- Adjacent SNPs tend to carry redundant information about genetic identity.
- Small numbers of tag SNPs can be used to define haplotype blocks.
- Limited use in forensic genetics at present.
Skelding et al. (2007)

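One way to quantify how redundant two linked SNPs are is the squared correlation r² between their alleles. The slides do not name a specific statistic, so r² is an assumption here, as are the function name and the 0/1 allele coding. The sketch takes phased haplotypes, each a pair of alleles at the two SNPs.

```python
def r_squared(haplotypes):
    """Squared allelic correlation between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom
```

A value near 1 means one SNP predicts the other almost perfectly, which is the situation in which a single tag SNP can stand in for its neighbours; a value near 0 means the two sites carry independent information.
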
Human mtDNA variation
- Because there are tens to hundreds of mitochondria per cell, mtDNA can often be recovered even from degraded samples.
- Indicative only of maternal relationships.
- Analyses usually focus on the two hypervariable regions HV1 and HV2 (610 bp).
- Mutation rate in these regions is approximately 6 × 10^-6 per site per generation.
- Analysis is complicated by heteroplasmy and shared haplotypes.
Blanco et al. (2011)

Tandemly repeated DNAs exhibit high levels of copy number variation
- Satellite DNA is found in the centromeres and telomeres.
- Minisatellite repeats are 10-400 bp in length.
- Microsatellite repeats are 2-7 bp in length.
- The number of repeats can vary greatly between individuals.
- Sometimes the repeats themselves vary by point mutations or indels.
- VNTR = Variable Number Tandem Repeats

Replication slippage leads to changes in copy number
- Replication slippage occurs when the parent and daughter strands partially separate during replication and then incorrectly re-anneal.
- Slippage usually leads to a gain or a loss of a single repeat, although larger changes sometimes occur.
- Mutation rates can be on the order of 1 event per 1000 generations.

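The behaviour described above can be sketched as a simple stepwise simulation: in each generation a slippage event occurs with a small probability, and each event adds or removes one repeat. This is a simplification under stated assumptions: gains and losses are taken to be equally likely, multi-repeat changes are ignored, and the default rate of 1e-3 per generation is just the order of magnitude quoted above. The function name and parameters are illustrative.

```python
import random


def simulate_slippage(start_repeats, generations, rate=1e-3, rng=None):
    """Follow one STR lineage; each slippage event changes the count by +/-1."""
    rng = rng or random.Random()
    repeats = start_repeats
    for _ in range(generations):
        if rng.random() < rate:              # a slippage event this generation
            step = rng.choice((-1, 1))       # assumed: gain and loss equally likely
            repeats = max(1, repeats + step) # keep at least one repeat
    return repeats
```

Calling `simulate_slippage(10, 5000, rng=random.Random(1))` follows a single lineage for 5000 generations; repeating the call with different seeds gives a feel for how repeat numbers drift apart between lineages.
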
Unequal crossovers can also cause changes in copy number
- Improper alignment of tandem repeats during meiotic recombination results in unequal crossovers.
- These events can lead to large changes in copy number.

Short Tandem Repeats (STRs) are the most commonly used markers in forensic work
- Repeat units are 2-7 bp (di-, tri-, tetra-, penta-, hexa- and heptanucleotide repeats).
- Loci are typically 100-400 bp (tens to hundreds of copies).
- Small size facilitates amplification and size resolution.
- Many thousands of STRs have been identified in the human genome.

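To illustrate what "number of repeats" means at the sequence level, the sketch below finds the longest uninterrupted run of a repeat unit in a DNA string. It is only a toy: the function name and the example sequence are made up, and real STR alleles can contain interrupted or compound repeats that this does not handle.

```python
import re


def longest_repeat_run(sequence, motif):
    """Largest number of back-to-back copies of motif found in sequence."""
    motif = motif.upper()
    # Match every maximal run of one or more consecutive motif copies
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)
```

For example, `longest_repeat_run("CCGATAGATAGATATT", "GATA")` finds three consecutive GATA units between the flanking bases.
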
STRs used in forensic work should have several properties
- Markers need to be highly polymorphic to distinguish between individuals.
- Polymorphism can be measured by the heterozygosity at the locus, which is the probability that two chromosomes sampled at random carry different alleles.
- The repeat units should have limited internal variation and be flanked by invariant regions.
- Different loci should be unlinked (preferably on different chromosomes) and not under selection.
- 13 core STR loci were selected by the FBI for inclusion in CODIS. Eight of these are also included in the UK's NDNAD.

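Given allele frequencies p_1, ..., p_k at a locus, the probability that two independently sampled chromosomes carry the same allele is the sum of the p_i squared, so the heterozygosity defined above is H = 1 - (p_1^2 + ... + p_k^2). A minimal sketch follows; the function name is illustrative, and the frequencies in the examples are made up rather than taken from any database.

```python
def expected_heterozygosity(freqs):
    """Probability that two randomly sampled chromosomes carry different alleles."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")
    return 1.0 - sum(p * p for p in freqs)
```

Two equally common alleles give H = 0.5 and four equally common alleles give H = 0.75, which shows why loci with many alleles at intermediate frequencies are preferred.
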
Statistics for the core CODIS STR loci (Butler et al., 2003)

    Name      Chrm.  Repeat      Allele range  Heterozygosity  Heterozygosity
                                               (Cauc.)         (Afr. Am.)
    TPOX      2      GAAT        4-16          0.656           0.764
    D3S1358   3      TCTG/TCTA   8-21          0.765           0.764
    FGA       4      CTTT/TTCC   12.2-51.2     0.887           0.884
    D5S818    5      AGAT        7-18          0.709           0.733
    CSF1PO    5      TAGA        5-16          0.725           0.759
    D7S820    7      GATA        5-16          0.818           0.764
    D8S1179   8      TCTA/TCTG   7-20          0.778           0.764
    TH01      11     TCAT        3-14          0.719           0.760
    VWA       12     TCTG/TCTA   10-25         0.841           0.802
    D13S317   13     TATC        5-16          0.745           0.690
    D16S539   16     GATA        5-16          0.735           0.783
    D18S51    18     AGAA        7-40          0.881           0.860
    D21S11    21     TCTA/TCTG   12-41.2       0.841           0.830

Steps in the production of an STR profile
1. Extraction of the DNA.
2. Amplification of the DNA markers by polymerase chain reaction (PCR).
3. Separation of PCR products by length.
4. Detection of separated products using stains or fluorescent dyes.

Polymerase Chain Reaction (PCR)
- Requires primers (18-22 bp) complementary to both ends of the region to be amplified.
- Amplification may occur even with a few mismatches, but primers should target conserved sequences.
- Taq polymerase is stable at high temperatures.
- Successive rounds of denaturation, annealing and extension lead to exponential amplification of DNA.

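The exponential amplification can be made concrete with a small calculation. If every template were copied in every cycle, the number of molecules would double each cycle; a common idealisation, assumed here and not stated in the slides, is that a fraction `efficiency` of templates is copied per cycle, giving N = N0 × (1 + efficiency)^cycles. The function name and parameters are illustrative.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealised number of target molecules after a given number of PCR cycles."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    # Each cycle multiplies the template count by (1 + efficiency)
    return initial_copies * (1.0 + efficiency) ** cycles
```

With perfect efficiency a single template yields 2^10 = 1024 copies after 10 cycles. In practice amplification plateaus as reagents are consumed, which this model ignores.
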
DNA fragments can be separated by size using electrophoresis
- Negatively charged DNA molecules will migrate in the presence of an electric field.
- Smaller fragments migrate more rapidly.
- In capillary electrophoresis, fragments are distinguished by the time required to travel from one end to the other.
- Multiplexing is facilitated by using different sized fragments and different dyes.

STR profiles are displayed on electropherograms
- Each peak corresponds to a single STR allele.
- Homozygous loci produce single peaks, while heterozygous loci and mixtures produce multiple peaks.
- Peak height and area are correlated with the abundance of the PCR product.
- Stutter peaks arise from polymerase slippage during PCR.

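A rough sketch of how peaks at one locus might be turned into allele calls is given below. It drops peaks under a minimum height and treats a peak as stutter when it sits one repeat below a much taller peak. The thresholds (`min_height` of 50 and `stutter_ratio` of 0.15) are placeholder assumptions chosen for illustration, not values from the slides or from any laboratory protocol, and the function handles whole-repeat alleles only.

```python
def call_alleles(peaks, stutter_ratio=0.15, min_height=50):
    """Call alleles at one locus.

    peaks: dict mapping allele (integer repeat number) -> peak height.
    """
    called = []
    for allele, height in sorted(peaks.items()):
        if height < min_height:
            continue  # below the assumed detection threshold
        parent_height = peaks.get(allele + 1, 0)
        if parent_height and height <= stutter_ratio * parent_height:
            continue  # small peak one repeat below a tall peak: likely stutter
        called.append(allele)
    return called
```

For `{12: 90, 13: 1000, 15: 950}` the peak at 12 is only 9% of the peak at 13, so it is discarded as stutter and the call is a 13/15 heterozygote.
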
PCR Profiles of Mixtures
- The presence of three or more peaks at a single locus usually indicates that the sample is a mixture.
- Interpretation is difficult since diploid genotypes cannot be inferred, e.g., A + B could come from AA and BB homozygotes or from two AB heterozygotes, etc.
- Genotypes can sometimes be assigned to individuals if the mixture proportions are very uneven.

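The ambiguity can be seen by listing every pair of diploid genotypes that would produce exactly the observed set of alleles. The sketch below assumes two contributors, no allelic drop-out, and no use of peak heights; the function name is illustrative.

```python
from itertools import combinations_with_replacement


def two_person_explanations(observed):
    """All unordered pairs of genotypes whose combined alleles equal `observed`."""
    alleles = sorted(set(observed))
    # Every diploid genotype that uses only observed alleles
    genotypes = list(combinations_with_replacement(alleles, 2))
    return [
        (g1, g2)
        for g1, g2 in combinations_with_replacement(genotypes, 2)
        if set(g1) | set(g2) == set(alleles)
    ]
```

For the observed alleles A and B this returns four explanations: AA with AB, AA with BB, AB with AB, and AB with BB. With four observed alleles both contributors must be heterozygous, leaving three possible pairings.
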
Several processes can give rise to anomalous STR profiles
- Microvariants are STR alleles that differ in the composition of one or more repeats. Partial repeats can be identified by electrophoresis, but SNPs cannot.
- Allelic drop-out occurs when an allele fails to be amplified, either because of sample degradation or because of variation within the primer binding region.
- Partial chromosome duplications can give rise to triallelic profiles, but will affect at most one locus. Triallelic profiles are fairly rare (1-20 per 10,000 loci).

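Partial repeats are visible in electrophoresis because they change the fragment length by less than one full repeat unit. By the usual naming convention, an allele with x complete repeats plus y extra bases is written x.y (for instance the 12.2 allele in the FGA range). The sketch below converts a fragment length into such a label; the function name, the flank length, and the example sizes are hypothetical.

```python
def allele_label(fragment_bp, flank_bp, unit_bp=4):
    """Name an STR allele from its fragment length.

    fragment_bp: total amplified fragment length in bp
    flank_bp: combined length of the non-repeat flanking sequence (assumed known)
    unit_bp: length of one repeat unit
    """
    repeat_bp = fragment_bp - flank_bp
    if repeat_bp < 0:
        raise ValueError("fragment is shorter than the flanking sequence")
    full_repeats, extra_bases = divmod(repeat_bp, unit_bp)
    # Microvariants are written as "<full repeats>.<extra bases>"
    return f"{full_repeats}.{extra_bases}" if extra_bases else str(full_repeats)
```

With a hypothetical 100 bp flank and a tetranucleotide unit, a 140 bp fragment is allele 10, while a 139 bp fragment is the microvariant 9.3. A SNP inside a repeat leaves the length unchanged, which is why it cannot be detected this way.
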
Amelogenin is often used for sex identification
- Amelogenin is a protein that facilitates enamel formation.
- The gene is present on both sex chromosomes.
- AMELX differs from AMELY by a 6 bp deletion in intron 1, which can be detected when analyzing STR markers.
- The assay can be misled by AMELY deletions and by mutations in the primer-binding regions.
- Sex identification can also be done using Y-linked STRs.
Fang et al. (2011)

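The logic of the amelogenin test can be sketched as a lookup on the fragment sizes observed at that marker. The default sizes of 106 bp for AMELX and 112 bp for AMELY are assumptions for illustration (they differ by the 6 bp deletion, but the actual sizes depend on the primer set used), and the function name and return strings are made up.

```python
def infer_sex(peak_sizes_bp, x_size=106, y_size=112, tolerance=1):
    """Interpret amelogenin peaks; the default sizes are assumed, primer-dependent."""

    def has_peak(target):
        return any(abs(size - target) <= tolerance for size in peak_sizes_bp)

    has_x, has_y = has_peak(x_size), has_peak(y_size)
    if has_x and has_y:
        return "male"
    if has_x:
        # An X-only result cannot rule out a male with an AMELY deletion
        return "female (or male with AMELY deletion or drop-out)"
    if has_y:
        return "ambiguous: Y peak without X peak"
    return "no result"
```

The X-only branch is deliberately hedged in its wording, since that is exactly the case in which AMELY deletions or primer-site mutations can mislead the assay.
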
References
- Butler, J. M. (2010) Fundamentals of Forensic DNA Typing. Academic Press.
- CODIS Database: www.fbi.gov/about-us/lab/codis
- STRbase: www.cstl.nist.gov/strbase