Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012


1 Lecture 23: Causes and Consequences of Linkage Disequilibrium November 16, 2012

2 Last Time: Signatures of selection based on synonymous and nonsynonymous substitutions; multiple loci and independent segregation; estimating linkage disequilibrium.

3 Today: Recombination and LD; drift and LD; mutation and LD; selection and LD; hitchhiking and selective sweeps.

4 Effects of recombination rate on LD: Decline in LD over time with different theoretical recombination rates (c). Even with independent segregation (c = 0.5), multiple generations are required to break up allelic associations. $D_t = e^{-ct} D_0$, where t is time (in generations) and e is the base of the natural logarithm (≈ 2.718).
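A minimal sketch of the decay described on this slide, assuming the exponential form $D_t = e^{-ct} D_0$. The second function is the discrete-generation recursion $D_t = (1 - c)^t D_0$, a standard textbook result that is not on the slide; it is included only for comparison. Function names and the starting value of D are illustrative.

```python
import math

def ld_exponential(d0, c, t):
    """Exponential approximation from the slide: D_t = e^(-c t) * D_0."""
    return math.exp(-c * t) * d0

def ld_discrete(d0, c, t):
    """Discrete-generation recursion (assumption, not on the slide): D_t = (1 - c)^t * D_0."""
    return (1.0 - c) ** t * d0

# Illustrative starting value: D_0 = 0.25 is the maximum possible D
# when both loci have allele frequencies of 0.5.
d0 = 0.25
for c in (0.5, 0.1, 0.01):
    decay = [round(ld_discrete(d0, c, t), 4) for t in (0, 1, 5, 10)]
    print(f"c = {c}: D at t = 0, 1, 5, 10 -> {decay}")
```

Even at c = 0.5 the discrete form only halves D each generation, which is why unlinked loci still need several generations to approach equilibrium. The two forms agree closely when c is small.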

5 LD varies substantially across the human genome: Average $r^2$ for pairs of SNPs separated by 30 kb, in 1 Mb windows (NATURE Vol October 2005). LD is affected by location relative to telomeres and centromeres, chromosome length, GC content, sequence polymorphism, and repeat composition. Highest and lowest levels of LD are found in gene-rich regions.

6 Human HapMap Project and Whole Genome Scans: LD structure of human Chromosome 19. 1 common SNP genotyped every 700 bp for 270 individuals (3.4 million SNPs); 9.2 million SNPs in total. NATURE Vol October 2005

7 LD in the Poplar Genome: LD declines rapidly with distance. LD is higher in genes than in the genome as a whole. Loci separated by kilobases still in LD! [Figure: r plotted against distance (kb); series: Genomewide (core of range), Genes (core of range)]

8 Recombination Across Poplar Chromosomes: Substantial variation in recombination rate, related to repeat composition, methylation, and distance from centromere.

9 Recombination rate varies among individuals: Rate is often higher in females than in males. Rate also varies among individuals within each sex. Example: variation in recombination rate in the MHC region (3.3 Mb) in human sperm donors.

10 Genetic Drift and LD: Begin with a highly diverse haplotype pool. Drift leads to a chance increase of certain haplotypes. This generates nonrandom association between alleles at different loci (LD).

11 Genetic Drift and LD: Why doesn't recombination reduce LD in this situation?
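The drift argument can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model from the lecture: a Wright-Fisher style resampling of a fixed number of haplotypes each generation, two biallelic loci, and no recombination, mutation, or selection. The haplotype labels, population size, and seed are hypothetical choices.

```python
import random

HAPLOTYPES = ("A1B1", "A1B2", "A2B1", "A2B2")

def r_squared(counts):
    """Squared allelic correlation r^2 from haplotype counts; None if either locus is fixed."""
    n = sum(counts.values())
    x11 = counts["A1B1"] / n
    p1 = (counts["A1B1"] + counts["A1B2"]) / n  # frequency of allele A1
    q1 = (counts["A1B1"] + counts["A2B1"]) / n  # frequency of allele B1
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    if denom == 0:
        return None
    d = x11 - p1 * q1
    return d * d / denom

def drift(counts, generations, rng):
    """Resample the haplotype pool with replacement each generation (no recombination)."""
    n = sum(counts.values())
    for _ in range(generations):
        weights = [counts[h] for h in HAPLOTYPES]
        sample = rng.choices(HAPLOTYPES, weights=weights, k=n)
        counts = {h: sample.count(h) for h in HAPLOTYPES}
    return counts

# Start at linkage equilibrium: all four haplotypes equally common, so r^2 = 0.
start = {h: 25 for h in HAPLOTYPES}
end = drift(start, 20, random.Random(1))
print("r^2 at start:", r_squared(start))
print("haplotype counts after 20 generations:", end)
print("r^2 after drift:", r_squared(end))
```

Rerunning with different seeds shows that the amount of LD produced is itself a random outcome; smaller pools tend to drift away from equilibrium faster.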

12 Expected Gamete Frequencies: Double Homozygote. Parent genotype A1B1 / A1B1. Meiosis yields four gametes: A1B1 (nonrecombinant), A1B1 (recombinant), A1B1 (recombinant), A1B1 (nonrecombinant).

13 Expected Gamete Frequencies: Double Heterozygote. Parent genotype A1B1 / A2B2. Meiosis yields four gametes: A1B1 (nonrecombinant), A1B2 (recombinant), A2B1 (recombinant), A2B2 (nonrecombinant).
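The expected gamete proportions from an A1B1 / A2B2 double heterozygote follow directly from the recombination rate c: a fraction c of gametes is recombinant, split evenly between the two recombinant types, and the rest is nonrecombinant. A small sketch (the function name is illustrative):

```python
def gamete_frequencies(c):
    """Expected gametes from an A1B1 / A2B2 double heterozygote with recombination rate c."""
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination rate c must lie between 0 and 0.5")
    return {
        "A1B1": (1 - c) / 2,  # nonrecombinant
        "A1B2": c / 2,        # recombinant
        "A2B1": c / 2,        # recombinant
        "A2B2": (1 - c) / 2,  # nonrecombinant
    }

for c in (0.0, 0.1, 0.5):
    print(c, gamete_frequencies(c))
```

At c = 0.5 all four gametes are equally likely, which is independent segregation. For the double homozygote on the previous slide every gamete is A1B1 regardless of c.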

14 LD is partially a function of recombination rate: Expected proportions of gametes produced by various genotypes over two generations, where c is the recombination rate and $D_0$ is the initial amount of LD. The double heterozygote is the only case where recombination matters.
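Because only double heterozygotes produce new haplotype combinations, one generation of random mating shifts each haplotype frequency by cD. The sketch below uses the standard two-locus recursion, which is assumed here rather than taken from the slide's table, and the haplotype ordering (A1B1, A1B2, A2B1, A2B2) is an illustrative convention.

```python
def compute_d(x11, x12, x21, x22):
    """D from haplotype frequencies ordered as (A1B1, A1B2, A2B1, A2B2)."""
    return x11 * x22 - x12 * x21

def next_generation(freqs, c):
    """One generation of random mating with recombination rate c.

    Coupling haplotypes (A1B1, A2B2) lose c*D; repulsion haplotypes (A1B2, A2B1) gain c*D.
    """
    x11, x12, x21, x22 = freqs
    d = compute_d(x11, x12, x21, x22)
    return (x11 - c * d, x12 + c * d, x21 + c * d, x22 - c * d)

freqs = (0.5, 0.0, 0.0, 0.5)  # only A1B1 and A2B2 present, so D_0 = 0.25
for generation in range(4):
    print(generation, round(compute_d(*freqs), 5))
    freqs = next_generation(freqs, c=0.1)
```

With c = 0.1, D falls by a factor of 0.9 each generation, while allele frequencies at both loci stay unchanged.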

15 Effect of Drift on LD: Drift and recombination have opposing effects on LD. $E(r^2) = \frac{1}{1 + 4N_e c}$, where $r^2$ is the squared correlation coefficient for alleles at two loci, $N_e$ is effective population size, c is recombination rate, and $4N_e c$ is the population recombination rate. The expression approaches 0 for large populations or high recombination rates.

16 Combined effects of Drift and Recombination: LD declines as a function of population recombination rate ($N_e r$ in this figure, same as $N_e c$). Effects of chance fluctuation of gamete frequencies.
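A quick numerical look at the expectation $E(r^2) = 1 / (1 + 4N_e c)$. The population sizes and recombination rates below are illustrative values, not figures from the lecture.

```python
def expected_r2(ne, c):
    """Expected r^2 under drift and recombination: 1 / (1 + 4 * Ne * c)."""
    return 1.0 / (1.0 + 4.0 * ne * c)

# Hypothetical values chosen only to show the trend.
for ne in (100, 1_000, 10_000):
    for c in (0.0001, 0.001, 0.01):
        print(f"Ne = {ne:>6}, c = {c}: E(r^2) = {expected_r2(ne, c):.4f}")
```

The output depends only on the product $N_e c$: a tenfold larger population has the same expected LD as a tenfold higher recombination rate.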

17 How should inbreeding affect linkage disequilibrium?

19 Mutation and LD: High mutation rates. Allelic associations are masked by high mutation rates, so LD is decreased. [Figure: gamete pool with low mutation vs. gamete pool with high mutation]

20 LD and neutral markers: Low LD is the EXPECTED condition unless other factors are acting. If LD is low, a neutral marker represents only a very small segment of the genome. In most parts of the genome, LD declines to background levels within 1 kb (though this varies by organism and population). Care must be taken in drawing conclusions about selection based on population structure derived from neutral markers.

21 Selection and Linkage Disequilibrium (LD): Selection can create LD between unlinked loci. Epistasis: two or more loci interact with each other nonadditively, so phenotype depends on alleles at multiple loci. [Figure: change in $D / D_{max}$ over time due to epistatic interactions between loci with directional selection, where $D_{max} = \min(p_1 q_2, p_2 q_1)$ for $D > 0$] Why does D decline after generation 15 in this scenario?

22 Epistasis and LD: Begin with a highly diverse haplotype pool. Directional selection leads to an increase of certain haplotype combinations. This generates nonrandom association between alleles at different loci (LD).
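The normalized measure $D / D_{max}$ can be computed from the four haplotype frequencies. The slide gives $D_{max}$ only for $D > 0$; the branch for $D \le 0$ below uses the usual convention $D_{max} = \min(p_1 q_1, p_2 q_2)$, which is an assumption added here. The haplotype ordering (A1B1, A1B2, A2B1, A2B2) is also an illustrative choice.

```python
def normalized_d(x11, x12, x21, x22):
    """Return D / D_max from haplotype frequencies ordered (A1B1, A1B2, A2B1, A2B2)."""
    p1 = x11 + x12          # frequency of A1
    p2 = 1.0 - p1           # frequency of A2
    q1 = x11 + x21          # frequency of B1
    q2 = 1.0 - q1           # frequency of B2
    d = x11 * x22 - x12 * x21
    if d > 0:
        d_max = min(p1 * q2, p2 * q1)   # form given on the slide
    else:
        d_max = min(p1 * q1, p2 * q2)   # assumed convention for D <= 0
    if d_max == 0:
        return 0.0                      # a locus is fixed, so no LD is possible
    return d / d_max

print(normalized_d(0.5, 0.0, 0.0, 0.5))      # complete association
print(normalized_d(0.25, 0.25, 0.25, 0.25))  # linkage equilibrium
print(normalized_d(0.4, 0.1, 0.1, 0.4))      # partial association
```

Dividing by $D_{max}$ removes the dependence of D on allele frequencies, which makes values comparable as selection changes those frequencies over time.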

23 Recombination vs Polymorphism in Poplar LG VII: Nucleotide diversity (π) is positively correlated with population recombination rate ($4N_e c$) ($R^2 = 0.38$). [Figure: recombination rate ($4N_e c$) and π plotted against position (Mb)]

24 Recombination vs Polymorphism: Recombination rate varies substantially across the Drosophila genome. Nucleotide diversity is positively correlated with recombination rate. Hartl and Clark 2007

25 Why is polymorphism reduced in areas of low recombination? (or why is polymorphism enhanced in areas of high recombination)

26 Selection and LD: Selection affects target loci as well as loci in LD with them. Hitchhiking: neutral alleles increase in frequency because of the selective advantage of an allele at another locus in LD. Selective Sweep: a selectively advantageous allele increases in frequency and changes the frequency of variants in LD. Background Selection: selection against detrimental mutants also removes alleles at neutral loci in LD. Hill-Robertson Effect: directional selection at one locus affects the outcome of selection at another locus in LD.
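Hitchhiking can be sketched with a deterministic two-locus model. Everything here is an illustrative assumption rather than a model from the lecture: a haploid population, allele B1 with relative fitness 1 + s, a neutral locus A, selection followed by recombination each generation, and hypothetical starting frequencies in which the new B1 mutation sits on an A1 background.

```python
def sweep(freqs, s, c, generations):
    """Iterate selection on B1 (fitness 1 + s) and recombination at rate c.

    freqs is ordered (A1B1, A1B2, A2B1, A2B2); locus A is neutral.
    """
    x11, x12, x21, x22 = freqs
    for _ in range(generations):
        # Selection: haplotypes carrying B1 are weighted by 1 + s.
        mean_w = (1 + s) * (x11 + x21) + (x12 + x22)
        x11, x12, x21, x22 = ((1 + s) * x11 / mean_w, x12 / mean_w,
                              (1 + s) * x21 / mean_w, x22 / mean_w)
        # Recombination: D decays by a factor of (1 - c).
        d = x11 * x22 - x12 * x21
        x11, x12, x21, x22 = x11 - c * d, x12 + c * d, x21 + c * d, x22 - c * d
    return x11, x12, x21, x22

# Hypothetical start: A1 at frequency 0.2, B1 rare (0.01) and found only with A1.
start = (0.01, 0.19, 0.0, 0.80)
for c in (0.0, 0.01, 0.5):
    end = sweep(start, s=0.1, c=c, generations=300)
    print(f"c = {c}: final A1 frequency = {end[0] + end[1]:.3f}")
```

With complete linkage the neutral allele A1 is carried to near fixation along with B1, while with free recombination its frequency barely moves. This is the same contrast that underlies reduced polymorphism around a swept site.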

27 Selective Sweep in Plasmodium: Pyrimethamine is used to treat the malaria parasite (Plasmodium falciparum). The parasite developed resistance at the dhfr locus, and the resistance allele rapidly became fixed in the population (6 years on the Thai border). Microsatellite variation was wiped out in the vicinity of dhfr.

28 Selective Sweep: Positive selection leads to an increase of a particular allele, along with alleles at all linked loci. This results in enhanced LD in the region of the selected polymorphism. The effect is accentuated in a rapidly expanding population.

29 Derived Alleles and Selective Sweeps: Recent, incomplete selective sweeps are expected to leave a molecular signature of (1) high frequency of derived alleles, (2) strong geographic differentiation, and (3) elevated LD. [Figure: example genotypes at an A/C polymorphism in chimp, Africans, and Europeans]

30 LD provides evidence of recent selection: Regions under recent selection experience a selective sweep and show high LD locally. Patterns of LD in the human genome provide a signature of selection. A statistic based on the length of haplotypes and the frequency of derived alleles reveals regions under selection (iHS statistic). Example: selective sweep for the lactase enzyme in Europeans after domestication of dairy cows. Voight et al. PLoS Biology 4:

31 Some factors that affect LD
Factor               Effect
Recombination rate   Higher recombination lowers LD
Genetic drift        Increases LD
Inbreeding           Increases LD
Mutation rate        High mutation rate decreases overall LD
Epistasis            Increases LD
Selection            Locally increased LD