Random Allelic Variation AKA Genetic Drift
Genetic Drift
A non-adaptive mechanism of evolution (therefore, a theory of evolution) that sometimes operates simultaneously with others, such as natural selection.
The frequency of gene copies (i.e., alleles) in any generation of adult organisms represents only a sample of the gene copies carried by gametes of the previous generation, and the sample is subject to random variation, i.e., sampling error.
Beginning with the Hardy-Weinberg model
- no mutation
- no selection
- no gene flow
But with one wrinkle in drift: finite population size.
Result: random changes in allele frequency (there is never a change in allele frequency in the Hardy-Weinberg model).
Definitions
monomorphy: no allelic variation at a locus in a population
polymorphy: multiple alleles at a locus in a population
fixation, fixed: describes an allele whose frequency has reached 100%, which makes the locus monomorphic
N: censused population size
private allele: an allele unique to only one population, but not necessarily fixed within it
cline: continuous change in allele frequency along a geographic transect, the hallmark of gene flow
deme: a reproductively isolated or semi-isolated population, i.e., reduced or no gene flow among or between demes
metapopulation: a collection of conspecific demes
Two models of drift
Random Walk: prospective, looking forward in time
Coalescence: retrospective, looking backward in time
Random Walk Model
Monte Carlo Markov chain: prospective, looking forward in time.
The state at time t is determined only by the state at time t-1 plus a random event.
Example:
- Stand at the sundial on the Horseshoe in front of Hadley Hall, facing west.
- Take one step forward and flip a coin; move one step to the right if heads or one step to the left if tails.
- Repeat the process until you either reach Espina St or run into North Horseshoe Rd or South Horseshoe Rd.
The Horseshoe as a graph of allele frequency
[Figure: map of the Horseshoe, bounded by Espina St, North Horseshoe Rd, and South Horseshoe Rd]
y-axis (vertical) = allele frequency
x-axis (horizontal) = time in generations
p = 0 at North Horseshoe Rd (extinction of A1, fixation of A2)
p = 1.0 at South Horseshoe Rd (fixation of A1, extinction of A2)
The sundial is halfway between: p = 0.5 at time t = 0.
the outcome will differ every time because of the random component
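The coin-flip walk can be sketched in a few lines of Python. This is only an illustration: the function name `horseshoe_walk` and the parameters `half_width` (steps from the sundial to either road) and `max_steps` (steps from the sundial to Espina St) are assumptions made for the example, not measurements of the actual Horseshoe.

```python
import random

def horseshoe_walk(half_width=10, max_steps=200, seed=None):
    """Simulate the coin-flip walk; return (where it ended, steps taken)."""
    rng = random.Random(seed)
    offset = 0  # 0 = the sundial; positive = north, negative = south
    for step in range(1, max_steps + 1):
        # Facing west, a step to the right (heads) is a step north.
        offset += 1 if rng.random() < 0.5 else -1
        if offset >= half_width:
            return "North Horseshoe Rd", step
        if offset <= -half_width:
            return "South Horseshoe Rd", step
    # Neither road was reached before the far end of the Horseshoe.
    return "Espina St", max_steps
```

Calling `horseshoe_walk(seed=s)` for several values of `s` gives a different endpoint and path length for each walk, which is the point of the exercise.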
Unlike the coin-toss exercise, in which the probability of heads and tails remains equal, the probability of an allele being represented in a gamete changes with each generation.
The probability of an allele being represented in a gamete is equal to the new allele frequency.
This will tend to ensure allele fixation or loss.
The width of the Horseshoe (i.e., north to south) is analogous to population size (N).
The smaller the population, the narrower the width (or, more specifically, the greater the variance of change per generation).
The smaller the population, the greater the sampling error of gametes, and the more probably and rapidly an allele will become fixed (100%, monomorphic) or go extinct (0%).
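A minimal sketch of drift as a random walk in which each generation's sampling probability equals the current allele frequency. It assumes a Wright-Fisher style model (2N gene copies drawn independently each generation), which the slides do not name; `wright_fisher` and its parameters are illustrative names.

```python
import random

def wright_fisher(p0=0.5, n_diploid=20, max_generations=1000, seed=None):
    """Return the allele-frequency trajectory of one drifting population."""
    rng = random.Random(seed)
    copies = 2 * n_diploid  # gene copies in a diploid population of size N
    p = p0
    trajectory = [p]
    for _ in range(max_generations):
        if p in (0.0, 1.0):  # allele lost or fixed: the walk has ended
            break
        # Each gene copy in the next generation carries the allele with
        # probability equal to the current frequency p.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory
```

Comparing, say, `len(wright_fisher(n_diploid=5, seed=1))` with `len(wright_fisher(n_diploid=500, seed=1))` over several seeds should show small populations reaching 0 or 1 in far fewer generations.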
Variance is higher in small samples: $V = \frac{p(1 - p)}{2N}$ (N = population size)
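The variance formula V = p(1 - p)/(2N) can be checked numerically. The sketch below assumes binomial sampling of 2N gene copies in a single generation; the function names and the replicate count are choices made for this example.

```python
import random
from statistics import pvariance

def expected_variance(p, n_diploid):
    """V = p(1 - p) / (2N) for one generation of sampling."""
    return p * (1 - p) / (2 * n_diploid)

def simulated_variance(p=0.5, n_diploid=25, replicates=20000, seed=1):
    """Variance of the next-generation frequency across many replicates."""
    rng = random.Random(seed)
    copies = 2 * n_diploid
    next_freqs = [
        sum(rng.random() < p for _ in range(copies)) / copies
        for _ in range(replicates)
    ]
    return pvariance(next_freqs)
```

With the defaults, `simulated_variance()` should land close to `expected_variance(0.5, 25)`, and lowering `n_diploid` raises both.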
If drift is random (by definition, it is), then how can you predict change in allele frequency or which allele will become fixed?
Probability of fixation = p
Probability of extinction = 1 - p
For any new mutant, probability of fixation = initial frequency, $p = \frac{1}{2N}$.
Intuitively, the probability of fixation of a new mutant by chance alone is greater in a small than in a large population.
Average time to fixation by drift (without selection) = 4N generations (in diploid species).
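The claim that the probability of fixation equals the starting frequency can be illustrated by repeating the random walk many times and counting how often the allele fixes. This is a sketch under the same assumed sampling model as above (2N copies drawn each generation); `fixation_fraction` and its arguments are hypothetical names.

```python
import random

def fixes(initial_copies, n_diploid, rng):
    """Run one population until the allele is fixed or lost."""
    copies = 2 * n_diploid
    count = initial_copies
    while 0 < count < copies:
        p = count / copies
        count = sum(rng.random() < p for _ in range(copies))
    return count == copies

def fixation_fraction(initial_copies, n_diploid=10, replicates=2000, seed=2):
    """Fraction of replicate populations in which the allele fixes."""
    rng = random.Random(seed)
    fixed = sum(fixes(initial_copies, n_diploid, rng) for _ in range(replicates))
    return fixed / replicates
```

For example, `fixation_fraction(6)` starts at p = 6/20 = 0.3 and should return a value near 0.3, while `fixation_fraction(1)` models a new mutant at p = 1/(2N) = 0.05.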
Coalescent Model (Coalescent Theory)
Any two lineages can be traced backward in time to a common ancestor.
Alleles, haplotypes, or lineages are said to coalesce at that generation of common ancestry.
Example: a haploid, non-recombining bacterium.
In each generation, the bacterium may die, survive but not reproduce, or survive and reproduce.
Thus, in a population of finite size, if some lineages leave no descendants while others reproduce, eventually all individuals will be descendants of a single lineage.
Barring new mutants, initially polymorphic populations become increasingly closely related (as descendants of a single common ancestor) as allelic variation is lost by fixation and extinction.
Same thing in color
[Figure: histogram of generations to coalescence of lineages]
Source: http://www.csbio.unc.edu/mcmillan/index.py?run=courses.comp790s09
Time to coalescence of alleles:
- 4N generations (diploids)
- 2N generations (haploids)
- 1N generations in maternally inherited haploid organellar DNA (i.e., mitochondria, chloroplasts), because the paternal lineage ends in every generation
Conclusion: coalescence is fast in small populations; drift is greatest in small populations.
Note that time to coalescence is exactly the same as time to fixation under the random walk model (retrospective = prospective).
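The haploid case can be sketched by tracing every lineage backward until only one ancestor remains. The model assumed here (each lineage picks its parent uniformly at random from a constant population of N haploids) is a simplification chosen for the example, and the function names are hypothetical.

```python
import random

def generations_to_common_ancestor(n_haploid, rng):
    """Trace all lineages backward until they share one ancestor."""
    lineages = set(range(n_haploid))
    generations = 0
    while len(lineages) > 1:
        # Each lineage picks a parent at random; lineages that pick the
        # same parent coalesce (the set keeps one copy).
        lineages = {rng.randrange(n_haploid) for _ in lineages}
        generations += 1
    return generations

def mean_coalescence_time(n_haploid=50, replicates=300, seed=3):
    """Average generations to a single common ancestor."""
    rng = random.Random(seed)
    times = [generations_to_common_ancestor(n_haploid, rng)
             for _ in range(replicates)]
    return sum(times) / len(times)
```

Under these assumptions the mean should come out at roughly 2N generations (about 100 for N = 50), and it shrinks in proportion when `n_haploid` is reduced.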
Coalescent Theory Predicts (in the absence of gene flow, mutation, selection)
- Allele or haplotype frequencies fluctuate at random but, in finite populations, one will become fixed.
- Individual populations lose their ancestral genetic variation.
- Initially similar populations diverge in allele frequencies by chance alone, because they become fixed for different alleles or different combinations of alleles at unlinked loci.
- The probability that an allele will ultimately become fixed is equal to its frequency in the population in any given generation.
- Rate of fixation (or loss) is greater in small populations.
Distinct evolutionary histories of species and their genes
Polymorphism arises before speciation
[Figure modified from Ebersberger et al. Mol Biol Evol 2007]
Lineage Sorting: the time-dependent process by which species lose their ancestral polymorphism through genetic drift.
Hemiplasy: genes or characters with different evolutionary histories than the species that possess them, most often due to incomplete lineage sorting (ILS).
[Figure: lineage sorting, from ancestral polymorphism to complete sorting; Robinson et al 2008 PNAS 105:14477-14481]
The shorter the time between speciations, the more ILS and hemiplasy.
How is hemiplasy manifested?
Mosaic genomes with discordant gene trees among three or more species that diverged in rapid succession.
[Figure: percentage of 25,000 genes most closely related between human-chimp, chimp-gorilla, and human-gorilla; Ebersberger et al Mol Biol Evol 2007]
Heterozygosity (H)
single-locus H: the proportion of individuals in a population that are heterozygous at a given locus
multilocus H: the proportion of loci that are heterozygous in an average individual
H is highest in a population with equal numbers of the two homozygotes (i.e., equal allele frequencies).
Within demes, drift fixes alleles.
Across the metapopulation, allele frequencies remain unchanged, but genotype frequencies deviate from Hardy-Weinberg equilibrium, i.e., heterozygosity decreases (H ↓).
Ten populations, red and blue alleles
[Figure panels: panmictic with gene flow, high H; demic with genetic structure, low H]
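The ten-population contrast can be worked through numerically. The sketch assumes two alleles and uses the Hardy-Weinberg expectation H = 2p(1 - p) within each deme; the function names and the example frequencies are illustrative, not data from the figure.

```python
def expected_heterozygosity(p):
    """Hardy-Weinberg expected heterozygosity for two alleles."""
    return 2 * p * (1 - p)

def metapopulation_summary(deme_freqs):
    """Return (mean allele frequency, mean H within demes, H if panmictic)."""
    mean_p = sum(deme_freqs) / len(deme_freqs)
    h_within = sum(expected_heterozygosity(p) for p in deme_freqs) / len(deme_freqs)
    h_panmictic = expected_heterozygosity(mean_p)
    return mean_p, h_within, h_panmictic

# Ten demes after drift: five fixed for one allele, five for the other.
print(metapopulation_summary([1.0] * 5 + [0.0] * 5))  # (0.5, 0.0, 0.5)
```

The overall allele frequency is still 0.5, yet no deme contains a heterozygote, whereas a single panmictic population at p = 0.5 would have H = 0.5.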
Effective Population Size (Ne)
The Ne of an actual population is equal to the censused population size (N) of an ideal population (i.e., one in which all individuals breed and contribute equally to the gene pool) that would show the amount of drift actually observed, as measured by heterozygosity (H).
Typically, Ne < N because of:
- sex bias (the less numerous sex limits Ne)
- reproductive variance of the sexes (the polygamous sex limits Ne)
- overlapping generations
- fluctuations in population size, e.g., past bottlenecks
- ploidy
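Two of these effects have standard textbook approximations that are not given on the slide and are brought in here as assumptions: Ne = 4·Nm·Nf / (Nm + Nf) for an unequal number of breeding males and females, and the harmonic mean of population size across generations for fluctuating size. A short sketch, with hypothetical function names:

```python
def ne_sex_ratio(males, females):
    """Ne with unequal numbers of breeding males and females."""
    return 4 * males * females / (males + females)

def ne_fluctuating(sizes):
    """Ne over several generations: harmonic mean of the census sizes."""
    return len(sizes) / sum(1 / n for n in sizes)

print(ne_sex_ratio(50, 50))   # 100.0: equal sexes, Ne = N
print(ne_sex_ratio(10, 90))   # 36.0: the less numerous sex limits Ne
print(round(ne_fluctuating([1000, 10, 1000]), 2))  # 29.41: one bottleneck dominates
```

In both cases the census size is far larger than Ne, which is why a single past bottleneck or a skewed sex ratio can leave a lasting mark on heterozygosity.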
Founder Effect
The principle that the founders of a new colony carry only a fraction of the total genetic variation of the source population.
Genetic drift will have a strong effect on small founding populations.
Most rare alleles will not be represented; a few will be overrepresented.
Initially, H tends to be similar in source and founder populations, because H is most influenced by common alleles.
But H decreases rapidly in founder populations, more so in small populations, less so in populations with high intrinsic growth rate (r).
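The effect of growth rate on the loss of H can be sketched with the standard expectation that a population of N diploids retains a fraction 1 - 1/(2N) of its heterozygosity each generation. That formula is an assumption imported for this example (it is not on the slide), and the population sizes below are invented for illustration.

```python
def heterozygosity_trajectory(h0, sizes):
    """Expected H after each generation, given the diploid size in each one."""
    h = h0
    trajectory = [h]
    for n in sizes:
        h *= 1 - 1 / (2 * n)  # expected fraction of H retained this generation
        trajectory.append(h)
    return trajectory

# A founder population that stays at 10 individuals for 10 generations...
stays_small = heterozygosity_trajectory(0.5, [10] * 10)
# ...versus one that doubles every generation from the same 10 founders.
grows_fast = heterozygosity_trajectory(0.5, [10 * 2 ** t for t in range(10)])
```

Comparing `stays_small[-1]` with `grows_fast[-1]` shows that the rapidly growing colony keeps most of its initial H, because only its first few generations are small enough for drift to matter.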
Logistic Growth curve
$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)$
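A minimal numerical sketch of the logistic equation dN/dt = rN(1 - N/K), using simple fixed time steps (Euler's method). The step size, the function name, and the example values of r and K are assumptions for illustration only.

```python
def logistic_growth(n0, r, k, steps, dt=1.0):
    """Approximate N(t) for dN/dt = r*N*(1 - N/K) with fixed time steps."""
    n = n0
    sizes = [n]
    for _ in range(steps):
        n += r * n * (1 - n / k) * dt  # growth slows as N approaches K
        sizes.append(n)
    return sizes

# Ten founders, intrinsic growth rate r = 0.5, carrying capacity K = 1000.
sizes = logistic_growth(n0=10, r=0.5, k=1000, steps=40)
```

With these values the population spends only a handful of generations at small size before leveling off near K; a lower r keeps it small, and so exposed to drift, for longer.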
Examples of drift
Buri 1956: fixation of an eye color allele from initial frequency = 0.5 in 107 populations of Drosophila in 19 generations.
Baker and Moeed 1987:
- Mynah birds are indigenous to India.
- Mynahs were introduced by humans to Australia, New Zealand, Fiji, and Hawaii in the 1800s.
- Among natural populations of mynahs, Nei's D = 0.001 (a genetic distance that describes the inverse correlation coefficient of shared alleles).
- Among naturalized populations, Nei's D = 0.006, equivalent to subspecies differences, in about a century.
- Also, most rare alleles were lost, but some increased from p = 0.01 to p = 0.08.
Inbreeding (Assortative Mating, compared to drift)
The antithesis of panmixia (panmixis, random mating).
Inbreeding Coefficient (F): the frequency of autozygous individuals in a population.
Autozygous ("identical by descent"): both alleles in a homozygous individual were inherited directly from a single haploid allele in an ancestor (e.g., a grandparent).
Allozygous: not identical by descent; either homozygous or heterozygous.
In an inbred population, H is low.
Pedigree with Identity by Descent
Parental: A1A2* x A1A2
F1 (inbreeding): A1A2* x A2A2*
F2 generation: A2*A2*
Genotype Frequencies with Inbreeding
Genotype | allozygous | autozygous
A1A1 | $p^2(1 - F)$ | $+\ pF$
A1A2 | $2pq(1 - F)$ |
A2A2 | $q^2(1 - F)$ | $+\ qF$
As F → 1.0, the frequency of autozygous homozygotes increases at the expense of all allozygous genotypes.
The greater F, the faster H decreases.
Selfing (self-fertilization): H is halved in each generation.
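The genotype frequencies under inbreeding, p²(1 - F) + pF, 2pq(1 - F), and q²(1 - F) + qF, can be evaluated directly, along with the halving of H under selfing. The function names below are hypothetical.

```python
def genotype_frequencies(p, f):
    """Genotype frequencies at a two-allele locus with inbreeding coefficient F."""
    q = 1 - p
    return {
        "A1A1": p * p * (1 - f) + p * f,  # allozygous + autozygous homozygotes
        "A1A2": 2 * p * q * (1 - f),      # heterozygotes are always allozygous
        "A2A2": q * q * (1 - f) + q * f,
    }

def heterozygosity_under_selfing(h0, generations):
    """H is halved in each generation of self-fertilization."""
    return h0 * 0.5 ** generations

print(genotype_frequencies(0.5, 0.0))  # Hardy-Weinberg: 0.25, 0.5, 0.25
print(genotype_frequencies(0.5, 1.0))  # fully inbred: 0.5, 0.0, 0.5
```

At F = 0 the Hardy-Weinberg proportions come back, at F = 1 only homozygotes remain, and in both cases the allele frequency p is unchanged.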
Inbred population: F is greater than the frequency of autozygous individuals in a panmictic population.
F = 1: fully inbred; F = 0: no inbreeding.
Comparison of inbreeding to H-W equilibrium
- Allele frequencies do not (necessarily) change.
- Genotype frequencies do change.
- Phenotypic variance usually increases due to loss of heterozygotes.
- Inbreeding depression (reduction in the mean of a phenotype due to increased expression of recessive alleles in homozygous genotypes): the number of homozygous recessive genotypes increases and the mean fitness of the population decreases, which, when coupled with natural selection, can then change allele frequency.
- Inbreeding can promote linkage disequilibrium due to the lack of heterozygotes, even if loci are not physically linked.
Comparison of inbreeding to drift
- Both genetic drift and inbreeding lead to deviation from Hardy-Weinberg equilibrium.
- Heterozygosity decreases in small demes.
- Genetic drift causes change in allele frequency (and consequently genotype frequency).
- Inbreeding causes change in genotype frequency (but not allele frequency in the absence of selection).
- Both cause a decrease in heterozygosity.
Neutral Theory
Motoo Kimura 1968
Original thesis: there are too many genes for selection to act in any significant way on all simultaneously; no population is sufficiently large to bear the reduction in fitness.
Now better understood as a balance between new mutation and genetic drift; the genome is in a constant state of flux.
Consistent with:
- the molecular clock
- the large percentage of non-coding and non-conserved DNA
- the redundancy of the genetic code
Gene Flow
- homogenization of metapopulations
- addition of alleles/genotypes to demes
- opposite effect of drift
- complete gene flow = panmixia
Recall models of gene flow: island, stepping stone, isolation by distance, extinction and recolonization.
m = rate of gene flow = % of gene copies carried into a population from outside per generation
Nm = number of immigrants per generation, a measure of gene flow
FST = fixation index: % of genetic variation of a total population that is represented in a sub-population
genetic structure: a structured population is one with high FST, that is, the subpopulation is not representative of the total
As Nm ↑, equilibrium FST ↓ and genetic structure ↓.
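The inverse relationship between Nm and equilibrium FST can be made concrete with Wright's island-model approximation, FST ≈ 1 / (4Nm + 1). That formula is not stated on the slide and is used here as an assumption; it also ignores mutation and selection.

```python
def equilibrium_fst(nm):
    """Island-model approximation: FST = 1 / (4*Nm + 1)."""
    return 1 / (4 * nm + 1)

for nm in (0, 0.25, 1, 5, 25):
    print(nm, round(equilibrium_fst(nm), 3))
```

Under this approximation a single immigrant per generation (Nm = 1) already brings equilibrium FST down to 0.2, and a handful bring it close to zero.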
It takes very few immigrants to homogenize populations.
Yet populations are typically very structured; gene flow is surprisingly low.
Direct estimates of gene flow (mark-and-recapture studies) suggest more gene flow than is typically measured by FST.