Hardy Weinberg Equilibrium


Lectures 4-11: Mechanisms of Evolution (Microevolution)
  Hardy Weinberg Principle (Mendelian Inheritance)
  Genetic Drift
  Mutation
  Sex: Recombination and Random Mating
  Epigenetic Inheritance
  Natural Selection

Gregor Mendel, G. H. Hardy, Wilhelm Weinberg

These are mechanisms acting WITHIN populations, hence called "population genetics", EXCEPT for epigenetic modifications, which act on individuals in a Lamarckian manner.

Recall from Previous Lectures: Darwin's Observation
  Evolution acts through changes in allele frequency at each generation.
  This leads to an average change in the characteristics of the population.

Recall from Lecture on History of Evolutionary Thought: Darwin's Observation
  HOWEVER, Darwin did not understand how genetic variation was passed on from generation to generation.

Gregor Mendel, Father of Modern Genetics
  Mendel presented a mechanism for how traits get passed on.
  Individuals pass alleles on to their offspring intact (the idea of particulate inheritance, i.e. genes).

Mendel's Laws of Inheritance
  Law of Segregation: only one allele passes from each parent on to an offspring.
  Law of Independent Assortment: different pairs of alleles are passed to offspring independently of each other.

Gregor Mendel
  Using 29,000 pea plants, Mendel discovered the 3:1 ratio of phenotypes, due to dominant vs. recessive alleles.
  In cross-pollinating plants with either yellow or green peas, Mendel found that the first generation (F1) always had yellow seeds (dominance). However, the following generation (F2) consistently had a 3:1 ratio of yellow to green.
  Mendel uncovered the underlying mechanism: there are dominant and recessive alleles.

Hardy-Weinberg Principle
  A mathematical description of Mendelian inheritance (Godfrey Hardy, Wilhelm Weinberg).

The Hardy-Weinberg Principle
  Testing for Hardy-Weinberg equilibrium can be used to assess whether a population is evolving.
  A population that is not evolving shows allele and genotype frequencies that are in Hardy-Weinberg equilibrium.
  If a population is not in Hardy-Weinberg equilibrium, it can be concluded that the population is evolving.

9/17/18

Evolutionary Mechanisms (will put a population out of HW Equilibrium):
  Genetic Drift
  Natural Selection
  Mutation
  Migration

  Requirement of HW            Violation (Evolution)
  Large population size        Genetic drift
  Random mating                Inbreeding & other
  No mutations                 Mutations
  No natural selection         Natural selection
  No migration                 Migration

An evolving population is one that violates the Hardy-Weinberg assumptions.

*Epigenetic modifications change the expression of alleles but not the frequency of the alleles themselves, so they won't affect the actual inheritance of alleles. However, if you count phenotype frequencies rather than genotype frequencies, you might see phenotypic frequencies out of HW Equilibrium due to epigenetic silencing of alleles (epigenetic modifications can change phenotype, not genotype).

What is a population?
  A group of individuals within a species that is capable of interbreeding and producing fertile offspring (definition for sexual species).
  [Fig. 23-5a: map showing the Porcupine herd range and the Fortymile herd range (Alaska, Yukon, Northwest Territories, Beaufort Sea)]

In the absence of Evolution
  Patterns of inheritance should always be in Hardy-Weinberg Equilibrium, following the transmission rules of Mendel.

Hardy-Weinberg Equilibrium
  According to the Hardy-Weinberg principle, frequencies of alleles and genotypes in a population remain constant from generation to generation.
  Also, the genotype frequencies you see in a population should match the Hardy-Weinberg expectations, given the allele frequencies.

Null Model
  No Evolution: the null model to test whether evolution is happening should simply be a population in HW Equilibrium.
  No Selection: the null model to test whether Natural Selection is occurring should have no selection, but should include Genetic Drift. This is because Genetic Drift is operating even when there is no Natural Selection.

Hardy-Weinberg Theorem
  In a non-evolving population, the frequencies of alleles and genotypes remain constant over generations.
  You should be able to predict the genotype frequencies, given the allele frequencies.

Example: Is this population in Hardy-Weinberg Equilibrium?
  [Table of values for Generations 1-3 not preserved in this transcription]

Important concepts
  gene: A region of genome sequence (DNA or RNA) that is the unit of inheritance, the product of which contributes to phenotype.
  locus: A location in a genome (used interchangeably with gene if the location is at a gene, but a locus can be anywhere, so the meaning is broader than gene).
  loci: Plural of locus.
  allele: A variant form of a gene (e.g. alleles for different eye colors, BRCA breast cancer allele, etc.).
  genotype: The combination of alleles at a locus (gene).
  phenotype: The expression of a trait, as a result of the genotype and the regulation of genes (green eyes, brown hair, body size, finger length, cystic fibrosis, etc.).

More on alleles
  We are diploid (two copies of each chromosome), so we have 2 alleles at a locus (any location in the genome).
  However, there can be many alleles at a locus in a population.
  For example, you might have inherited a blue eye allele from your mom and a brown eye allele from your dad. You can't have more alleles than that (only 2 chromosome copies, one from each parent).
  BUT there could be many alleles at this locus in the population: blue, green, grey, brown, etc.

Alleles in a population of diploid organisms: Random Mating (Sex)
  [Figure: eggs and sperm carrying different alleles (A2, A3, A4, ...) combine at random to form zygotes with genotypes such as A2A4 and A3A3]
  So can we predict the % of alleles and genotypes in the population at each generation?

Hardy-Weinberg Theorem
  In a non-evolving population, the frequencies of alleles and genotypes remain constant over generations.

  [Figure: alleles in the population and gametes produced]
    p = frequency of C^R allele = 0.8
    q = frequency of C^W allele = 0.2
    Each egg: 80% chance of C^R, 20% chance of C^W
    Each sperm: 80% chance of C^R, 20% chance of C^W

Hardy-Weinberg proportions indicate the expected allele and genotype frequencies, given the starting frequencies.

By convention, if there are 2 alleles at a locus, p and q are used to represent their frequencies.
The frequencies of all alleles in a population add up to 1. For example:

  p + q = 1

If p and q represent the relative frequencies of the only two possible alleles in a population at a particular locus, then for a diploid organism (2 chromosome copies):

  (p + q)^2 = p^2 + 2pq + q^2 = 1

where p^2 and q^2 represent the frequencies of the homozygous genotypes and 2pq represents the frequency of the heterozygous genotype.

What about for a triploid organism?

  (p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3 = 1

  Potential offspring: ppp, ppq, pqp, qpp, qqp, pqq, qpq, qqq

How about tetraploid? You work it out.
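One way to check the diploid and triploid expansions, and to work out the tetraploid case, is to expand (p + q)^c numerically with binomial coefficients. The sketch below is illustrative only: the function name `genotype_class_frequencies` is made up for this example, and it assumes two alleles and random union of gametes, as in the slides.

```python
from math import comb

def genotype_class_frequencies(p, ploidy):
    """Expand (p + q)^ploidy term by term.

    Returns a dict mapping k (the number of copies of the allele with
    frequency p) to the expected frequency of that genotype class.
    Hypothetical helper; assumes exactly two alleles, so q = 1 - p.
    """
    q = 1.0 - p
    return {
        k: comb(ploidy, k) * p**k * q**(ploidy - k)
        for k in range(ploidy, -1, -1)
    }

diploid = genotype_class_frequencies(0.8, 2)     # p^2, 2pq, q^2
triploid = genotype_class_frequencies(0.8, 3)    # p^3, 3p^2q, 3pq^2, q^3
tetraploid = genotype_class_frequencies(0.8, 4)  # five classes

for name, freqs in [("diploid", diploid), ("triploid", triploid), ("tetraploid", tetraploid)]:
    print(name, {k: round(f, 4) for k, f in freqs.items()}, "sum =", round(sum(freqs.values()), 6))
```

The coefficient on each term counts the orderings listed under "Potential offspring": for the triploid, ppq, pqp and qpp are the three ways to get two copies of p, hence 3p^2q.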

Hardy-Weinberg Theorem

  ALLELES
    Probability of A = p
    Probability of a = q
    p + q = 1

  GENOTYPES
    AA: p x p = p^2
    Aa: p x q + q x p = 2pq
    aa: q x q = q^2
    p^2 + 2pq + q^2 = 1

More General HW Equations

  One locus, three alleles:
    (p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr

  One locus, n alleles:
    (p_1 + p_2 + p_3 + p_4 + ... + p_n)^2 = p_1^2 + p_2^2 + p_3^2 + p_4^2 + ... + p_n^2 + 2p_1p_2 + 2p_1p_3 + 2p_2p_3 + 2p_1p_4 + ... + 2p_{n-1}p_n

  For a polyploid (more than two chromosome copies):
    (p + q)^c, where c = number of chromosome copies

  If multiple loci (genes) code for a trait, each locus follows the HW principle independently, and then the alleles at each locus interact to influence the trait.

Worked example

  ALLELE Frequencies
    Frequency of A = p = 0.8
    Frequency of a = q = 0.2
    p + q = 1

  Expected GENOTYPE Frequencies
    AA: p x p = p^2 = 0.8 x 0.8 = 0.64
    Aa: p x q + q x p = 2pq = 2 x (0.8 x 0.2) = 0.32
    aa: q x q = q^2 = 0.2 x 0.2 = 0.04
    p^2 + 2pq + q^2 = 0.64 + 0.32 + 0.04 = 1

  Expected Allele Frequencies at 2nd Generation
    p = AA + Aa/2 = 0.64 + (0.32/2) = 0.8
    q = aa + Aa/2 = 0.04 + (0.32/2) = 0.2
    Allele frequencies remain the same at the next generation.

Similar example, but with different starting allele frequencies
  [Table of p and q values not preserved in this transcription]
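The general n-allele expansion and the "allele frequencies remain the same" step can be checked together with a short sketch. The function names and the three-allele frequencies (0.5, 0.3, 0.2) are assumptions chosen for illustration; only the A = 0.8, a = 0.2 case comes from the slides.

```python
from itertools import combinations_with_replacement

def hw_genotype_frequencies(allele_freqs):
    """Expected diploid genotype frequencies under random mating.

    allele_freqs: dict of allele name -> frequency (should sum to 1).
    Homozygotes get p_i^2, heterozygotes get 2 * p_i * p_j.
    """
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        product = allele_freqs[a] * allele_freqs[b]
        genotypes[(a, b)] = product if a == b else 2 * product
    return genotypes

def allele_frequencies(genotype_freqs):
    """Recover allele frequencies: each genotype contributes half its
    frequency to each of its two alleles (so homozygotes count fully)."""
    freqs = {}
    for (a, b), f in genotype_freqs.items():
        freqs[a] = freqs.get(a, 0.0) + f / 2
        freqs[b] = freqs.get(b, 0.0) + f / 2
    return freqs

# Two-allele example from the slides.
two = hw_genotype_frequencies({"A": 0.8, "a": 0.2})
next_gen = allele_frequencies(two)

# Three-allele example with made-up frequencies.
three = hw_genotype_frequencies({"p": 0.5, "q": 0.3, "r": 0.2})

print(two)       # AA, Aa, aa
print(next_gen)  # allele frequencies one generation later
print(three)     # six genotype classes
```

Feeding `next_gen` back into `hw_genotype_frequencies` reproduces the same genotype frequencies, which is the equilibrium property in miniature.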

Calculating Allele Frequencies from # of Individuals
  The frequency of an allele in a population can be calculated from the number of individuals of each genotype:
  For diploid organisms, the total number of alleles at a locus is the total number of individuals x 2.
  The total number of dominant alleles at a locus is 2 alleles for each homozygous dominant individual plus 1 allele for each heterozygous individual; the same logic applies for recessive alleles.

Calculating Allele and Genotype Frequencies from # of Individuals

  # of Individuals: AA = 120, Aa = 60, aa = 35 (total = 215 individuals, 430 alleles)

  #A = (2 x AA) + Aa = (2 x 120) + 60 = 300
  #a = (2 x aa) + Aa = (2 x 35) + 60 = 130
  Proportion A = 300/total = 300/430 = 0.70
  Proportion a = 130/total = 130/430 = 0.30
  A + a = 0.70 + 0.30 = 1

  Proportion AA = 120/215 = 0.56
  Proportion Aa = 60/215 = 0.28
  Proportion aa = 35/215 = 0.16
  AA + Aa + aa = 0.56 + 0.28 + 0.16 = 1

Applying the Hardy-Weinberg Principle
  Example: estimate the frequency of a disease allele in a population.
  Phenylketonuria (PKU) is a metabolic disorder that results from homozygosity for a recessive allele.
  Individuals that are homozygous for the deleterious recessive allele cannot break down phenylalanine, which builds up and causes intellectual disability.
  The occurrence of PKU is 1 per 10,000 births.
  How many carriers of this disease are in the population?

  Rare deleterious recessives often remain in a population because they are hidden in the heterozygous state (the "carriers").
  Natural selection can only act on the homozygous individuals where the phenotype is exposed (individuals who show symptoms of PKU).

  We can assume HW equilibrium if:
    There is no migration from a population with a different allele frequency
    Mating is random
    There is no genetic drift
    Etc.
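The counting rule above can be written as a small function. This is a sketch using the slide's counts (120 AA, 60 Aa, 35 aa); the function name and the returned dictionary keys are hypothetical.

```python
def frequencies_from_counts(n_AA, n_Aa, n_aa):
    """Allele and genotype frequencies from diploid genotype counts."""
    individuals = n_AA + n_Aa + n_aa
    total_alleles = 2 * individuals   # two alleles per diploid individual
    count_A = 2 * n_AA + n_Aa         # homozygotes carry two copies, heterozygotes one
    count_a = 2 * n_aa + n_Aa
    return {
        "count_A": count_A,
        "count_a": count_a,
        "p": count_A / total_alleles,
        "q": count_a / total_alleles,
        "AA": n_AA / individuals,
        "Aa": n_Aa / individuals,
        "aa": n_aa / individuals,
    }

result = frequencies_from_counts(120, 60, 35)
for key, value in result.items():
    print(key, round(value, 2))
```

Note that the observed genotype proportions (0.56, 0.28, 0.16) are not the same as the HW expectations from p = 0.70 and q = 0.30 (0.49, 0.42, 0.09); whether that gap is meaningful is what the chi-square test later in the lecture is for.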

So, let's calculate HW frequencies
  The occurrence of PKU is 1 per 10,000 births. This is the frequency of the homozygous recessive genotype, q^2:
    q^2 = 0.0001
    q = sqrt(q^2) = sqrt(0.0001) = 0.01
  The frequency of normal alleles is:
    p = 1 - q = 1 - 0.01 = 0.99
  The frequency of carriers (heterozygotes) of the deleterious allele is:
    2pq = 2 x 0.99 x 0.01 = 0.0198
  or approximately 2% of the U.S. population.

Conditions for Hardy-Weinberg Equilibrium
  The Hardy-Weinberg theorem describes a hypothetical population.
  The five conditions for nonevolving populations are rarely met in nature:
    No mutations
    Random mating
    No natural selection
    Extremely large population size
    No gene flow
  So, in real populations, allele and genotype frequencies do change over time.

DEVIATION from Hardy-Weinberg Equilibrium indicates that EVOLUTION is happening.

Hardy-Weinberg across a Genome
  In natural populations, some loci might be out of HW equilibrium while other loci are in HW equilibrium.
  For example, some loci might be undergoing natural selection and go out of HW equilibrium, while the rest of the genome remains in HW equilibrium.

Allele Demo

How can you tell whether a population is out of HW Equilibrium?
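The PKU arithmetic can be reproduced in a few lines. This is a sketch under the same assumption the slide makes, namely that the population is in HW equilibrium at this locus so that disease incidence equals q^2; the function name is made up for illustration.

```python
from math import sqrt

def carrier_frequency(incidence):
    """Estimate allele and carrier frequencies for a recessive disorder.

    incidence: frequency of affected (homozygous recessive) births, taken as q^2.
    Assumes HW equilibrium; returns (p, q, carriers) where carriers = 2pq.
    """
    q = sqrt(incidence)
    p = 1.0 - q
    return p, q, 2.0 * p * q

p, q, carriers = carrier_frequency(1 / 10_000)
print(f"q = {q:.4f}, p = {p:.4f}, carriers = {carriers:.4f}")
```

The carrier frequency (about 0.0198) is roughly 200 times the disease incidence (0.0001), which is why rare recessive alleles are mostly "hidden" in heterozygotes.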

Perform HW calculations to see if it looks like the population is out of HW equilibrium.
Then apply statistical tests to see if the deviation is significantly different from what you would expect by random chance.

Example: Does this population remain in Hardy-Weinberg Equilibrium across Generations?
  [Table of values for Generations 1-6 not preserved in this transcription]
  In this case, allele frequencies (of A and a) did not change.
  ***However, the population did go out of HW equilibrium because you can no longer predict genotype frequencies from allele frequencies.
  For example, p = 0.5 gives an expected AA frequency of p^2 = 0.25, but in Generation 3 the observed AA frequency is 0.10.

How can you tell whether a population is out of HW Equilibrium?
  1. When allele frequencies are changing across generations
  2. When you cannot predict genotype frequencies from allele frequencies (meaning there is an excess or deficit of some genotypes relative to what would be expected given the allele frequencies)

Testing for Deviation from Hardy-Weinberg Expectations
  A χ^2 goodness-of-fit test can be used to determine whether a population is significantly different from the expectations of Hardy-Weinberg equilibrium.
  If we have a series of genotype counts from a population, then we can compare these counts to the ones predicted by the Hardy-Weinberg model:

    χ^2 = sum over genotypes of (O - E)^2 / E

  where O = observed counts and E = expected counts.

  Example genotype counts: AA 30, Aa 55, aa 15

  Calculate the χ^2 value:
    [Table with columns Genotype, Observed, Expected, (O-E)^2/E and rows AA, Aa, aa, Total; cell values not preserved in this transcription]

  Since χ^2 = 1.50 < 3.841 (from the Chi-square table, alpha = 0.05, 1 degree of freedom), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population is in Hardy-Weinberg equilibrium.
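Because the expected counts in the table did not survive, it may help to recompute them from the observed counts (AA 30, Aa 55, aa 15). The sketch below is an illustration, not the lecturer's worksheet: the function name is hypothetical, and expected counts are left unrounded, which gives a statistic near 1.57 rather than the 1.50 quoted above (rounding the expected counts to 33, 49 and 18 gives a value close to 1.50).

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit statistic against HW expectations.

    Allele frequencies are estimated from the observed counts, then
    expected counts are p^2 * n, 2pq * n and q^2 * n.
    Returns (expected_counts, chi2).
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2

expected, chi2 = hw_chi_square(30, 55, 15)
print("expected counts:", [round(e, 2) for e in expected])
print("chi-square:", round(chi2, 2))
```

Either way the statistic is well below 3.841, so the conclusion stated above is unchanged.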

Testing for Deviation from Hardy-Weinberg Expectations
  χ^2 = sum over genotypes of (O - E)^2 / E, where O = observed counts and E = expected counts.
  We test our χ^2 value against the Chi-square distribution (the distribution of a sum of squared standard normal variables), which represents the theoretical distribution of sample values under HW equilibrium. Values far out in the tail are less likely to arise by chance.
  We then determine how likely it is to get our result simply by chance (e.g. due to sampling error); i.e., do our Observed values differ from our Expected values more than we would expect by chance (= significantly different)?

Test for Deviation from HW equilibrium
  Genotype counts, Generation 4: AA 65, Aa 31, aa 4

  Calculate the χ^2 value:
    [Table with columns Genotype, Observed, Expected, (O-E)^2/E and rows AA, Aa, aa, Total; cell values not preserved in this transcription]

  Since the χ^2 value is less than the critical value (from the Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.

Procedure
  The chi-squared distribution is used because it is the distribution of a sum of squared standard normal variables.
    Calculate the Chi-squared test statistic
    Figure out the degrees of freedom
    Select a significance level (alpha, the P-value threshold)
    Compare your Chi-squared value to the theoretical distribution (from the table), and reject or fail to reject the null hypothesis.
  If the test statistic is greater than the critical value, the null hypothesis (H0 = there is no difference between the distributions) can be rejected at the selected significance level, and the alternative hypothesis (H1 = there is a difference between the distributions) can be accepted.
  If the test statistic is less than the critical value, the null hypothesis cannot be rejected.

Test for Significance of Deviation from HW Equilibrium
  Degrees of Freedom = n - 1 = 2 alleles (p, q) - 1 = 1

Testing for significance
  The results come out not significantly different from HW equilibrium.
  This does not necessarily mean that genetic drift is not happening, only that we cannot conclude that genetic drift is happening.
  Either we do not have enough power (not enough data, small sample size), or genetic drift is not happening.
  Sometimes it is difficult to test whether evolution is happening, even when it is happening...
  The signal needs to be sufficiently large to be sure that you can't get the results by chance (e.g. by sampling error).
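The decision rule can be sketched without a printed table. For 1 degree of freedom, the upper-tail probability of the chi-square distribution has a closed form in terms of the complementary error function, which Python's standard library provides. The function names are hypothetical, and the shortcut below applies only to the 1-degree-of-freedom case used in this lecture; other degrees of freedom would need a statistics library or a table.

```python
from math import erfc, sqrt

# Standard table value for alpha = 0.05 with 1 degree of freedom.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841

def chi_square_p_value_df1(chi2):
    """P(X >= chi2) for a chi-square variable with 1 degree of freedom.

    With 1 df, X is the square of a standard normal Z, so
    P(Z^2 >= x) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(chi2 / 2.0))

def decide(chi2, alpha=0.05):
    """Return the outcome of the test at significance level alpha."""
    p_value = chi_square_p_value_df1(chi2)
    return "reject H0" if p_value < alpha else "fail to reject H0"

for statistic in (1.50, 3.841, 15.8):
    print(statistic, round(chi_square_p_value_df1(statistic), 4), decide(statistic))
```

Comparing the P-value with alpha is equivalent to comparing the statistic with the critical value: a statistic of exactly 3.841 gives a P-value of about 0.05.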

Test for Deviation from HW equilibrium
  Genotype counts, Generation 4, with increased sample size (total 100,000): aa = 4000 (the AA and Aa counts were not preserved in this transcription)

  Calculate the χ^2 value:
    [Table with columns Genotype, Observed, Expected, (O-E)^2/E and rows AA, Aa, aa, Total; cell values not preserved in this transcription]

  Since the χ^2 value is now greater than the critical value (from the Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population ARE significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.

Test for Significance of Deviation from HW Equilibrium
  Degrees of Freedom = n - 1 = 2 alleles (p, q) - 1 = 1

One generation of Random Mating could put a population back into Hardy-Weinberg Equilibrium.

What would Genetic Drift look like?
  Most populations are experiencing some level of genetic drift, unless they are incredibly large.
  [Table of values for Generations 1-4 not preserved in this transcription]
  Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
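The effect of sample size can be illustrated by testing the same genotype proportions at two sample sizes. The small sample (AA 65, Aa 31, aa 4) is the Generation 4 example from the lecture. The large sample is an assumption: the counts are scaled by 1000 (65,000 / 31,000 / 4,000), which is consistent with the surviving total of 100,000 and aa = 4000 but is not confirmed by the transcription. The function name is hypothetical.

```python
def hw_chi_square_statistic(n_AA, n_Aa, n_aa):
    """Chi-square statistic for deviation from HW expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_VALUE = 3.841  # alpha = 0.05, 1 degree of freedom

small = hw_chi_square_statistic(65, 31, 4)
large = hw_chi_square_statistic(65_000, 31_000, 4_000)  # assumed counts, see lead-in

print("n = 100:     chi-square =", round(small, 4), "significant:", small > CRITICAL_VALUE)
print("n = 100,000: chi-square =", round(large, 2), "significant:", large > CRITICAL_VALUE)
```

With proportions held fixed, the statistic grows in direct proportion to the sample size, so a deviation too small to detect in 100 individuals becomes clearly significant in 100,000. This is the "power" point made on the preceding slide.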

[Table of values for Generations 1-4 not preserved in this transcription]
  This is a case of Genetic Drift, where allele frequencies are fluctuating randomly across generations.

Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
  Here this appears to be Directional Selection favoring AA, or Negative Selection disfavoring aa.

Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
  This appears to be a case of Heterozygote Advantage (or Overdominance).

Is this population in HW equilibrium? If not, how does it deviate? What could be the reason?
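Since the drift tables were lost, a small simulation can show what "fluctuating randomly across generations" looks like. The lecture does not name a specific model; the sketch below assumes a simple Wright-Fisher style scheme (each generation's 2N allele copies are drawn at random from the previous generation's allele frequency), and the function name, population size and seed are all made up for illustration.

```python
import random

def simulate_drift(p0, population_size, generations, seed=None):
    """Track the frequency of allele A under random sampling alone.

    Each generation, 2 * population_size allele copies are drawn
    independently, each being A with probability equal to the current
    frequency. No selection, mutation or migration is modeled.
    """
    rng = random.Random(seed)
    n_alleles = 2 * population_size
    p = p0
    freqs = [p]
    for _ in range(generations):
        count_A = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = count_A / n_alleles
        freqs.append(p)
    return freqs

small_pop = simulate_drift(0.5, population_size=10, generations=20, seed=1)
large_pop = simulate_drift(0.5, population_size=10_000, generations=20, seed=1)

print("N = 10:    ", [round(f, 2) for f in small_pop])
print("N = 10,000:", [round(f, 3) for f in large_pop])
```

Running this with different seeds typically shows much larger swings in the small population than in the large one, in line with the statement that most populations experience some drift unless they are incredibly large. Once the frequency reaches 0 or 1 it stays there, because no copies of the lost allele remain to be sampled.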

  Selection appears to be favoring aa.

Summary
  (1) A nonevolving population is in HW Equilibrium
  (2) Evolution occurs when the requirements for HW Equilibrium are not met
  (3) HW Equilibrium is violated when there is Genetic Drift, Migration, Mutation, Natural Selection, or Nonrandom Mating

[Figure: random union of gametes, with 80% C^R (p = 0.8) and 20% C^W (q = 0.2)]

                      Sperm C^R (80%)        Sperm C^W (20%)
  Eggs C^R (80%)      64% (p^2) C^R C^R      16% (pq) C^R C^W
  Eggs C^W (20%)      16% (qp) C^R C^W       4% (q^2) C^W C^W

  Result: 64% C^R C^R, 32% C^R C^W, and 4% C^W C^W

Perform the same calculations using percentages
  Gametes of this generation:
    64% C^R + 16% C^R = 80% C^R = 0.8 = p
    4% C^W + 16% C^W = 20% C^W = 0.2 = q
  Genotypes in the next generation:
    64% C^R C^R, 32% C^R C^W, and 4% C^W C^W plants

Questions

1. Nabila is a Saudi Princess who is arranged to marry her first cousin. Many in her family have died of a rare blood disease, which sometimes skips generations, and thus appears to be recessive. Nabila thinks that she is a carrier of this disease. If her fiancé is also a carrier, what is the probability that her offspring will have (be afflicted with) the disease?
  (A) 1/4
  (B) 1/3
  (C) 1/2
  (D) 3/4
  (E) zero

2. The following are numbers of pink and white flowers in a population.
  [Table of Pink and White counts for Generations 1-3 not preserved in this transcription]
  Which of the following is most likely to be TRUE?
  (A) The heterozygotes are probably pink
  (B) The recessive allele here (probably white) is clearly deleterious
  (C) Evolution is occurring, as allele frequencies are changing greatly over time
  (D) Clearly there is a heterozygote advantage
  (E) The frequencies above violate Hardy-Weinberg expectations

3. The following are numbers of purple and white peas in a population.
  [Table of counts for Purple (A1A1), Purple (A1A2) and White (A2A2) for Generations 1-3 not preserved in this transcription]
  What are the genotype frequencies at each generation?
  (A) Generation 1: 0.30, 0.50, 0.20; Generation 2: 0.20, 0.40, 0.40; Generation 3: 0, 0.333, [value not preserved]
  (B) Generation 1: 0.36, 0.48, 0.16; Generation 2: 0.10, 0.20, 0.20; Generation 3: 0, 0.10, 0.30
  (C) Generation 1: 0.36, 0.48, 0.16; Generation 2: 0.20, 0.40, 0.40; Generation 3: 0, 0.25, 0.75
  (D) Generation 1: 0.36, 0.48, 0.16; Generation 2: 0.36, 0.48, 0.16; Generation 3: 0.36, 0.48, 0.16

4. From the example in question 3, what are the frequencies of alleles at each generation?
  (A) Generation 1: Dominant allele (A1) = 0.6, Recessive allele (A2) = 0.4; Generation 2: Dominant allele = 0.4, Recessive allele = 0.6; Generation 3: Dominant allele = 0.125, Recessive allele = 0.875
  (B) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4; Generation 2: Dominant allele = 0.6, Recessive allele = 0.4; Generation 3: Dominant allele = 0.6, Recessive allele = 0.4
  (C) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4; Generation 2: Dominant allele = 0.5, Recessive allele = 0.5; Generation 3: Dominant allele = 0.25, Recessive allele = 0.75
  (D) Generation 1: Dominant allele = 0.4, Recessive allele = 0.6; Generation 2: Dominant allele = 0.5, Recessive allele = 0.5; Generation 3: Dominant allele = 0.25, Recessive allele = 0.75

5. From the example in question 3, which evolutionary mechanism might be operating across generations?
  (A) Mutation
  (B) Selection favoring A1
  (C) Heterozygote advantage
  (D) Selection favoring A2
  (E) Inbreeding

Answers:
  1. A (Parents: Aa x Aa; Offspring: AA (25%), Aa (50%), aa (25%))
  2. A
  3. C
  4. A
  5. D
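The reasoning behind answer 1 can be checked by enumerating the four equally likely gamete combinations of a carrier x carrier cross. This is a sketch with a hypothetical `cross` helper; it assumes a single locus with two alleles, written A (normal) and a (recessive disease allele).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Offspring genotype probabilities for a single-locus cross.

    Each parent is a two-character genotype string such as "Aa";
    each of its two alleles is transmitted with probability 1/2.
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that "aA" and "Aa" are counted as the same genotype.
        genotype = "".join(sorted(allele1 + allele2))
        offspring[genotype] += 1
    total = sum(offspring.values())
    return {g: Fraction(count, total) for g, count in offspring.items()}

result = cross("Aa", "Aa")
for genotype, probability in sorted(result.items()):
    print(genotype, probability)
```

Only the aa offspring are affected, which gives the 1/4 in answer (A); half of the offspring are expected to be carriers like their parents.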