Mathematical Population Genetics


(Hardy-Weinberg, Selection, Drift and Linkage Disequilibrium)
Chiara Sabatti, Human Genetics
5554B Gonda
csabatti@mednet.ucla.edu

Populations

Which predictions are of interest, and how we are going to test them, depends on the available data. There are many data types in genetics. Two are classically studied:
- genotypes of individuals organized in families
- genotypes of individuals belonging to a human population

Today we will focus on DEDUCING specific predictions from general hypotheses. We will consider hypotheses and predictions that have to do with POPULATION-type data. In the rest of the week: inference for population data, and prediction and inference for family data.

Mathematical Language

Scientific Method:
1. formulate general hypotheses, theories
2. draw specific predictions from the hypotheses by means of logical deduction
3. test the predictions against data.
(Popper, The Logic of Scientific Discovery)

Mathematical language:
1. mathematics helps in making specific deductions from general theories
2. statistics helps in testing the deductions against data.

[Diagram: General Principles lead to Specific Predictions by Deduction (Mathematics); testing and revising the hypotheses with data leads back to General Principles by Induction (Statistics).]

Mathematical Population Genetics

IN THE BEGINNING
R.A. Fisher (1890-1962), S. Wright (1889-1988), Haldane (1892-1964).
Mendel (1822-1884) vs Darwin (1809-1882)

"The most serious difficulty to intellectual co-operation [between mathematicians and biologists] would seem to be removed if it were clearly and universally recognized that the essential difference lies, not in intellectual methods, and still less in intellectual ability, but in an enormous and specialized extension of the imaginative faculty, which each has experienced in relation to the needs of his special subject" (Fisher, The Genetical Theory of Natural Selection)

NOW
- genetic epidemiology (Rotter, Cohn)
- linkage disequilibrium mapping (Peltonen)
- what is special about isolates (Peltonen)
- what is the history of human populations (Lusis)

Topics and Bibliography

Four examples:
- Hardy-Weinberg
- Selection
- Drift
- Linkage Disequilibrium

References:
- Cavalli-Sforza, Bodmer (1971) The Genetics of Human Populations. Dover
- Li (1976) First Course in Population Genetics. The Boxwood Press
- Ewens (1979) Mathematical Population Genetics. Springer-Verlag

Hardy-Weinberg

Hypotheses:
- infinite populations
- no inbreeding
- random mating
- no mutation
- no migration
- no selection
- no overlapping generations.

Prediction: if the parental genotypes AA, Aa, aa have frequencies h, d, r, the allele frequencies are

$p = h + \tfrac{1}{2}d$ (allele A), $q = r + \tfrac{1}{2}d$ (allele a),

and the genotype frequencies among the offspring are $p^2$ (AA), $pq + qp = 2pq$ (Aa), $q^2$ (aa).

Type of mating   Frequency of mating   Offspring AA   Offspring Aa   Offspring aa
AA x AA          h^2                   1              0              0
AA x Aa          2hd                   1/2            1/2            0
AA x aa          2hr                   0              1              0
Aa x Aa          d^2                   1/4            1/2            1/4
Aa x aa          2dr                   0              1/2            1/2
aa x aa          r^2                   0              0              1
Total            1                     p^2            2pq            q^2

(The offspring columns give the fraction of each genotype among the offspring of that mating; the totals are weighted by the frequency of the mating.)

Consequences:
- variation is maintained
- equilibrium is achieved in one generation.
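The one-generation claim can be checked numerically. The sketch below is only an illustration, not part of the original slides: it sums offspring genotypes over the six mating types under random mating, and the function names `allele_freqs` and `next_generation` are hypothetical.

```python
def allele_freqs(h, d, r):
    """Allele frequencies (p for A, q for a) from the genotype
    frequencies h (AA), d (Aa), r (aa)."""
    return h + d / 2, r + d / 2


def next_generation(h, d, r):
    """Genotype frequencies (AA, Aa, aa) after one round of random mating.

    Each entry pairs the frequency of a mating type with the Mendelian
    fractions of AA, Aa and aa offspring that the mating produces.
    """
    matings = [
        (h * h,     (1.0, 0.0, 0.0)),     # AA x AA
        (2 * h * d, (0.5, 0.5, 0.0)),     # AA x Aa
        (2 * h * r, (0.0, 1.0, 0.0)),     # AA x aa
        (d * d,     (0.25, 0.5, 0.25)),   # Aa x Aa
        (2 * d * r, (0.0, 0.5, 0.5)),     # Aa x aa
        (r * r,     (0.0, 0.0, 1.0)),     # aa x aa
    ]
    offspring = [0.0, 0.0, 0.0]
    for freq, fractions in matings:
        for i, fraction in enumerate(fractions):
            offspring[i] += freq * fraction
    return tuple(offspring)
```

Starting from h = 0.5, d = 0.2, r = 0.3 (so p = 0.6, q = 0.4), one call to `next_generation` gives approximately (0.36, 0.48, 0.16), that is (p^2, 2pq, q^2), and a second call leaves these frequencies unchanged up to rounding.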

Selection

Hypotheses:
- infinite populations
- no inbreeding
- random mating
- no mutation
- no migration
- selection acting on one gene alone

Statement:
- two alleles, A and a, with frequencies p and q at the current generation
- each genotype has a fitness

Genotype                      AA        Aa        aa        Total
Freq. before selection        p^2       2pq       q^2       1
Relative fitness              one value per genotype
Freq. after selection (FaS)   FaS(AA)   FaS(Aa)   FaS(aa)   1

Each frequency after selection is the frequency before selection multiplied by the relative fitness of the genotype, divided by the sum of these products so that the total is 1.

Frequency of gametes after selection:
- A: $\mathrm{FaS}(AA) + \tfrac{1}{2}\,\mathrm{FaS}(Aa)$
- a: $\mathrm{FaS}(aa) + \tfrac{1}{2}\,\mathrm{FaS}(Aa)$

The frequency of an allele at time t+1 is a function of the fitness parameters and of the frequency at the current generation.

Examples

Effect of different selection pressures, with a the disease allele and A the wild type: each scheme assigns a fitness to the genotypes AA, Aa, aa.
- Dominant
- Recessive
- Heterozygous advantage

If the disease allele is dominant and lethal, the mutation disappears immediately: all the dominant, fully penetrant, lethal, early-onset disorders are due to new mutations.

In general, if there is a constant frequency of one allele on which selection acts, there must be an equilibrium between mutation and selection: the change in frequency due to mutation balances the change due to selection. From the equation above and some approximation you get formulas like the one in your book.
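The selection recursion can be sketched in a few lines. This is an illustration under stated assumptions rather than the slides' own notation: the fitness names `w_AA`, `w_Aa`, `w_aa` and the function names are hypothetical, genotypes before selection are taken to be in Hardy-Weinberg proportions, and the fitness values used in the examples are made up.

```python
def select_one_generation(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A in the gametes after one round of selection.

    p is the current frequency of A; w_AA, w_Aa, w_aa are the relative
    fitnesses of the three genotypes (hypothetical names).
    """
    q = 1 - p
    # Sum of (frequency before selection) x (relative fitness).
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    fas_AA = p * p * w_AA / mean_w      # freq. after selection of AA
    fas_Aa = 2 * p * q * w_Aa / mean_w  # freq. after selection of Aa
    # Every AA gamete is A; half of the Aa gametes are A.
    return fas_AA + fas_Aa / 2


def iterate_selection(p, w_AA, w_Aa, w_aa, generations):
    """Apply the one-generation recursion repeatedly and return the final p."""
    for _ in range(generations):
        p = select_one_generation(p, w_AA, w_Aa, w_aa)
    return p
```

With a dominant lethal disease allele a (fitnesses 1, 0, 0 for AA, Aa, aa), a single generation takes the frequency of A to 1, which is the "mutation disappears immediately" case. With heterozygous advantage (for instance 0.8, 1, 0.8), repeated iteration moves p toward an intermediate equilibrium instead of fixing either allele.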

Drift: finite populations

Premises:
- no inbreeding
- random mating
- no mutation
- no migration
- no selection
- population of N individuals.

Alleles at generation t+1 are a sample with replacement of the alleles at generation t.

Statement: the frequency of alleles at each generation is random (as the number of males among the children of one family is random).

For N diploid individuals (2N alleles) and current frequency $p_t$, the number of copies X of the allele at the next generation is binomial:

$\mathrm{Prob}(X = k) = \binom{2N}{k}\, p_t^{k} (1 - p_t)^{2N - k}$, with $p_{t+1} = X / 2N$ and $\mathrm{Var}(p_{t+1} \mid p_t) = \dfrac{p_t (1 - p_t)}{2N}$.

[Figure: distribution of the frequency for N = 10, p = .5. Axes: frequency at next generation; probability of the frequency.]

[Figures: effects of drift over time, population size = 10 and population size = 50. Axis: frequency.]

- 0 and 1 are special values: if the frequency ever becomes either 0 or 1, it cannot change from those values any more.
- If N is very big, you do not see much effect of randomness.
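The sampling scheme above can be simulated directly. The sketch below is a minimal illustration, assuming N diploid individuals carry 2N alleles and that each allele of the next generation is drawn independently with replacement; the function names and the seed argument are hypothetical.

```python
import math
import random


def next_freq_pmf(N, p):
    """Binomial probabilities of seeing k = 0..2N copies of the allele
    in the next generation, given current frequency p."""
    n = 2 * N
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def simulate_drift(N, p0, generations, seed=0):
    """One random trajectory of the allele frequency under pure drift."""
    rng = random.Random(seed)
    n = 2 * N
    p = p0
    path = [p]
    for _ in range(generations):
        # Draw 2N alleles with replacement; each is a copy with probability p.
        copies = sum(rng.random() < p for _ in range(n))
        p = copies / n
        path.append(p)
    return path
```

Running `simulate_drift` with N = 10 and with N = 50 from the same starting frequency gives trajectories like the ones in the two figures: the smaller population wanders further per generation and tends to hit 0 or 1 sooner, after which the trajectory stays there.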

Gametic phase equilibrium

Premises:
- no inbreeding
- random mating
- no mutation
- no migration
- no selection
- no linkage between the two loci, ...

Statement: given the distribution of the alleles at Marker 1 (A1, ...) and at Marker 2 (B3, ...), the frequency of a haplotype is the product of the frequencies of its alleles:

$\mathrm{Fr}(A_1 B_3) = \mathrm{Fr}(A_1)\,\mathrm{Fr}(B_3)$

What if the loci are linked?

The family scale (Meiosis)

[Figure: two chromosomes, a cross-over, and the resulting cross-over gametes.]

- The distance between loci influences the probability of cross-over (map functions are mathematical models for this).
- The closer the loci, the less independent the alleles we inherit from one parent.

The population scale

- The chromosomes of one generation are either chromosomes of the previous one or recombined versions of them.
- Two alleles at each locus: A, a and B, b.
- Recombination fraction $\theta$ between locus 1 and locus 2.
- Haplotype distribution at generation t over AB, Ab, aB, ab, with disequilibrium $D_t = \mathrm{Fr}(AB) - \mathrm{Fr}(A)\,\mathrm{Fr}(B)$.
- At the next generation: $D_{t+1} = (1 - \theta)\, D_t$.
- If equilibrium: $D = 0$.

Disequilibrium over time

- Equilibrium is not achieved in one generation.
- Disequilibrium between two markers decreases with the number of generations.
- Disequilibrium between two markers decreases faster the higher the recombination between the markers.
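The decay of disequilibrium can be traced by updating the four haplotype frequencies one generation at a time. This is a sketch under the usual two-locus assumptions (random mating, no selection, recombination fraction `theta`), with D defined as Fr(AB) - Fr(A)Fr(B); the dictionary layout and function names are hypothetical.

```python
def disequilibrium(hap):
    """D = Fr(AB) - Fr(A) * Fr(B) for haplotype frequencies stored in a
    dict with keys 'AB', 'Ab', 'aB', 'ab'."""
    fr_A = hap["AB"] + hap["Ab"]
    fr_B = hap["AB"] + hap["aB"]
    return hap["AB"] - fr_A * fr_B


def next_haplotypes(hap, theta):
    """Haplotype frequencies after one generation of random mating.

    A fraction theta of the gametes is recombinant, which moves the
    coupling haplotypes (AB, ab) toward their equilibrium values.
    """
    D = disequilibrium(hap)
    return {
        "AB": hap["AB"] - theta * D,
        "Ab": hap["Ab"] + theta * D,
        "aB": hap["aB"] + theta * D,
        "ab": hap["ab"] - theta * D,
    }


def disequilibrium_after(hap, theta, generations):
    """D after the given number of generations."""
    for _ in range(generations):
        hap = next_haplotypes(hap, theta)
    return disequilibrium(hap)
```

Allele frequencies are untouched by the update, so D shrinks by the factor (1 - theta) each generation. Even for unlinked loci (theta = 0.5) D is only halved per generation, which is why equilibrium is not reached in one step.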

LD and Disease Genes in Founder Populations

Suppose that in a population of chromosomes, 1 undergoes a mutation in a gene that causes a disease. The chromosome carries the disease mutation D within a haplotype of marker alleles:

D4 B3 A1 D L3 M2 R1 T2

- Initially all the chromosomes that inherit the disease inherit this haplotype.
- Recombination and mutation will erode the haplotype.
- The markers closer to the disease locus will be in linkage disequilibrium with the disease.
- The disequilibrium will decrease as the distance between disease gene and marker increases and as the number of generations from the founder increases.
- This will translate into association between alleles at a marker and disease status even if the marker is not the gene (linkage disequilibrium mapping, as opposed to candidate genes).

[Figure: decay of disequilibrium as a function of distance between markers and generations. Axes: distance between markers in Morgans; disequilibrium as D. Curves for t = 0, 10, 20, 100, 200.]

Can we make a better prediction?

So far: D. It is
- difficult to extend to multiple alleles
- difficult to measure (measures of D depend on the frequency of the alleles at each marker)
- difficult to consider multiple markers at the same time.

Hypothesis: the probability of recombination between two markers depends on the distance between them through an exponential function of that distance, and it is independent of what happens outside the interval between the markers.

Prediction: a distribution for the length of the conserved haplotype around a disease locus after K generations, in the presence of founder effects.

Distribution of the length of the conserved haplotype

[Figure: density function of the length of the conserved haplotype in Morgans, for different values of K.]

This is the distribution for a chromosome chosen at random among the ones that carry the disease.
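One simple model that produces a prediction of this kind is sketched below. It is an assumption made for illustration, not necessarily the exact model on the slide: cross-overs are taken to fall as a Poisson process with rate 1 per Morgan per meiosis, independently on the two sides of the disease locus, and mutation and chromosome ends are ignored. Under these assumptions the conserved stretch on each side after K generations is exponential with rate K, so the total length has density K^2 x e^{-Kx}, mean 2/K and variance 2/K^2. The function names are hypothetical.

```python
import math
import random


def haplotype_length_density(x, K):
    """Density of the total conserved length x (in Morgans) after K
    generations, under the assumed Poisson cross-over model."""
    return K * K * x * math.exp(-K * x)


def simulate_lengths(K, n, seed=0):
    """Draw n conserved-haplotype lengths: the distance to the first
    cross-over on the left plus the distance to the first on the right,
    each exponential with rate K."""
    rng = random.Random(seed)
    return [rng.expovariate(K) + rng.expovariate(K) for _ in range(n)]


def mean_and_variance(values):
    """Sample mean and (population) variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v
```

For K = 20 generations the model gives an expected conserved length of 2/20 = 0.1 Morgans; comparing `mean_and_variance(simulate_lengths(20, 20000))` with 2/K and 2/K^2 is a quick consistency check of the simulation against the density.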

Expected value and variance of the conserved haplotype

[Figure: length of the conserved haplotype. Axis: length in Morgans.]

Science, Mathematics and Imagination

"Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world." (Einstein)

"The ordinary mathematical procedure in dealing with any actual problem is, after abstracting what are believed to be the essential elements of the problem, to consider it as one of a system of possibilities infinitely wider than the actual, the essential relations of which may be apprehended by generalized reasoning, and subsumed in general formulae" (Fisher)

Hilbert, being told that one of his students had left mathematics to become a poet: "I always knew that he had not enough imagination."