Midterm exam BIOSCI 113/244 WINTER QUARTER,


Name:

Instructions:

A) The due date is Monday, 02/13/06 before 10AM. Please drop the exams off at my office (Herrin Labs, room 352B). I will have a box set up for collecting the exams. Note that we will not accept late exams. The answers to the exam will be posted on the web site at some point after 10AM on 02/13/06. We will do our best to correct the exam by the lecture time on 02/15/06.

B) Please staple your answers to your exam copy. Don't forget to put your name on each sheet of the answers in case the stapling proves unreliable. PLEASE MARK CLEARLY YOUR FINAL ANSWER.

C) You can use any book you wish while doing the exam. Please do not consult with each other.

D) Please show your work for each problem. This can help us give you partial credit if the answer is not exactly correct.

Problem   Maximum   Actual      Problem   Maximum   Actual
1         23                    6         23
2         23                    7a        7
3a        7                     7b        7
3b        7                     7c        7
3c        7                     7d        7
4         20                    8         23
5a        8                     9         23
5b        8                     TOTAL     200

I have abided by the Stanford Honor Code:

Signature

1) (23 pts) Consider a single nucleotide site that contains the nucleotide A in every individual of a population of a species of mammals. Assume the Jukes-Cantor model with the mutation rate between any two nucleotides being 10^-9 per year. What is the probability that this nucleotide site will be a G or a C 900 million years later? Assume that the time it takes to fix a mutation is very short relative to the evolutionary time under investigation. Assume also that at this position all nucleotides are selectively equivalent to each other.

2) (23 pts) At a particular locus G:C pairs are slightly preferable to A:T pairs, such that the probability of fixation of a new G:C mutation at a position where all preexisting alleles are A:T is 1.2-fold higher than that of a new A:T allele where all of the preexisting alleles are G:C. Assume that G:C pairs mutate to A:T pairs twice as frequently as A:T pairs mutate to G:C pairs. Calculate the expected proportion of G:C pairs at this locus at equilibrium.

3) After a particularly vicious storm hits a small island off the Atlantic coast of the US, a single plant (species Arabidopsis thaliana) is left standing. Fortunately this plant is hermaphroditic and after selfing generates 2 plants in the next generation (F1 plants). The original population had two distinct functional alleles at a gene controlling flowering time. The early allele had a frequency of 0.1 prior to the storm. Plants homozygous for the early allele flower in March, while the heterozygotes and the homozygotes for the other allele (late) flower in June. Assume that the original population was in Hardy-Weinberg equilibrium and also assume the Wright-Fisher model for generating the two F1 plants.

a) (7 pts) Calculate the probability of the late allele being absent in the genomes of the two F1 plants.

b) (7 pts) Calculate the probability of the early allele being absent in the genomes of the two F1 plants.

c) (7 pts) Calculate the probability that both F1 plants will flower in June.

4) (20 pts) Imagine that uracil was the natural base in DNA and thymine was the natural base in RNA. Would you expect methylation of cytosines in CpG dinucleotides to still have an effect on the mutation rate? If yes, what kind of an effect? Explain.

5) In a diploid species of newly discovered animal, two alleles A and a segregate at a locus important for mating. For any two individuals, the more different they are from each other at this locus, the more likely they are to mate and to produce progeny. Specifically, two AA homozygotes never mate with each other at all. The same is true for two aa homozygotes and for two Aa heterozygotes. On the other hand, when AA homozygotes meet aa homozygotes they mate at a 2-fold higher rate than when either homozygote meets an Aa heterozygote.

a) (8 pts) A population of this animal starts off with 20% AA homozygotes and 50% Aa heterozygotes. What is the proportion of a alleles in the population at this time?

b) (8 pts) Calculate the proportion of each genotype among the offspring after one generation of mating.
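The setup of problem 5 can be sketched in Python as follows. This is only one possible reading, under assumptions the problem does not state: individuals meet at random in proportion to genotype frequencies, every mating yields the same number of offspring, and offspring follow Mendelian ratios. All function and variable names are illustrative.

```python
# Relative mating rates for unordered genotype pairs (from the problem text).
# Pairs of identical genotypes never mate, so they are simply omitted.
MATING_RATE = {
    ("AA", "aa"): 2.0,
    ("AA", "Aa"): 1.0,
    ("Aa", "aa"): 1.0,
}

# Mendelian offspring proportions for each mating pair.
OFFSPRING = {
    ("AA", "aa"): {"Aa": 1.0},
    ("AA", "Aa"): {"AA": 0.5, "Aa": 0.5},
    ("Aa", "aa"): {"Aa": 0.5, "aa": 0.5},
}


def allele_a_frequency(f_AA, f_Aa):
    """Frequency of allele a, given the frequencies of AA and Aa genotypes."""
    f_aa = 1.0 - f_AA - f_Aa
    return f_aa + 0.5 * f_Aa


def offspring_genotypes(f_AA, f_Aa):
    """Genotype proportions among offspring after one generation of mating.

    Assumption: random encounters, so an unordered pair of distinct
    genotypes g1, g2 meets with probability 2 * f(g1) * f(g2).
    """
    freq = {"AA": f_AA, "Aa": f_Aa, "aa": 1.0 - f_AA - f_Aa}
    weights = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
    for (g1, g2), rate in MATING_RATE.items():
        pair_weight = 2.0 * freq[g1] * freq[g2] * rate
        for child, share in OFFSPRING[(g1, g2)].items():
            weights[child] += pair_weight * share
    total = sum(weights.values())
    return {genotype: w / total for genotype, w in weights.items()}


if __name__ == "__main__":
    print(allele_a_frequency(0.20, 0.50))
    print(offspring_genotypes(0.20, 0.50))
```

Normalizing by the total weight at the end reflects that only successful matings contribute offspring; a different encounter model would change the weights but not the overall structure.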

6) (23 pts) A friend of yours asks for your help in determining whether a protein she's studying has been under selection. She has the full protein sequence for one individual in each of two different species. However, the company she hired to sequence the DNA loci coding for the respective proteins failed to send her the last 15 base pairs of each sequence, and subsequently went out of business. Based on the DNA alignment she does have, she finds that there were 189 nonsynonymous sites and 61 synonymous sites, between which there were 25 nonsynonymous changes and 4 synonymous changes. She shows you the part of the protein alignment corresponding to the missing DNA sequence:

Pro Val Pro Cys Gly
Pro Ala Thr Cys Ala

Calculate the maximum and minimum ratios of Ka/Ks possible for the protein overall. Based on these results, would you tell your friend that negative selection had been operating? Positive selection?

7) You catch Drosophila melanogaster flies and set up what is known as a population cage, which allows you to maintain the population indefinitely at constant size. After the adults mate and lay eggs for the next generation you kill them and determine the population frequency at a particular microsatellite locus. The flies turn out to have the following genotypes at this locus:

A/A A/B A/C B/B B/C C/C

a) (7 pts) What are the frequencies for each of the alleles?

b) (7 pts) What is the observed heterozygosity in this system?

c) (7 pts) What is the expected heterozygosity on the assumption that the D. melanogaster population and this locus conform to the assumptions of Hardy-Weinberg equilibrium?

d) (7 pts) Under the Wright-Fisher model, what is the expected heterozygosity after generations of random mating of these flies?

8) (23 pts) A whistleblower sends you a tissue sample of what he claims is dugong meat bought illegally in Vanuatu, but you have your doubts about the sample's authenticity. Fortunately, you possess a 3000 bp dugong DNA sequence from a locus known to be neutral. In addition, you possess a second 3000 bp DNA sequence orthologous to the dugong sequence, which came from a manatee, and which differs from the known dugong sequence at 281 sites. You know from previous work that the size of the current dugong population is ~100,000. You learn that manatees and dugongs diverged 50 million years ago, and that both species reproduce once every decade on average. If the informant's sample truly came from a dugong, how many differences should you expect between your trusted dugong sequence and the orthologous sequence from the tissue sample?

9) (23 pts) You sequence a very long stretch of DNA from three different squirrel monkeys. After analyzing the aligned sequences you determine that there are 63 synonymous polymorphisms detectable using these sequences. The biotech company you work for wants you to find as many replacement polymorphisms as possible, but they want to know how likely it is that you will find anything before spending any money. Assume that all synonymous mutations and 10% of all replacement mutations are selectively neutral. The rest of the replacement mutations are selectively deleterious to the extent that they are virtually lethal and can never be detected as polymorphisms. Assume also that 75% of all mutations in an average protein-coding sequence result in a replacement change. How many replacement polymorphisms would you expect to observe if you sample 15 alleles of the orthologous sequence from the same population of squirrel monkeys?
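One way to organize the arithmetic of problem 9 is sketched below in Python. It rests on assumptions the problem does not spell out: the standard neutral infinite-sites expectation E[S] = theta * a_n with a_n = 1 + 1/2 + ... + 1/(n-1) (Watterson's estimator), a constant population size, and treating the three sequenced monkeys as a sample of n = 3 sequences. The function names and parameters are illustrative.

```python
def harmonic_sum(n):
    """a_n = sum of 1/i for i = 1 .. n-1, for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def expected_replacement_polymorphisms(
    s_syn,                          # synonymous polymorphisms observed
    n_observed,                     # sequences in the original sample (assumed 3)
    n_new,                          # sequences in the planned sample
    frac_replacement=0.75,          # share of mutations causing a replacement
    frac_replacement_neutral=0.10,  # share of replacement mutations that are neutral
):
    """Expected number of neutral replacement polymorphisms in the new sample."""
    # Estimate theta for synonymous mutations from the observed sample.
    theta_syn = s_syn / harmonic_sum(n_observed)
    # Scale by the ratio of neutral replacement to synonymous mutation rates.
    neutral_replacement = frac_replacement * frac_replacement_neutral
    synonymous = 1.0 - frac_replacement
    theta_rep = theta_syn * neutral_replacement / synonymous
    # Expected segregating sites in the larger sample.
    return theta_rep * harmonic_sum(n_new)


if __name__ == "__main__":
    print(expected_replacement_polymorphisms(63, n_observed=3, n_new=15))
```

If each diploid monkey is instead taken to contribute two sequences, `n_observed=6` would be the appropriate argument, which changes the result.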