Biology 203 - Evolution Dr. Kilburn

In this unit, we will look at the mechanisms of evolution, largely at the population scale. Our primary focus will be on natural selection, but we will also look at the processes of mutation, gene flow, drift, and non-random mating as causes of changes in gene/allele frequency within populations over time. First, though, we need to review some basic genetics.

Topic outline:
I. Review of DNA structure and function
  A. Structure of the DNA molecule
  B. Basic DNA function
  C. Structure and function of eukaryotic genes
II. Point mutations create new alleles
  A. Causes and types of point mutations
  B. Fitness effects of point mutations
  C. Rates of point mutations
III. Origins of new genes
  A. Gene duplication
  B. Hybrid genes
IV. Types and consequences of chromosome alterations
  A. Inversions
  B. Polyploidy
V. Measuring genetic variation
  A. Determining genotypes
  B. Quantifying genetic variation
  C. Why is there so much genetic variation?
VI. Summary/conclusions

I. Review of DNA structure and function
  A. Structure of the DNA molecule
    1. DNA is a double helix of nucleotide polymers
      a. nucleotide = 5-carbon sugar + phosphate group + nitrogenous base
      b. two classes of bases:
        i. purines = double-ring, A & G
        ii. pyrimidines = single-ring, T & C
      c. nucleotides are linked into polymers by covalent bonds between phosphates and sugars
      d. two polymers are held together by hydrogen bonds between complementary bases:
        i. A bonds with T
        ii. G bonds with C
    2. General functions of DNA are to
      a. store information needed to regulate cell structure and function
        i. information stored = amino acid sequences (primary, or 1°, structure) of proteins
        ii. information is stored in the sequence of bases along the DNA molecule
      b. pass information accurately from cell to cell
        i. requires that DNA be able to replicate itself exactly
  B. Basic DNA function
    1. Key features of DNA replication:
      a. a suite of proteins, especially DNA polymerase, carries out replication
      b. during replication, each strand of the DNA molecule acts as a template to replicate its complementary strand
        i. after strands are separated, free nucleotides in the nucleus pair with complementary bases along each open strand of DNA
      c. although base pairing is specific, errors are made fairly frequently
        i. DNA polymerase and other proteins proofread new DNA strands and repair most replication errors
    2. DNA is fairly fragile
      a. can be damaged by a variety of things, including high-energy radiation, chemical mutagens, etc.
      b. cells have repair mechanisms (enzyme systems) that detect and repair damaged DNA
    3. Transcription and translation use the information stored in the DNA molecule to build proteins
      a. in transcription, a length of DNA is copied in the form of a single-stranded mRNA molecule
      b. in the cytoplasm, mRNA combines with a ribosome
      c. during translation, tRNA brings amino acids to the mRNA/ribosome complex to build a protein in the sequence specified by the original length of DNA
    4. The genetic code
      a. triplet code: a set of 3 DNA bases codes for one mRNA codon (a triplet of mRNA bases), which in turn codes for one amino acid
      b. code is redundant: 64 triplets code for ~20 amino acids, so some amino acids have more than one codon:
        i. e.g., phenylalanine = UUU, UUC
        ii. redundant codons usually have the same first 2 bases and differ in the third position
    5. Some definitions based on understanding the molecular structure of DNA:
      a. gene = length of DNA coding for a functional RNA product
        i. note that the definition includes both DNA coding for proteins (mRNA is the functional RNA product) and DNA coding for rRNAs and tRNAs
        ii. locus (pl. loci) = position of a gene on a chromosome; often used almost synonymously with gene
      b. alleles = versions of a gene that differ in their base sequences
  C. Structure and function of eukaryotic genes: eukaryotic genes and genomes are very complex
    1. genome includes lots of DNA whose function is other than to be transcribed/translated
      a. some seems to play a role in regulating transcription/translation
      b. some has unknown function
      c. some appears to have no function at all
    2. eukaryotic genes are complex structurally and functionally
      a. coding region of the gene includes stretches of DNA that are transcribed but not translated
        i. pieces that are transcribed and translated = exons (because they're expressed)
        ii. pieces that are transcribed but not translated = introns (because they intervene between exons)
      b. after transcription, but before mRNA leaves the nucleus, a system of enzymes carries out mRNA processing = process of removing introns and splicing exons back together
      c. for at least some genes, differences in processing (i.e., which exons are used and which are not) can result in different proteins being generated from the same gene!

II. Point mutations create new alleles
  A. Causes and types of point mutations
    1. Point mutations are changes in a single base of a DNA sequence
    2. Two causes, both the result of reactions catalyzed by DNA polymerase:
      a. uncorrected replication errors
      b. errors in repair of damaged sites
    3. Point mutations can be classified by the type of change to the DNA sequence (note my terminology is a little different from the text):
      a. replacement mutations: an incorrect base is paired during replication or repair
        i. transitions = replace purine with purine, or pyrimidine with pyrimidine
        ii. transversions = replace purine with pyrimidine and vice versa
        iii. transitions are ~ twice as common as transversions, probably because they don't distort the double helix (why not?) and so are less likely to be detected and repaired
        iv. this type of mutation leaves the reading frame (beginning and ending
            point of each codon) intact; only one codon is affected
      b. frameshift mutations change the reading frame, so all codons after the point of mutation are affected:
        i. additions = extra base inserted
        ii. deletions = base is missed
  B. Fitness effects of point mutations
    1. Overall, the effect of a point mutation can range from highly beneficial to very harmful (deleterious)
    2. In general, though:
      a. most replacement mutations will have relatively little effect (either positive or negative) on fitness
      b. most frameshift mutations will have large effects, usually negative
        i. fairly common for a frameshift to result in a premature stop codon (nonsense mutation), so the resulting protein may be completely nonfunctional
    3. Specific fitness effects of replacement mutations:
      a. mutation may result in no change in amino acid sequence
        i. because of redundancy of the code, many third-site mutations simply replace one codon with a codon for the same amino acid
        ii. called synonymous or silent site mutations
        iii. generally have no effect on fitness
      b. mutation may change the amino acid sequence of the protein
        i. called nonsynonymous or missense mutations
        ii. effects vary, depending on the specific amino acid substitution, where it occurs in the protein, etc.
        iii. remember that fitness effects will also depend on environmental conditions!
        iv. e.g.: sickle-cell anemia
          a) caused by an allele with adenine instead of thymine (on the template strand) at nucleotide 2 in codon 6 of β hemoglobin
          b) results in replacement of glutamic acid with valine
          c) harmful in homozygous condition
          d) beneficial in heterozygous condition in areas with high incidence of malaria
        v. note, however, that many different alleles of human β hemoglobin exist, most of which work just fine
  C. Rates of point mutations
    1. on your own, read about the use of loss-of-function mutations to calculate mutation rates and why rates tend to be underestimated
    2. General findings:
      a. although mutation rates are generally low on a per-gene basis, when the whole genome is taken into account (i.e., 60,000 genes in humans), probably ~ 10% of all gametes carry a phenotypically detectable mutation
        i. because many mutations are phenotypically undetectable, the actual % will be higher
        ii. likely that the majority of all offspring have at least one new allele somewhere in their genomes!
      b. mutation rates vary at all levels of organization:
        i. may vary up to 100,000-fold among species
          a) reasons not well understood
          b) in plants, mutation rate is correlated with generation time: because of the way plants generate germ cell tissues, plants with long generation times accumulate more mutations in future germ cell tissues than do plants with short generation times
          c) does the same idea hold in animals? don't know
        ii. rates are also variable among individuals because of genetic variation in
            enzymes used for proofreading and repair of DNA
          a) as with any other protein, many different alleles of DNA polymerase exist
          b) alleles vary in both accuracy and speed: the faster the enzyme works (so the more rapid replication is), the less accurate it is (an example of a selective tradeoff!)
          c) similar pattern may hold for DNA repair mechanisms
        iii. rates vary among genes within individuals; this is also poorly understood, except that:
          a) mutation rate in coding regions of genes is lower than in non-coding regions
          b) mutation rate in DNA that isn't transcribed/translated is higher than in transcriptionally active DNA

III. Origins of new genes: the preceding discussion explains how new variants of existing genes arise, but not how new genes (genes with completely new functions) arise. Here we'll deal primarily with one major mechanism.
  A. Gene duplication
    1. Genes are duplicated during unequal cross-over (fig 4.7)
    2. Very important process because it results in additional DNA:
      a. parental gene still works
      b. so the duplicate is free to mutate without fitness consequences (i.e., even if a mutation results in a loss of function, the organism still has a functional copy, so the mutation doesn't matter)
      c. as a result, it's possible for the mutating duplicate gene to produce a novel protein with a new function
    3. E.g., globin gene family
      a. Globin gene family = two clusters of loci coding for component subunits of hemoglobin
        i. α-like cluster on chromosome 16 includes 3 functional loci
        ii. β-like cluster on chromosome 11 includes 5 functional loci
      b. each locus is expressed during a different time in human development (fig. 4.8); functions differ enough to make each locus well-adapted for a different stage of development
      c. Why do we think that these loci arose via duplication? Test the hypothesis that the loci within the gene clusters arose via duplication
        i. prediction #1: should get a fairly high degree of structural and functional similarity; 3 lines of evidence support the hypothesis
          a) high structural similarity of transcription units among loci, including position and length of introns and exons (fig. 4.9)
          b) high sequence similarity among loci
          c) similarity in function
        ii. prediction #2: not all duplicated genes will have accumulated favorable mutations, so we should see some loci that are non-functional ("failed experiments"); supported by the presence of pseudogenes = nonfunctional lengths of DNA (but with some structural similarities)
    4. Using the same criteria, numerous other gene families have been identified (see table 4.3)
  B. Hybrid genes can arise from a combination of mechanisms, e.g., the jingwei gene in Drosophila teissieri and D. yakuba
    1. In Drosophila, chromosome 2 includes the locus for alcohol dehydrogenase (Adh) = enzyme that breaks down alcohol (found in rotting fruit)
    2. In two species, researchers found a locus on chromosome 3 that is very similar to Adh, but without the introns
    3. Concluded that the new locus was created by reverse transcription from normal Adh mRNA
      a. reverse transcriptase is present in eukaryotic cells
      b. remember that introns are removed in the nucleus during mRNA processing
      c. so the mechanism would simply be that processed mRNA was reverse transcribed, and the new DNA inserted back into a chromosome
      d. note that this is another form of gene duplication
    4. Found evidence that the new locus is functional:
      a. alleles sequenced so far include only silent site mutations, suggesting that selection is acting on the locus to eliminate missense mutations
      b. in pseudogenes, in contrast, missense mutations are as common as silent site mutations
    5. Sequencing mRNA from jingwei revealed that, in addition to the Adh-like sequences, the gene also includes exons similar to those found in adjacent genes on chromosome 3
    6. So it seems that the gene arose from duplication of at least two loci on two different chromosomes!

IV. Types and consequences of chromosome alterations = mutations affecting whole chromosomes, not just individual genes
  A. Inversions
    1. Result from breakage/rearrangement of chromosomes (fig. 4.11)
    2. Consequence = alleles in the inverted segment are tightly linked
      a. linkage = tendency of alleles to be inherited together
        i. alleles on the same chromosome are more tightly linked than alleles on separate chromosomes
        ii. even on the same chromosome, though, alleles may be separated due to crossing-over
        iii. in general, the likelihood of alleles being separated during cross-over is proportional to the distance between them: close together → unlikely to be separated → tightly linked (more on this later!)
      b. inversions protect combinations of alleles from being separated during crossing-over:
        i. inversions prevent homologs from aligning properly in heterozygotes (i.e., individuals with one normal and one inverted chromosome)
        ii. when crossing-over takes place between alleles within an inversion, the resulting gametes have duplication and/or loss of chromosomal regions, so they're dysfunctional
        iii. result = only those inversions whose alleles haven't been disrupted by cross-over get inherited
        iv. this can be very important when specific combinations of alleles are beneficial
    3. Read the example of inversions in Drosophila; pay special attention to the evidence that specific inversions are adaptive
  B. Polyploidy
    1. Results from duplication of the entire genome (vs. a single region of DNA or a single chromosome) due to segregation errors in meiosis
    2. Common phenomenon in plants, but not in animals (although it does occur in some animals)
    3. Direct route illustrated in fig 4.13a:
      a. 2n parent produces 2n gametes instead of 1n gametes
      b. two 2n gametes fuse to form 4n offspring
      c. 4n individual will produce 2n gametes if meiosis works properly
    4. Consequences (more on these later)
      a. offspring may be reproductively isolated from parents (unable to reproduce with a normal 1n gamete from a parent), resulting in a new species
      b. duplication of the genome offers the same possibilities for mutations to produce novel traits as does gene duplication (it's just on a larger scale)
    5. Rate of polyploid formation in plants is very high, possibly as high as the
       rate of point mutations; nearly 50% of angiosperm species are polyploid!

V. Measuring genetic variation
  A. Determining genotypes
    1. Given simple Mendelian traits with co-dominance, can determine genotypes directly from phenotypes
    2. Most traits are far more complex; the most common method is some form of electrophoresis to measure variation in either proteins or DNA
      a. protein electrophoresis illustrated in box 4.1:
        i. samples of proteins are placed in a gel and exposed to an electric field
        ii. differences in amino acid sequence among alleles of the protein can result in differences in charge:mass
        iii. differences in charge:mass will affect how far the proteins migrate through the gel
        iv. gel can be stained so that all the alleles of an individual protein are visible
          a) different alleles will produce bands at different locations on the gel
          b) homozygotes will have one band
          c) heterozygotes will have two bands
        v. protein electrophoresis will underestimate genetic diversity to the extent that different alleles have the same charge:mass ratio
      b. DNA electrophoresis = DNA fingerprinting = RFLP analysis (see handout at end of notes)
        i. special enzymes called restriction endonucleases cut DNA into small fragments
        ii. different restriction endonucleases recognize different short DNA sequences as the places to cut the DNA
        iii. so, if two individuals have different DNA sequences, using the same restriction endonucleases to cut their DNA will result in samples with
             different fragment lengths
        iv. once cut, DNA is run through a gel as above, but now the distance the DNA migrates will be a function of the length of each fragment
        v. gel is stained to show location of DNA fragments
        vi. differences among individuals will show up as differences in the banding pattern of the gel
    3. Note that we can also determine amino acid sequences and DNA sequences directly, but it's more complex, so electrophoresis is used whenever appropriate
  B. Quantifying genetic variation:
    1. three measures of genetic variation are commonly used:
      a. allele frequency: what proportion of the alleles in a population are of a specific type (e.g., what is the frequency of the Δ32 allele of CCR5?)
        i. need to identify the number of individuals homozygous for, heterozygous for, or lacking the allele
        ii. e.g., from table 4.4, using data for people from Iceland
          a) 102 people tested
            i) 75 +/+ (homozygous for normal CCR5)
            ii) 24 +/Δ32 (heterozygous)
            iii) 3 Δ32/Δ32 (homozygous for Δ32 allele)
        iii. calculate by counting allele copies:
          a) 102 people X 2 alleles each = 204 alleles total in population
          b) 75 +/+ individuals X 0 Δ32 alleles each = 0 Δ32 alleles
          c) 24 +/Δ32 individuals X 1 Δ32 each = 24 Δ32 alleles
          d) 3 Δ32/Δ32 individuals X 2 Δ32 each = 6 Δ32 alleles
          e) total = 30 Δ32 alleles / 204 total alleles = 0.147 = 14.7% Δ32 in the population
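
The allele-counting steps above can be sketched in a few lines of Python. This is only an illustration: the function name and the genotype counts in the example call are made up for this sketch and are not data from table 4.4.

```python
def allele_frequency(n_hom_normal, n_het, n_hom_variant):
    """Frequency of the variant allele, from diploid genotype counts.

    All three arguments are counts of individuals (hypothetical inputs).
    """
    n_individuals = n_hom_normal + n_het + n_hom_variant
    if n_individuals == 0:
        raise ValueError("need at least one genotyped individual")
    # Each diploid individual carries 2 allele copies.
    total_alleles = 2 * n_individuals
    # Heterozygotes carry 1 variant copy; variant homozygotes carry 2.
    variant_alleles = n_het + 2 * n_hom_variant
    return variant_alleles / total_alleles


# Made-up sample: 60 normal homozygotes, 30 heterozygotes, 10 variant homozygotes.
print(allele_frequency(60, 30, 10))  # prints 0.25
```

The same function can be used to check answers for the practice calculations once the genotype counts for a population have been read off the table.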
      b. Percent polymorphism = % of loci in the population that have multiple alleles (note that this measures variation within the population as a whole)
        i. Let A = allele #1 for gene A; a = allele #2 for gene A, etc.
        ii. assume a sample population of the following genotypes:
          a) AaBbCCDD
          b) AABBCCDD
          c) AAbbCCDD
          d) aabbCCDD
        iii. calculate % polymorphism:
          a) 4 loci total in the population (A, B, C, and D)
          b) 2 loci are polymorphic (A and B)
          c) 2/4 = 50% polymorphism
      c. Mean heterozygosity = % of heterozygous loci in the average individual (so this measures variation across loci within an individual)
        i. use the same sample population as above
        ii. calculate % heterozygosity for each individual:
          a) AaBbCCDD: 2 heterozygous loci/4 loci total = 50% hetero
          b) AABBCCDD: 0 heterozygous loci = 0%
          c) AAbbCCDD: 0 heterozygous loci = 0%
          d) aabbCCDD: 0 heterozygous loci = 0%
        iii. average the individual scores: (50% + 0% + 0% + 0%)/4 individuals = 12.5%
    2. Practice calculations:
      a. allele frequency using data from table 4.4
      b. % polymorphism and mean heterozygosity from the following sample population:
        i. AaBbCCDD
        ii. AABBCcdd
        iii. AaBbCcDd
        iv. AAbbccdd
        v. AaBbCCDd
    3. How much genetic variation do typical populations have? Tremendous variation among populations, but general findings are that
      a. most natural populations have substantial genetic variation
      b. based on studies to date, a typical natural population has between 33 and 50% polymorphism (between 1/3 and 1/2 of loci have more than one allele)
      c. mean heterozygosity ranges from 4 to 15% (average individual is heterozygous at 4-15% of loci)
      d. substantial range of variation among species for both measures
  C. Why is there so much genetic variation?
    1. Before techniques were developed for measuring genetic variation, the assumption was that relatively little variation existed: selection would always favor one allele over another, so less adaptive alleles would be eliminated. The amount of genetic variation documented in nature is far more than anticipated
    2. So why is there so much variation (i.e., why doesn't selection eliminate much of it)? Two general explanations have been proposed (we'll look at these in more detail later):
      a. selectionist (or balance) view: diversity is maintained by natural selection, via selection favoring
        i. rare individuals
        ii. heterozygotes
        iii. different alleles in different places and/or at different times
      b. neutrality model: most alleles of polymorphic loci are functionally equivalent, so selection doesn't act to favor one over another, and all are maintained
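
As a rough illustration of the percent polymorphism and mean heterozygosity calculations in section V.B, here is a minimal Python sketch. It assumes genotypes are written as strings with two allele letters per locus (as in AaBbCCDD) and that upper and lower case mark different alleles. The function names and the sample population are invented for this sketch; they are not from the text.

```python
def split_loci(genotype):
    """Split a genotype string such as 'AaBbCCDD' into per-locus allele pairs."""
    if len(genotype) % 2 != 0:
        raise ValueError("expected two allele letters per locus")
    return [genotype[i:i + 2] for i in range(0, len(genotype), 2)]


def percent_polymorphism(population):
    """% of loci with more than one allele anywhere in the population."""
    individuals = [split_loci(g) for g in population]
    n_loci = len(individuals[0])
    polymorphic = 0
    for locus in range(n_loci):
        alleles = set()
        for individual in individuals:
            alleles.update(individual[locus])  # adds both allele letters
        if len(alleles) > 1:
            polymorphic += 1
    return 100.0 * polymorphic / n_loci


def mean_heterozygosity(population):
    """Average, over individuals, of the % of loci that are heterozygous."""
    scores = []
    for genotype in population:
        loci = split_loci(genotype)
        n_het = sum(1 for first, second in loci if first != second)
        scores.append(100.0 * n_het / len(loci))
    return sum(scores) / len(scores)


# Hypothetical sample population (4 individuals, 4 loci).
population = ["AaBBCCDD", "AABbCCDD", "aaBBCCDD", "AaBbCCDD"]
print(percent_polymorphism(population))  # prints 50.0
print(mean_heterozygosity(population))   # prints 25.0
```

Note the difference between the two measures: a locus counts as polymorphic if the population as a whole carries more than one allele there, even if every individual is homozygous, while heterozygosity is scored individual by individual.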

VI. Summary/conclusions:
  A. Genomes, especially eukaryotic genomes, are highly complex and dynamic
  B. Genetic variation is generated by mutations acting on single genes, individual chromosomes, or whole genomes
  C. We have a variety of tools that allow us to quantify the amount of genetic variation in populations
  D. There's a lot of it there!