Restriction Fragment Length Polymorphisms (RFLPs) - 1

Readings: Griffiths et al.
7th Edition: Ch. 12 pp. 384-386; Ch. 13 pp. 404-407
8th Edition: pp. 364-366

Assigned Problems:
8th Edition: Ch. 11: 32, 34, 38-39
7th Edition: Ch. 12: 21; Ch. 13: 4, 5, 9, 10, 12-14

Concepts: What are RFLPs and how do they act like genetic marker loci?
1. Some mutations alter the base pair sequence of restriction enzyme sites, while others alter the distance between adjacent sites. Such alterations result in restriction fragment length polymorphisms (RFLPs).
2. RFLPs represent allelic forms of a locus and behave as alleles in inheritance.
3. The location of an RFLP can be mapped genetically and thereby used to map chromosomes.

Restriction Fragments in Large Genomes

The last few lectures have outlined the cloning of genes and other DNA fragments from large genomes. We now move to the analysis of such sequences.

Steps:

[Diagram: the same EcoRI (E) restriction fragment shown in the chromosome, in a cosmid clone, and in a plasmid subclone.]

The chromosome, the cosmid clone and the plasmid subclone each carry the same restriction fragment.

One can isolate genomic DNA from an organism and digest it with a restriction enzyme. The enzyme cleaves the double-stranded DNA at each sequence it recognizes. The digested DNA can then be separated on an agarose gel.

Digest each DNA with the same enzyme (e.g. EcoRI).
Do a Southern transfer: the DNA is transferred to a membrane.

Southern transfer: probed with 32P-labeled unique insert DNA, which hybridizes to:
1. the subcloned insert in plasmid + insert - not to the vector
2. the cloned insert in cosmid + insert - not to the vector or other fragments
3. the restriction fragment present on the chromosome in genomic DNA - not to any other fragments (unique DNA probe)

[Figure: Autoradiogram of Southern blotted gel]
[Diagram: two homologous chromosomes carrying gene a. Homolog 1 (gene a+) has three EcoRI (E) sites; homolog 2 (gene a-) lacks the middle site, so its two outer E sites bound a single 5Kb fragment. The position of the probe and the resulting allele bands on a Southern blot are indicated.]

Notes:
1. Hybridization intensity should be proportional to the amount of insert DNA (to which the probe can hybridize).
2. Thus, using a labeled probe and genomic DNA, one can identify the size of a restriction fragment at a particular site in the genome.
3. Diploid organisms have homologous chromosomes, so the genome has two similar genes (sequences) at each locus -> 2 alleles. These two alleles may give fragments of the same size or of different sizes.

Hypothetical Example:
In this example, the loss of the middle Eco site is due to an ancestral mutation that changed the GAATTC sequence on homolog 1 to an unrecognized sequence (e.g. TAATTC) on homolog 2 -> no cleavage. Or vice versa.

The difference in DNA sequence between the homologs makes the restriction fragment on each homolog an allelic form.
Allele #1 = has the Eco site
Allele #2 = lacks the Eco site

A genomic Southern blot probed with the 2Kb Eco fragment detects this allelic difference.
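The hypothetical example above can be sketched in a few lines of Python. This is only an illustration: the site coordinates are assumptions chosen to give a 2Kb fragment and a 5Kb fragment (the notes do not say on which side of the middle site the probe lies), and the function name `probed_fragment_sizes` is made up for this sketch.

```python
def probed_fragment_sizes(sites, probe_start, probe_end):
    """Return the sizes of restriction fragments that overlap the probe region.

    sites: positions (bp) of the enzyme's cut sites along one homolog.
    """
    cuts = sorted(sites)
    sizes = []
    for left, right in zip(cuts, cuts[1:]):
        # A fragment gives a band only if it overlaps the probe interval.
        if left < probe_end and right > probe_start:
            sizes.append(right - left)
    return sizes


# Hypothetical coordinates in base pairs (assumed for illustration only).
allele_1_sites = [0, 2000, 5000]  # middle Eco site present
allele_2_sites = [0, 5000]        # middle Eco site lost by mutation
probe = (0, 2000)                 # probe = the 2Kb Eco fragment

band_1 = probed_fragment_sizes(allele_1_sites, *probe)
band_2 = probed_fragment_sizes(allele_2_sites, *probe)
print(band_1, band_2)  # [2000] [5000]
```

A heterozygote carries both homologs, so its lane would show both the 2Kb and the 5Kb band.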
The "phenotype" of the allele is the size of the restriction fragment that hybridizes to the probe. One can detect both homologs (both alleles) on a genomic Southern blot, and thus one sees the phenotype for each allele. The phenotype is co-dominant: both alleles can be determined (there is no dominant or recessive here).

The inheritance of these alleles is exactly like that of any other. Just as a+ or a- would be passed on, so would the 2Kb or 5Kb restriction fragment size as a phenotype (one that directly reveals the genotype).

Parents: 2Kb/2Kb x 5Kb/5Kb
F1: 2Kb/5Kb
F2: 2Kb/2Kb : 2Kb/5Kb : 5Kb/5Kb
Ratio: 1 : 2 : 1

The 2Kb and 5Kb fragments represent two different forms, or alleles, at this "locus". This "locus" is the middle Eco site, where the DNA sequence differs between the homologs.

Polymorphic locus: multiple alleles at a locus exist in the population.
Remember that a diploid individual can only have two allelic forms, one for each homolog.

The advantages of RFLPs include:
- Unlimited number of markers
- Co-dominant
- No developmental component
- No environmental factors involved in expression

Degrees of Polymorphism
Two forms in the population - dimorphic.
But some loci have many forms in a population (e.g. many different sized fragments) - considered polymorphic in the population.

Distribution of Polymorphisms
Within a genome:
Some sequences are not polymorphic (monomorphic)
- essentially everyone in the population has the same sized fragment
- (e.g. a sequence encoding a protein - selection)
Other sites are highly polymorphic
- many fragment sizes (alleles) in the population
- (e.g. regions between genes - little or no selection against mutations)

RFLPs can be mapped genetically

From the previous example: we have two loci (gene a and the Eco site) that have allelic forms.
Do a standard genetic cross:

P1: a+ e+ / a+ e+  x  a- e- / a- e-
F1: a+ e+ / a- e-  x  a- e- / a- e-  (test cross)

Score the progeny for a+ or a- (assume a+ is dominant to a-).
Score the progeny for + or - at the Eco site by Southern blot of each progeny: test for 2Kb, 5Kb, or both.

If gene a and the Eco site are close to each other, then the recombinant frequency will be low (RF = low) -> few crossovers between a and the Eco site.
If gene a and the Eco site are distant from each other, then the recombinant frequency will be high (RF approaches 50%) -> many crossovers between a and the Eco site.

By extension, RFLPs can be mapped relative to each other by using two different probes and scoring appropriately for the respective phenotypes (fragment sizes).

Restriction Fragment Length Polymorphisms (RFLPs) - 2
Locating Genes in Large Genomes Using RFLPs

Readings: Griffiths et al.
8th Edition: pp. 394-402
7th Edition: Ch. 3 p. 97; Ch. 13 pp. 406-408, 425-427; Ch. 14 pp. 436-442

Concepts: How are RFLPs used to find different loci?
1. RFLPs can be used as alleles in pedigree analysis.
2. RFLPs can be directly associated with the sequence changes that cause a normal gene to be a mutant allele (e.g. sickle-cell anemia).
3. In most cases an RFLP is used only as a nearby genetic marker to find linkage with a phenotype such as an inherited disease.
4. RFLPs are used in "DNA fingerprinting".

RFLPs in Pedigree Analysis (Human applications)

Advantages of RFLP analysis
1. RFLPs can be found at almost every location (probe) in a genome, as long as many different restriction enzymes are tried. Thus any randomly chosen unique DNA probe can usually serve as an RFLP marker locus probe.
2. RFLP analysis requires only a small sample of DNA. A blood sample is usually enough to do many tests. White blood cells can be cultured if more is needed.
3. RFLP analysis can be done before any disease symptoms appear (or are expected to appear). It may even be done before birth.

RFLPs can be directly associated with the base pair mutation
This is a rare situation, but it demonstrates the concept, e.g. sickle-cell anemia - Fig 11-24 (8th), 13-29 (7th).
Sickle-cell anemia is a recessive, inherited genetic disease that causes round red blood cells to become sickled in shape. The mutation is a DNA sequence change that alters the normal glutamic acid to a valine in the β-globin polypeptide chain (GAG to GTG codon). This change results in the loss of a DdeI restriction site (the DdeI site lies within an MstII site, which is lost as well). The sickle-cell hemoglobin gene lacks the DdeI site.

This provides absolute linkage between the RFLP and a mutant gene, because the RFLP is the mutation itself. (Not the usual situation.)

RFLP linkage with an inherited disease - Fig 12-10 (8th), 13-3 (7th)

RFLPs can be used as marker loci to find linkage with a disease gene locus.
Probe P detects Morph 1 (two small bands) or Morph 2 (one large band, whose size is the sum of the two small bands).

Examine the pedigree of an affected individual - disease D is due to a dominant mutation.
Dd - affected, has morphs 1 and 2 (heterozygous mother)
dd - unaffected, has morph 2 only (homozygous father)
-> a classic test cross
Note: We cannot say that because morph 1 is present in the Dd parent, morph 1 = D.
Each pedigree must be treated independently. If the loci are linked, the "linkage phase" could be either way (coupling versus repulsion). Here D is coupled to morph 1 and in repulsion with morph 2.
Similar disease mutations could arise on chromosomes with different morphs (RFLP patterns) -> different phase.
Furthermore, similar diseases (phenotypes) can arise from mutations at different gene loci (on different chromosomes - unlinked).

Cross: Dd x dd gives 8 progeny.
Progeny: each has one allele as morph 2 (from Dad) and either morph 1 or morph 2 from the affected mother.

This is a small sample - normally one uses many families and combines the data. But one can conclude:
If the loci were unlinked, then D would be inherited with morph 1 in 50% of the progeny and with morph 2 in 50%. Here D is inherited with morph 1 in all but one case.
So: the D/d disease locus appears linked to the RFLP locus, but not absolutely.
Phase: D with morph 1 and d with morph 2.

------D---------------------morph 1----------o--------
------d---------------------morph 2----------o--------

One can explain the single discordance as a crossover between the D/d locus and the RFLP locus.

This was a simple example. For real mapping of human disease genes, see Figure 12-9 (8th), 14-1a, b (7th).
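As a rough sketch of the tally in the simple pedigree example, the recombinant frequency is the fraction of progeny in which the disease allele and the maternal morph do not travel together as in the parental phase. The notes give only "8 progeny" and a "single discordance"; the individual children listed below (how many are affected, and which one is discordant) are hypothetical, and the function name is made up for this illustration.

```python
def recombinant_frequency(progeny, coupled_morph=1):
    """Fraction of progeny that are recombinant between D/d and the RFLP locus.

    progeny: list of (affected, maternal_morph) tuples.
    coupled_morph: the morph assumed to be in coupling with D in the mother.
    """
    recombinants = 0
    for affected, maternal_morph in progeny:
        # Parental types: affected with the coupled morph,
        # or unaffected with the other morph.
        parental = affected == (maternal_morph == coupled_morph)
        if not parental:
            recombinants += 1
    return recombinants / len(progeny)


# Hypothetical children, consistent with "8 progeny, one discordance".
progeny = [
    (True, 1), (True, 1), (True, 1), (True, 1),
    (False, 2), (False, 2), (False, 2),
    (True, 2),  # discordant child: D inherited together with morph 2
]

rf = recombinant_frequency(progeny)
print(rf)  # 0.125
```

With so few progeny the estimate of 12.5% is very uncertain, which is why data from many families are normally combined.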
On a large scale (many RFLPs and many progeny) it has been possible to find linkage with most human disease genes. Currently this is the method of choice for mapping a human genetic disease.

One can test 100-1000 RFLP sites on various chromosomes to find linkage. First find weak linkage with an RFLP probe, then refine the location with other RFLP sites nearby: "zero in" on the genetic disease gene. Then use the linked RFLP probes as starting points for a chromosomal walk to clone the gene.

Note that "statistical flukes" can mislead investigators.

RFLPs, VNTRs, and DNA fingerprinting

RFLPs can arise due to VNTRs.
VNTRs are variable number tandem repeats.
The first example was found in the myoglobin gene.
A short sequence of 33 base pairs (other examples vary from 15-100 bp) is repeated a variable number of times.
Direct repeat - highly polymorphic - many allele morphs. Figure 14-4 (7th)

Use this repeat sequence as a DNA probe on a genomic Southern blot.
There are several loci for such a probe, and many morphs at each locus.
The effect is that each individual has a unique band pattern - a DNA fingerprint. (Identical twins are identical.)

DNA fingerprint, Fig 14-3 (7th): the pattern from a blood stain matches a suspect.
There is current debate over what a "match" is.
The technique is much more powerful at excluding people. If the banding patterns don't "match", then the samples must be different (e.g. from different individuals). What about twins?
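The exclusion logic can be illustrated with a small sketch. Everything here is an assumption made for illustration: the band sizes are invented, and the 2.5% size tolerance is an arbitrary stand-in for whatever criterion a laboratory might adopt (the definition of a "match" is exactly what is debated).

```python
def bands_match(sample_a, sample_b, tolerance=0.025):
    """True if every band in each sample has a partner in the other sample.

    Two bands are treated as the same if their sizes differ by no more than
    `tolerance` (a fraction of the larger band). The tolerance is hypothetical.
    """
    def has_partner(band, others):
        return any(abs(band - o) <= tolerance * max(band, o) for o in others)

    return (all(has_partner(b, sample_b) for b in sample_a)
            and all(has_partner(b, sample_a) for b in sample_b))


# Invented band sizes (bp) for a blood stain and two suspects.
stain = [1200, 2300, 3100, 4700]
suspect_1 = [1210, 2290, 3120, 4680]
suspect_2 = [1200, 2600, 3100, 5200]

print(bands_match(stain, suspect_1))  # True: cannot be excluded
print(bands_match(stain, suspect_2))  # False: excluded
```

A result of False excludes the suspect outright, whereas True only means the suspect cannot be excluded; how strong that is as evidence depends on how common the pattern is in the population.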
Simple VNTR

Clone out one locus and use it as a single-locus probe, where only one VNTR is present at the locus.
The high degree of polymorphism permits one to follow many alleles through a pedigree.
Examples in Fig 12-11 (8th), 14-5 (7th) and Fig 12-12 (8th), 14-6 (7th): 4 morphs - each can be followed through to the children.
This provides a more detailed analysis than is possible with typical RFLP analysis.

Microsatellites

Microsatellites are small repeated regions of DNA, usually with a 2 or 3 bp repeat unit present in 10-50 copies.
- more frequent and more polymorphic than VNTRs
- repeating units of 2, 3, or 4 nucleotides: TGTGTG, CAACAACAA, or AAATAAATAAAT
- often have many alleles present in the population (heterozygosity frequency about 70%), for example 10-30 AC dinucleotide repeats
- PCR is used to visualize them

For forensics:
A number of markers are used, up to 15. Each locus can have a number of alleles.
16S-212 - 212, 215, 218, 221 etc.
Example: 4 alleles at this locus, each at equal frequency (0.25); with a total of 10 such loci the combined frequency is (0.25)^10, about 1 in a million.
For other loci, multiply the frequencies of the observed alleles together.
This approach most easily rules out whose DNA it was not.
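The multiplication described above can be sketched as follows. This follows the simplified calculation in the notes, one allele frequency per locus multiplied across loci; a fuller treatment would work with genotype frequencies at each locus, which is not shown here. The function name is hypothetical.

```python
from math import prod


def combined_allele_frequency(allele_freqs):
    """Multiply the population frequencies of the alleles observed at each locus.

    allele_freqs: one frequency per locus, for the allele seen in the sample.
    Assumes the loci are inherited independently of one another.
    """
    return prod(allele_freqs)


# Example from the notes: 10 loci, each with 4 equally frequent alleles.
freqs = [0.25] * 10
match = combined_allele_frequency(freqs)

print(match)             # 9.5367431640625e-07
print(round(1 / match))  # 1048576, i.e. about 1 in a million
```

With real markers the allele frequencies differ between loci and between populations, so the list passed in would hold a different value for each locus.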
Probabilities, Gene Frequencies and Forensics

While a single individual can only have two alleles at any one locus, in a population there can be many alleles, and these alleles can occur at different frequencies in different populations.

The classic example is the ABO blood group system. There are three alleles in the population: A, B, and O or i (a null allele).

The frequency of any genotype is obtained by multiplying the frequencies of its two alleles in the population (assuming random mating, etc.). For a heterozygote the product is doubled, because the two alleles can be inherited in either order.

A Japanese AB would be 2 x 0.279 x 0.172 = 0.096, or about 10% of the population.
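A minimal sketch of this calculation under random mating (Hardy-Weinberg proportions): a homozygote has frequency p x p, and a heterozygote has frequency 2 x p x q, the factor of 2 arising because either parent can contribute either allele. The two allele frequencies are the ones quoted for the Japanese example; assigning 0.279 to A and 0.172 to B, and taking the O frequency as the remainder so that the three alleles sum to 1, are assumptions of this sketch.

```python
from itertools import combinations_with_replacement


def genotype_frequency(freqs, allele_1, allele_2):
    """Expected genotype frequency under random mating."""
    p, q = freqs[allele_1], freqs[allele_2]
    if allele_1 == allele_2:
        return p * p      # homozygote
    return 2 * p * q      # heterozygote: two possible orders of inheritance


freq_a, freq_b = 0.279, 0.172
# O frequency is assumed to be whatever remains of the total.
abo = {"A": freq_a, "B": freq_b, "O": 1 - freq_a - freq_b}

ab = genotype_frequency(abo, "A", "B")
print(round(ab, 3))  # 0.096

# Sanity check: the six possible genotypes together account for everyone.
total = sum(genotype_frequency(abo, x, y)
            for x, y in combinations_with_replacement(abo, 2))
print(round(total, 6))  # 1.0
```

The same function applies to any locus with known allele frequencies, including the VNTR and microsatellite markers used in forensics.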