Critical Reviews in Plant Sciences, 20(3):251-275 (2001)

Gene Tagging with Random Amplified Polymorphic DNA (RAPD) Markers for Molecular Breeding in Plants

S. A. Ranade,* Nuzhat Farooqui, Esha Bhattacharya, and Anjali Verma

Plant Molecular Biology Division, National Botanical Research Institute, Rana Pratap Marg, Lucknow 226 001 (U.P.) India

* Corresponding author. Plant Molecular Biology Division, National Botanical Research Institute, Rana Pratap Marg, Lucknow 226 001 (U.P.) India. Fax no.: (91) 522 205836, 205839. E-mail: shirishranade@yahoo.com.

ABSTRACT: Markers are of interest to plant breeders as a source of genetic information on crops and for use in indirect selection of traits to which the markers are linked. In the classic breeding approach, the markers were invariably the visible morphological and other phenotypic characters, and the breeders expended considerable effort and time in refining the crosses as the tight linkage or association of the desired characters with the obvious phenotypic characters was never unequivocally established. Furthermore, indirect selection for a trait using such morphological markers was not practical due to (1) a paucity of suitable markers, (2) the undesirable pleiotropic effects of many morphological markers on plant phenotype, and (3) the inability to score multiple morphological mutant traits in a single segregating population. With the advancement in molecular biology, the use of molecular markers in plant breeding has become very commonplace and has given rise to molecular breeding. Molecular breeding involves primarily gene tagging, followed by marker-assisted selection of desired genes or genomes. Gene tagging refers to the identification of existing DNA or the introduction of new DNA that can function as a tag or label for the gene of interest. In order for the DNA sequences to be conserved as a tag, important prerequisites exist.
This review also summarizes the achievements in gene tagging that have been made over the last 7 to 8 years.

KEY WORDS: RAPD, gene tagging.

0735-2689/01/$.50 © 2001 by CRC Press LLC

I. INTRODUCTION

Markers are of interest to plant breeders as a source of genetic information on crops and for use in indirect selection of traits to which the markers are linked. In the classic breeding approach, the markers were invariably the visible morphological and other phenotypic characters. The inheritance of these characters was scored in a cross. The success of selecting other desirable traits with which these visible morphological characters were apparently closely associated was also invariably evaluated. This resulted in an indirect selection of the desired characters. In all of these processes, the tight linkage or association of the desired characters with the obvious phenotypic characters was never established unequivocally. Consequently, the breeders expended considerable effort and time in refining the crosses. Therefore, until recently such indirect selection for a trait was not practical for several reasons, such as a paucity of suitable markers, the undesirable pleiotropic effects of many morphological markers on plant phenotype, and the inability to score multiple morphological mutant traits in a single segregating population (Paterson et al., 1991).

With the advancement in molecular biology, the use of molecular markers in plant breeding has become very commonplace and has given rise to molecular breeding. Molecular breeding involves primarily gene tagging, followed by marker-assisted selection of desired genes or genomes. Gene tagging refers to the identification of existing DNA or the introduction of new DNA that can function as a tag or label for the gene of interest. In order for the DNA sequences to be conserved as a tag, important prerequisites have been identified (see box in Figure 1). A number of molecular tags have now been determined for many genes in most of the important plants. These molecular tags include cloned restriction fragment length polymorphism (RFLP) probes, oligonucleotide RFLP probes, variable number tandem repeats (VNTR), microsatellite, minisatellite, and other DNA fingerprint loci, and specific as well as arbitrary sequence primers. For any or all of these to be used as tags, they must satisfy the given criteria.

Gene tagging and marker-assisted selection are essential components of molecular breeding and are based on saturation mapping of the genomes. This has opened up the possibility of identifying, mapping, tagging, and even isolating or transferring quantitative trait loci (QTLs). Thus, the most powerful application of DNA markers in plant breeding may be the ability to clone genes hitherto known only by phenotype. In the past, cloning such genes was difficult or impossible. With the advent of DNA marker technology and transposon tagging, important genes have now become accessible to molecular cloning. DNA markers provide the essential starting point for physical isolation of genomic regions containing the gene of interest (positional cloning). The effort involved in tagging a gene can be used further as part of a marker-assisted selection program. As the economically important genes are tagged, they can even be transferred to unrelated species.

Molecular breeding has an important role in crop-improvement programs. In the case of difficult plants, however, molecular breeding can have an even more profound impact. Forest trees are the dominant plant life covering millions of hectares on the Earth and form vital plant communities that sustain a great diversity of life forms. Tree breeding programs have become the most important part of intensive forestry practices.
Trees, however, because of their large genome sizes and the lack of any substantial genetic linkage data, are considered to be among the difficult subjects for genetic studies (Lehner et al., 1995). Similarly, for all other plants where genetic data are scanty or crosses are difficult to achieve, or in the case of the long-lived perennials, genetic linkage and mapping work is never easy to carry out. Consequently, in all such difficult cases, it is expected that complex trait dissection and molecular breeding will be better achieved through the use of gene tags or molecular markers (Lander and Schork, 1994) than through conventional breeding. The utility of molecular markers in tree breeding and improvement programs has been reviewed previously (Strauss et al., 1992; Tauer et al., 1992; Kremer et al., 1994).

II. MARKERS USED FOR GENE TAGGING

Markers based on variation in the length of DNA fragments obtained by digestion with restriction endonucleases (RFLPs; Botstein et al., 1980) were the earliest to be developed for molecular breeding work. Such markers have several advantages over other markers. They can detect a larger number of loci and alleles, are phenotypically neutral, and can be scored at any stage of plant development. RFLP markers have been employed extensively to tag useful genes in several crop plants and trees. To list all of these is beyond the scope of this review. The trend in recent years is, however, to combine RFLP markers with RAPD and other PCR-based markers to carry out saturation mapping and even marker-assisted selection for pyramiding desirable genes (Huang et al., 1997). RFLP technology has been reviewed previously (Paterson et al., 1991; Young, 1992; Lee, 1995; Winter and Kahl, 1995). Despite the demonstrated usefulness of RFLP markers, the development of these markers involves a tedious, expensive, and multistep process that requires considerable investment in personnel, equipment, and chemicals, and raises safety concerns if radioactive probes are used.
Furthermore, only one of several markers screened is polymorphic, and this can be of serious concern in crosses involving closely related plants (Winter and Kahl, 1995). Finally, the RFLP technique requires repeated application and a large amount of DNA for each application. This makes RFLP analysis a cost- and effort-intensive technology.

FIGURE 1. The important prerequisites for the use of DNA sequences as gene tags are listed in the textbox. It is only when these prerequisites are fulfilled that gene tagging with DNA sequences will succeed.

Isozymes have been used as markers and genetic characters. However, the number of isozymes that could be reliably assayed was limited by their assay conditions, and at best only 100 or so isozyme loci were detected. This low number of isozyme loci relative to the enormously large size of the plant genome was thus inadequate for saturation mapping, and therefore for gene tagging. Furthermore, in many cases the detection and assay of the enzyme were influenced by temporal and spatial factors, as a result of which the isozyme assay of genetic loci was ineffective. Thus, isozymes have found fewer applications in gene tagging and marker-assisted selection programs, despite being among the earliest markers to be developed for molecular breeding work (Tanksley and Orton, 1983).

Recently, a molecular marker based on PCR has been developed that overcomes many of these limitations of RFLP and isozyme markers. The basic PCR was modified to develop a new form of molecular marker, the RAPD marker (Welsh and McClelland, 1990; Williams et al., 1990). Essentially, a single short primer of arbitrary sequence is selected at random to be used singly in PCR. This primer is expected to anneal to one or more sequence sites on both strands of the template DNA. Every time the primer anneals to at least two sites on opposite strands such that the two sites are less than 5 kbp apart and the 3'-OH ends of the primer at the two sites face each other, a discrete product is formed. A schematic description of the RAPD strategy is given in Figure 2. Further, as the starting plant DNA used as the template is much larger than both the primer and this maximum size of 5 kbp, several such discrete products can be formed. Given the enormously large number of such short arbitrary sequence primers possible, this technique offers an excellent prospect for generating several polymorphic profiles that can be useful for gene tagging, marker-assisted selection (MAS), and related techniques.
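The product-formation rule described above (two primer sites on opposite strands, less than 5 kbp apart, with the 3' ends facing each other) can be illustrated with a minimal in silico sketch. This is not a method from the reviewed literature: the function names, the toy template, and the restriction to exact matches are assumptions made for illustration only. Real RAPD reactions tolerate mismatches and are sensitive to reaction conditions, which this sketch ignores.

```python
from typing import List, Tuple

MAX_PRODUCT_BP = 5000  # upper product size quoted in the text (5 kbp)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(template: str, motif: str) -> List[int]:
    """Start positions of all exact (possibly overlapping) matches."""
    sites = []
    pos = template.find(motif)
    while pos != -1:
        sites.append(pos)
        pos = template.find(motif, pos + 1)
    return sites


def predict_rapd_products(
    template: str, primer: str, max_len: int = MAX_PRODUCT_BP
) -> List[Tuple[int, int]]:
    """Predict (start, length) of products for a single arbitrary primer.

    Assumption: only perfect matches count as annealing sites.
    """
    template = template.upper()
    primer = primer.upper()
    # The primer sequence found on the top strand marks a site where the
    # primer anneals to the bottom strand and extends to the right.
    forward = find_sites(template, primer)
    # Its reverse complement on the top strand marks a site where the
    # primer anneals to the top strand and extends to the left.
    reverse = find_sites(template, reverse_complement(primer))
    products = []
    for f in forward:
        for r in reverse:
            # 3' ends face each other only if the reverse site lies
            # downstream of (and does not overlap) the forward site.
            if r >= f + len(primer):
                length = r + len(primer) - f
                if length <= max_len:
                    products.append((f, length))
    return sorted(products)


if __name__ == "__main__":
    primer = "GATTACAGGC"  # hypothetical primer, not from the source
    toy = "AAAA" + primer + "T" * 50 + reverse_complement(primer) + "CCCC"
    print(predict_rapd_products(toy, primer))
```

A polymorphism between two plants would show up in such a model as a product predicted for one template but not the other, for example because a point mutation removes one of the two primer sites.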
The RAPD technique has revolutionized genetic analyses in many plants and animals in the decade since it was first described. The polymorphism is identified as the presence or absence of discrete bands and is thus dominant in nature. This was initially considered detrimental to carrying out detailed genetic analyses. Despite this limitation, however, several notable achievements have been made in tagging genes with RAPD profiles, wholly or partially. The entire range of genetic populations, simple segregants, and bulk samples has been elegantly used to tag diverse genes, such as the genes for resistance to pests and pathogens, the genes encoding yield and growth functions, and the genes encoding sex determination. Gene tagging using RAPD markers has at least three major advantages over other methods. First, a universal set of primers can be used and screened in a short period; second, isolation of cloned DNA probes or preparation of hybridization filters is not required; and third, only a small quantity of genomic DNA is needed for each analysis. The genomic DNA can be easily obtained using simple and rapid methods (Dellaporta et al., 1983; Edward et al., 1991; Weing and Culter, 1993). Ragot and Hoisington (1993) showed that the cost per data point for RFLPs is lower for large populations, while RAPDs are preferred in the case of small populations. This difference is, however, immaterial in many of the self-pollinated crops, where RFLPs lack a sufficient level of polymorphism within a species or between related breeding material. Under these conditions, RAPDs appear to detect higher levels of polymorphism and are also more valuable in gene tagging, an area on which breeders are currently focusing their attention.

III. GENETIC POPULATIONS USED FOR TAGGING

The actual applications of RAPD in gene tagging programs have generally involved the use of specific genetic populations.
These include populations derived from or consisting of recurrent backcross selection progenies, recombinant inbred lines, near-isogenic lines, and, in a few cases, single segregants of defined crosses. In those plants where such genetic populations are not available or easily obtainable, gene tagging has been achieved using bulked samples and segregants such that the contrasting characters to be tagged are analyzed in separate bulks. The advancement in gene tagging over the last 5 to 6 years is summarized in Tables 1 to 4.
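The bulk comparison mentioned above can be sketched as a simple set difference on band scores. This is a hedged illustration only: the data structure, the function name `candidate_tags`, the primer labels, and the band sizes are all hypothetical and are not taken from any study cited in this review. Bands are treated as dominant presence/absence scores, as described for RAPD markers, and band sizes are assumed to have been assigned to common size classes beforehand.

```python
from typing import Dict, List, Set, Tuple

# Primer name -> set of band sizes (bp) scored as present in a bulk.
BandProfile = Dict[str, Set[int]]


def candidate_tags(
    trait_bulk: BandProfile, contrast_bulk: BandProfile
) -> List[Tuple[str, int, str]]:
    """List bands present in one bulk and absent from the other.

    Such bands are only candidates; linkage still has to be confirmed
    by scoring individual segregants.
    """
    candidates = []
    for primer in sorted(set(trait_bulk) | set(contrast_bulk)):
        with_bands = trait_bulk.get(primer, set())
        without_bands = contrast_bulk.get(primer, set())
        for size in sorted(with_bands - without_bands):
            candidates.append((primer, size, "trait bulk only"))
        for size in sorted(without_bands - with_bands):
            candidates.append((primer, size, "contrast bulk only"))
    return candidates


if __name__ == "__main__":
    # Hypothetical scores for two primers in two contrasting bulks.
    trait = {"P1": {500, 900, 1200}, "P2": {700}}
    contrast = {"P1": {500, 1200}, "P2": {700, 1500}}
    for primer, size, where in candidate_tags(trait, contrast):
        print(primer, size, where)
```

Bands shared by both bulks are uninformative and are dropped, which is why most primers screened in this way yield no candidates at all.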
FIGURE 2. RAPD strategy. The strategy for carrying out RAPD PCR for, say, two trees is represented diagrammatically in this figure. The gel photograph in the lowermost panel is part of an actual experimental result for DNAs of several neem varieties, using primer OP-D18 (from Operon Technologies Inc., Alameda, CA, USA). Just two lanes are selected for display here to illustrate the RAPD strategy for gene tagging. The two trees are shown to differ in branching character, and the strategy aims at identifying a RAPD pattern that wholly or partly correlates with the differences in branching pattern between the trees.
TABLE 1. Gene Tagging Using Direct Plant Types, Cultivars, Doubled Haploids, Haploid Megagametophytes, Rootstock, and Other Similar Materials
TABLE 2. Gene Tagging Using Segregating and Backcross Progeny as Well as Using the F1 Hybrids and Subsequent Generation Progeny
TABLE 3. Gene Tagging Using Near-Isogenic Lines (NIL)