GENOME MAPPING IN PLANT POPULATIONS


This article may be cited as: Vinod K K (2006) Genome mapping in plant populations. In: Proceedings of the training programme on "Modern Approaches in Plant Genetic Resources Collection, Conservation and Utilization", Tamil Nadu Agricultural University, Coimbatore, India. pp. 402-414. Downloaded from http://kkvinod.webs.com.

K.K.Vinod
Rubber Research Institute of India

Markers are detectable genetic loci. Once many of them have been identified as polymorphic between two homozygous individuals, each can be seen to follow Mendelian segregation independently among the progenies that result from recombination between the two parents. In combination, they follow the principles of linkage and recombination, which makes it possible to determine how near they lie to one another on the chromosome (linkage group) that carries them. The process of arranging these markers in order, based on the relative genetic distances between them, is called mapping. Arranging a large set of markers distributed throughout the genome thus results in a number of marker groups that are independent of each other and share no genetic distance information. These groups, equal in number to the basic number (X) of the haploid genome of the individual, are called linkage groups; they correspond to the chromosomes themselves.

Basic principles of gene mapping

One approach to gene mapping (linkage analysis) uses families with a known pedigree structure. Individuals are genotyped at random markers spread across the genome. If a disease gene is close to one of the markers then, within the pedigree, the inheritance pattern at the marker will mimic the inheritance pattern of the disease itself. Linkage analysis has been highly successful at finding genes for simple genetic diseases, i.e., those in which a single major gene is responsible for the disease in a given pedigree and environmental factors are not very important.
A second approach to gene mapping (association, or disequilibrium, mapping) uses associations at the population level. The idea is that a disease mutation arises on a particular haplotype background, so individuals who inherit the mutation also inherit the same alleles at nearby marker loci. This process is complicated by recombination and mutation. In a sense, association mapping is not fundamentally different from linkage analysis, but instead of a family pedigree we have an unknown population genealogy. Because the population genealogy is much deeper than a family pedigree, disequilibrium mapping permits much finer-scale mapping than does linkage analysis (Hastbacka et al., 1992). It has also been argued (Risch and Merikangas, 1996) that, in conjunction with new technology for rapid genotyping, this method will ultimately be more powerful than linkage analysis for isolating genes of small effect (which probably include most of the genes for common diseases). Methods of analysing this kind of data have been proposed using both nonparametric methods (Lazzeroni, 1998) and coalescent-based approaches (Graham and Thompson, 1998).

Presented in CAS in Genetics and Plant Breeding training programme on "Modern Approaches in Plant Genetic Resources Collection, Conservation and Utilization", CPBG, TNAU, Coimbatore 641 003, Nov. 1-21, 2006

Mapping based on linkage analyses

Linkage-based mapping rests on two simple genetic principles, linkage and recombination. Consider two individuals, each homozygous at two loci, A and B. The genotype of one individual is AABB and that of the other is aabb. Each produces only one type of male and female gamete, AB and ab respectively. Crossing them gives an F1 progeny of constitution AaBb. The F1 can be selfed or intercrossed to produce the F2 generation. Being heterozygous, the F1 throws out segregants in the F2 through the random union of male and female gametes, which are of four types: AB, ab, Ab and aB. The first two resemble the gametes produced by the original homozygous parents and are called parental types. The other two must have resulted from a crossover between loci A and B in the F1 heterozygote. They are called recombinant types or recombinants (Fig 1).

Fig 1. Genetic basis of marker based mapping

Recombination is the process by which new combinations of parental genes arise through the exchange of chromosomal segments, and hence of the alleles they carry, between homologous chromosomes. In a test cross, in which the F1 heterozygote (AaBb) is crossed with the homozygous recessive parent (aabb), there would be equal numbers of parental types and recombinants (50% each) under normal independent segregation of these loci, or when crossing over occurs freely between them. However, if the loci lie so close together that the chance of crossing over between them is restricted, the proportion of parental types will be high (>50%), increasingly so the closer the two loci are. Such loci are said to be linked, and the phenomenon is called linkage. Loci whose alleles deviate from independent assortment in this way are said to be in a state of linkage disequilibrium (LD). LD is thus essential for mapping the genetic distance between two loci.

The proportion of recombinants in the total progeny provides information about the amount of crossing over that took place between the loci, and is called the recombination frequency or crossover value. This value gives an estimate of the distance between the loci, on the assumption that the amount of crossing over is proportional to the distance between them. In simple terms, the recombination value can be calculated as

$$\text{Recombination frequency (\%)} = \frac{\text{No. of recombinants}}{\text{Total no. of progeny}} \times 100$$

One per cent recombination is equivalent to one arbitrary map unit, called a centimorgan (cM). For example, if the recombination frequency between two loci A and B is 5%, that between B and C is 23%, and that between A and C is 26%, we can use these values to order the loci along a chromosome as shown in Fig 2.

[Figure: loci A, B and C placed in that order along the chromosome, with interval distances in cM]
Fig 2. Diagrammatic representation of markers A, B and C on the molecular linkage map

Note that the observed distance between loci A and C (26 cM) is not exactly the sum of the distances from each to the intervening locus B (5 + 23 = 28 cM). This is due to double or multiple crossovers between the loci, which are not detectable from the recombination frequency. This warrants mapping with closely spaced markers, so that the effect of undetected multiple crossovers is considerably reduced. However, estimating the genetic distances among a whole array of markers distributed throughout the genome, and aligning them on linkage groups, is a complex problem that often requires the analytical power of a computer. At present, many computer programs are available for this purpose; MAPMAKER, QTL Cartographer, MapManager and JoinMap are some of the widely used ones. The most commonly used procedure in these programs is based on the maximum likelihood method. The output from these programs depicts the linear relationship among the markers, with the distance between markers measured in centimorgans (cM), so that the markers can be sorted into distinct linkage groups based on the recombination frequency values.

Generations and populations used for linkage based mapping

The procedures of molecular mapping are mainly focused on populations developed from crosses between two inbred lines. There are two different types of single-cross populations: F2 and F2-derived populations, and recombinant inbred line (RIL) populations. Other types of populations, such as backcross or random-mating populations, can be used for mapping, but the information that can be extracted from them is generally lower than from F2 or RIL populations. Random-mating populations are more difficult to map, because linkage disequilibrium is the key to detecting traits linked with markers. The genetic model discussed here must be modified to accommodate different types of populations.

These different options have different implications for both genotypic and phenotypic evaluation. Producing recombinant inbred lines (F2-derived lines advanced by single-seed descent, without selection, to the F6:7 generation or so) allows recombination to occur in each generation, so there are more chances for recombination between two loci in RILs than in F2s. Precise localization of both marker loci and the linked gene depends on the number of recombinations that occur between genes. Thus RILs allow higher-resolution mapping, because map positions can be distinguished better.
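Returning to the recombination-frequency formula and the three-locus example (A, B and C) given earlier, the arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the progeny counts (10 recombinants among 200) are hypothetical, the function names are invented here, and the ordering rule (the pair with the largest recombination frequency is placed at the two ends) works for three loci only. It is not the maximum likelihood procedure used by programs such as MAPMAKER or JoinMap.

```python
def recombination_frequency(n_recombinants, n_total):
    """Recombination frequency in percent; one percent is read as one cM."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_recombinants / n_total


def order_three_loci(pairwise):
    """Order three loci so that the most distant pair sits at the two ends.

    pairwise maps frozenset({locus1, locus2}) to recombination frequency (%).
    """
    outer = max(pairwise, key=pairwise.get)   # pair with the largest value
    loci = set().union(*pairwise)             # all locus names
    (middle,) = loci - outer                  # the remaining locus lies between
    left, right = sorted(outer)
    return [left, middle, right]


# Hypothetical test-cross counts (assumption): 10 recombinants in 200 progeny.
rf_ab = recombination_frequency(10, 200)

# Recombination frequencies from the worked example in the text.
pairwise = {
    frozenset("AB"): 5.0,
    frozenset("BC"): 23.0,
    frozenset("AC"): 26.0,
}
order = order_three_loci(pairwise)

# Shortfall of the observed A-C value relative to the sum of the two intervals,
# attributed in the text to undetected double or multiple crossovers.
shortfall = (pairwise[frozenset("AB")] + pairwise[frozenset("BC")]
             - pairwise[frozenset("AC")])

print(rf_ab, order, shortfall)
```

With more than three markers, the number of possible orders grows quickly, which is one reason dedicated mapping software is used in practice.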
On the other hand, the power of detecting a trait locus depends in part on the association of marker and trait alleles, so the additional recombinations between the trait locus and marker loci can reduce the power of tests for the locus. This difficulty can be overcome by using a sufficiently dense marker linkage map. RILs also differ from F2s in that the lines are homozygous at nearly all loci, so there is very little power to detect or estimate dominance.

RILs are developed by single seed descent from individual plants of an F2 population. (Because of this procedure, these lines are also called F2-derived lines.) Single seed descent is repeated for several generations. At this point, all of the seed from an individual plant is bulked. For example, an F3:4 RI population underwent single seed descent through the F3 generation and was bulked to develop the F4 (Fig 3). This population of seed can then be grown to obtain a large quantity of seed of each individual line. Importantly, each of the lines is fixed for many recombination events, so that the segregation they carry is fixed to near-maximum homozygosity (Table 1). No selection is exercised in the population.

Fig 3. Methods of producing recombinant inbred lines

Because RILs are essentially homozygous, only additive gene action can be measured. F2s, by contrast, may allow mapping with fewer marker loci, but the precision of the locus placements will be lower. In F2s the ability to detect and estimate dominance gene action is also maximized. However, phenotypic evaluation of F2 individuals does not allow for replicated trials (except in clonally propagated species), which are critical to obtaining valid phenotypic measurements of quantitative traits. Thus, many prefer to work with F2:3 lines, which can be replicated, although the heterozygosity of each line is halved, so there is less power to detect dominance. If estimating dominance is not of great interest, RILs might be preferred, as greater quantities of seed can often be produced for testing.

Table 1. Percentage of homozygosity in RIL generations

RIL inbreeding generation    % within-line homozygosity at each locus
F3:4                         75.0
F4:5                         87.5
F5:6                         93.75
F6:7                         96.875
F7:8                         98.4375
F8:9                         99.21875

BC1F1 populations have the same amount of linkage disequilibrium as F2 generations, so they are equivalent to F2s in terms of resolution of map positions. Backcross populations have only two genotypic classes at each marker and locus, either AA or AB if A is the recurrent parent. This means that if the allele from the recurrent parent is completely dominant over the allele from the donor parent, then the genotypes $Q_AQ_A$ and $Q_AQ_B$ are expected to have equal phenotypic values, and no locus will be detected in backcross populations. If backcrosses are made to parent B as well, and both backcross populations are phenotyped and genotyped, then all loci with complete dominance segregating in the populations may be detected. But evaluating both backcross populations seems to offer little advantage over evaluating F2 populations.

Many other options are available. Essentially, all that is required to detect a trait locus is that the locus and the markers are segregating in the population evaluated, that there is some linkage disequilibrium between the marker loci and the locus, and that one has a reliable assay or measurement of the phenotypic trait(s). The estimates of locus effects, and the power and precision of locus estimates, may however vary depending on the type of population evaluated.

The objective of genetic mapping is to identify simply inherited markers in close proximity to genetic factors affecting quantitative traits (quantitative trait loci, or QTL). This localization relies on processes that create a statistical association between marker and QTL alleles, and on processes that selectively reduce that association as a function of the marker's distance from the QTL. When using crosses between inbred parents to map QTL, we create in the F1 hybrid complete association between all marker and QTL alleles that derive from the same parent. Recombination in the meioses that lead to doubled haploid, F2, or recombinant inbred lines reduces the association between a given QTL and markers distant from it. Unfortunately, arriving at these generations of progeny requires relatively few meioses, so that even markers that are far from the QTL (e.g., 10 cM) remain strongly associated with it. Such long-distance associations hamper precise localization of the QTL.
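The homozygosity percentages listed in Table 1 for successive RIL generations follow from the fact that each generation of selfing halves the expected heterozygosity at a locus. A minimal Python sketch of that arithmetic is given below; the function name is hypothetical, and the calculation assumes strict selfing from a fully heterozygous F1 with no selection.

```python
def within_line_homozygosity(generation):
    """Expected % homozygosity at a locus in the F<generation> under selfing.

    Assumes a fully heterozygous F1 (generation 1) and strict selfing, so the
    heterozygous fraction is (1/2) ** (generation - 1).
    """
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    return 100.0 * (1.0 - 0.5 ** (generation - 1))


# Generations labelled as in Table 1: F3:4 is derived from F3 plants, and so on.
table = {f"F{g}:{g + 1}": within_line_homozygosity(g) for g in range(3, 9)}

for label, percent in table.items():
    print(label, percent)
```

The same function gives 50% for the F2, which is why dominance can be estimated there but hardly at all in late-generation RILs.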
One approach for fine mapping is to expand the genetic map, for example through the use of advanced intercross lines, such as F6 or higher-generation lines derived by continued generations of outcrossing the F2 (Darvasi and Soller, 1995). In such lines, sufficient meioses have occurred to reduce disequilibrium between moderately linked markers. When these advanced generation lines are created by selfing, the reduction in disequilibrium is not nearly as great as under random mating.

Association mapping or Disequilibrium mapping

A statistical association between a neutral marker allele and the phenotype occurs when marker alleles are in gametic phase disequilibrium (GPD) with alleles at a QTL. Two alleles at distinct loci are in positive GPD if they occur together more often than predicted on the basis of their individual frequencies. This definition of association says nothing about the physical position of the loci or about the alleles' joint effects on the phenotype. The term "gametic phase disequilibrium" is used synonymously with the term "linkage disequilibrium", but we use the former since it avoids reference to linkage (as unlinked markers can still be in GPD) and emphasizes that associated alleles must co-occur in gametes.

The central problem with approaches to fine mapping that use the pedigree method is the limited number of meioses that have occurred and (in the case of advanced intercross lines) the cost of propagating lines to allow for a sufficient number of meioses. An alternative approach is "association mapping", which takes advantage of events that created association in the relatively distant past. Assuming that many generations, and therefore meioses, have elapsed since these events, recombination will have removed the association between a QTL and any marker not tightly linked to it. Association mapping thus allows for much finer mapping than standard bi-parental cross approaches.

Causes for gametic phase disequilibrium

                 Marker allele
QTL allele       M       m       Total
Q                0.4     0.1     0.5
q                0.2     0.3     0.5
Total            0.6     0.4     1.0

In the table above, the combination of alleles (or haplotype) QM is observed with frequency $p_{QM} = 0.4$, while its predicted frequency is only $p_Q p_M = 0.3$. The alleles Q and M are in GPD with disequilibrium coefficient $D = p_{QM} - p_Q p_M = \mathrm{cov}(Q, M) = 0.1$. Since D can be expressed as a covariance, its possible values can be bounded by considering the case when the correlation is +/- 1, giving

$$|D| \le \sigma_Q \sigma_M = [\,p_Q(1-p_Q)\,p_M(1-p_M)\,]^{1/2} \qquad (1)$$

For a pair of diallelic loci, D has the same magnitude whichever haplotype is used to compute it, and it can be calculated as $D = p_{QM}\,p_{qm} - p_{Qm}\,p_{qM}$. For each generation of random mating, D decays by a factor of $(1 - r)$, where r is the recombination rate between the two loci considered. Thus, after t generations, only $(1 - r)^t$ of the initial disequilibrium remains.

A variety of mechanisms generate linkage disequilibrium, and several of these can operate simultaneously. Some of the more common mechanisms are:
1. Populations expanding from a small number of founders.
The haplotypes present in the founders will be more frequent than expected under equilibrium. Three cases can operate here. First, genetic drift generates GPD by this mechanism, in that a population experiencing drift derives from fewer individuals than its present size. Second, when an individual carrying a new mutation acts as a founder, its descendants will predominantly receive the mutation and the loci linked to it in the same phase. Linked marker alleles will therefore be in GPD with the mutant allele. Third, an extreme case arises in the F2 population derived from the cross of two inbred lines. Here, all individuals derive from a single F1 founder genotype, and the association between loci can be predicted from their map distance (Lynch and Walsh 1998).
2. GPD arises in structured populations when allelic frequencies differ at two loci across subpopulations, irrespective of the linkage status of the loci. Admixed populations, formed by the union of previously separate populations into a single panmictic one, can be considered a case of a structured population in which substructuring has recently ceased.
3. Negative GPD will occur between loci affecting a character in populations under stabilizing or directional selection, as a result of the Bulmer effect. Under the Bulmer effect, selection reduces genetic variation over generations, in proportion to how far the phenotypes of a given progeny's parents differ from those of the generation the parents come from.
4. Positive GPD will occur between loci affecting a character under disruptive selection.
5. When loci interact epistatically, haplotypes carrying the allelic combination favoured by selection will also be at higher-than-expected frequencies.

Populations and association based selections

Association between marker alleles and phenotype can be determined in two ways. In the first, groups are formed on the basis of phenotypic classes (e.g. diseased and healthy) and the allele frequencies are compared across the groups. In the second, groups are formed on the basis of marker genotypes and the phenotype means are compared across the groups. However, considerable caution is required in interpreting the data, as many associations can occur by chance alone.
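The disequilibrium coefficient D, its upper bound, and its decay under random mating, as described in the section on causes of gametic phase disequilibrium, can be checked numerically. The sketch below uses the haplotype frequencies of the worked example (QM 0.4, Qm 0.1, qM 0.2, qm 0.3); the function names and the recombination rates 0.1 and 0.5 are assumptions chosen purely for illustration.

```python
def disequilibrium(p_QM, p_Qm, p_qM, p_qm):
    """Return D computed two ways, plus the bound sigma_Q * sigma_M."""
    p_Q = p_QM + p_Qm                     # frequency of QTL allele Q
    p_M = p_QM + p_qM                     # frequency of marker allele M
    d_cov = p_QM - p_Q * p_M              # D = p_QM - p_Q p_M
    d_cross = p_QM * p_qm - p_Qm * p_qM   # D = p_QM p_qm - p_Qm p_qM
    d_bound = (p_Q * (1 - p_Q) * p_M * (1 - p_M)) ** 0.5
    return d_cov, d_cross, d_bound


def decayed_disequilibrium(d0, r, t):
    """D remaining after t generations of random mating at recombination rate r."""
    return d0 * (1.0 - r) ** t


d_cov, d_cross, d_bound = disequilibrium(0.4, 0.1, 0.2, 0.3)
print(d_cov, d_cross, d_bound)

# Illustrative rates (assumptions): a linked pair (r = 0.1) versus an
# unlinked pair (r = 0.5), each followed for 10 generations.
linked_after_10 = decayed_disequilibrium(d_cov, 0.1, 10)
unlinked_after_10 = decayed_disequilibrium(d_cov, 0.5, 10)
print(linked_after_10, unlinked_after_10)
```

Even for unlinked loci the disequilibrium only halves each generation rather than vanishing at once, which is consistent with the remark that unlinked markers can still be in GPD.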
An obvious weakness of group-comparison studies is that the grouping method may result in groups that contain predominantly individuals from different subpopulations. To eliminate this weakness, family-based control methods seek case and control individuals or marker alleles within the same family.

Transmission / disequilibrium test (TDT)

Studying the problem of population admixture in human disease mapping, Spielman et al. (1993) developed the transmission / disequilibrium test (TDT) to obtain unbiased association estimators. For an outbred species, the test employs family trios consisting of both parents and a progeny that belongs to one category of a dichotomous trait. One of the parents must be heterozygous and carry one copy of the focal marker allele putatively linked to the trait allele. The test consists of determining the frequency of transmission of the focal allele to affected progeny. A chi-square or binomial test can determine whether that frequency deviates from the expectation of 0.5. Two conditions are necessary for a significant deviation: the marker allele must be both in GPD with, and linked to, the trait allele. In the TDT, both case and control marker alleles are in effect within the same heterozygous parent. Random Mendelian segregation therefore ensures that the distribution of the TDT statistic under the null hypothesis is unaffected by population structure or selection within the pedigree (Spielman and Ewens, 1996).

Extending the application of the TDT to quantitative traits, Bink et al. (2000) grouped genotypes on the basis of the quantitative trait into selected and unselected categories, bringing the classes into a dichotomous pattern suitable for the TDT. Furthermore, in the standard TDT, QTL x environment interaction (QTL x E) would lead to environmental influences on the transmission frequency of the focal marker allele from a heterozygous parent to affected progeny. Such an effect could be detected by grouping family trios according to their environment or level of exposure to a risk factor. Heterogeneity of transmission frequency across groups would provide evidence in favor of QTL x E (Schaid, 1999). Similarly, for the Monks and Kaplan test, environments would affect the magnitude of TMK in the presence of QTL x E. The existence of QTL x E could then be inferred if the variance of TMK across environments is significantly greater than zero. In observational studies where environments cannot be randomized across family trios, such a result would need to be interpreted carefully: an association between environments and different subpopulations could also lead to heterogeneity of transmission or of TMK in the absence of QTL x E.
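The chi-square and binomial tests mentioned for the TDT can be sketched as follows. The counts are hypothetical (60 transmissions and 40 non-transmissions of the focal allele from heterozygous parents), and the function names are invented for this illustration. The chi-square form used here is the usual one-degree-of-freedom statistic (b - c)^2 / (b + c), which is an assumption about the form intended in the text; its p-value is obtained from the standard library's `math.erfc`, since a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def tdt_chi_square(transmitted, not_transmitted):
    """One-degree-of-freedom TDT statistic and its p-value."""
    n = transmitted + not_transmitted
    if n == 0:
        raise ValueError("no informative transmissions")
    stat = (transmitted - not_transmitted) ** 2 / n
    # P(chi-square with 1 df > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def tdt_exact_binomial(transmitted, not_transmitted):
    """Two-sided exact binomial p-value against a transmission rate of 0.5."""
    n = transmitted + not_transmitted
    k = max(transmitted, not_transmitted)
    upper_tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * upper_tail)


# Hypothetical counts from heterozygous parents of affected progeny.
stat, p_chi = tdt_chi_square(60, 40)
p_exact = tdt_exact_binomial(60, 40)
print(stat, p_chi, p_exact)
```

The exact binomial version is the safer choice when the number of informative transmissions is small, where the chi-square approximation is less reliable.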
Association mapping with multiple markers

To extend the haplotype model to multiple markers, each set of linked markers is treated as a supralocus, and the TDT is applied to this supralocus (McIntyre et al., 2000; Spielman and Ewens, 1996). Several methods have been developed to approach this problem and to pinpoint more precisely the location of the QTL affecting the trait. One method relates identity by descent (IBD) probabilities to phenotypic resemblance among the individuals. Templeton and Sing (1993) and Templeton et al. (1987) used haplotype marker profiles to construct a cladogram that estimates the evolutionary history of, and relationships among, haplotypes. Assume that a mutation occurred at some point in this history on one branch of the cladogram. Haplotypes along that branch will be IBD for the mutation and distinct from haplotypes along other branches. The branches of the cladogram therefore define nested sets of haplotypes that should have related associations with the phenotype. Templeton et al. (1987) developed a nesting algorithm to group haplotypes hierarchically, enabling a nested analysis of variance. Although this approach does not localize the mutation within the set of markers used to define haplotypes, it will increase the power to detect QTL in linkage disequilibrium with those markers.

Meuwissen and Goddard (2000) used an approach to estimate the covariance matrix among haplotype effects that does help predict QTL position within the set of markers. Starting from assumptions concerning the population history since mutation caused polymorphism at the QTL (i.e., effective population size and number of generations since mutation), the algorithm repeatedly simulates haplotype evolution and samples the probability of IBD status across specified categories of identity by state (IBS) among markers within haplotypes. The covariance among haplotype effects is then based on their IBS and its inferred relation to IBD. Since the probability function $P(\mathrm{IBD} \mid \mathrm{IBS})$ depends on QTL location within the set of markers, the assumed QTL location affects the haplotype covariance matrix. A maximum likelihood QTL position is inferred from the covariance matrix most consistent with the observed phenotypes. This approach assumes a single (monophyletic) polymorphism at the QTL, while the cladogram approach of Templeton et al. does not. While Meuwissen and Goddard show that the approach is, within limits, robust to population size and mutation age assumptions, applying the analysis to real (rather than simulated) data would be of interest in testing it.

Factors affecting association mapping

Experience from the human genome project shows that association mapping offers substantial benefit over methods based on linkage analyses. The following factors may affect the precision of association mapping:
a. Magnitude of the GPD
b. Effective size of the QTL allele
c. Mode of gene action (dominance, additive or multiplicative)

A few software packages developed for association mapping, such as CoaSim, GeneRecon, HapCluster++, Blossoc and CoAnnealing, are available online.
Application of molecular markers to selection

Once markers have been detected that are associated with major genes or quantitative trait loci (QTL), they can be employed in practical plant breeding in the following broad fields:
- For genotype identification and genetic diversity
- In selection and breeding for resistance to diseases and pests
- In selection of lines for male fertility restoration
- Selection for high performing genotypes based on QTL compositing
- Purity analysis of seeds of varieties and hybrids

In a classic early attempt to use markers for selection, Stuber et al. (1982) determined the allelic frequencies at eight isozyme loci that had been shown to be associated with yield. From these frequencies they established a marker baseline and used it to construct a new population. They had previously developed a high-yielding maize population through ten cycles of selection for increased yield. On comparison, the new population had essentially the same allelic frequencies as the high-yielding population developed by selection. They then measured yield and ears per plant in both the population developed by selection and the population constructed from isozyme frequencies. Data from this replicated experiment, grown at several locations, suggested that the gain realized by simply pooling on the allelic frequencies of the high-yielding population was equal to two cycles of selection for yield and one and a half cycles of selection for ears per plant. The take-home message is that modest selection gains can be realized by simple selection of genotypes based on molecular markers.

Fig 5. Marker assisted selection by employing summation index of plus QTL

Once markers associated with QTL have been detected, the logical next step is to perform selection on lines within a population. The obvious method is to advance only those lines that carry alleles with a positive effect on the quantitative trait, the selection criterion being the summation index showing maximum accumulation of the desired QTL (Fig 5). This method has been employed successfully in selecting lines for complex traits such as nitrogen use efficiency (NUE) in fodder grasses. A collateral advantage of this approach is that it offers a true validation of the putative genes (QTL) for the traits of interest. The associated response to marker selection indicates the presence of true NUE-related genes, particularly in the vicinity of markers strongly affected by the selection imposed.
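The summation-index criterion described above can be sketched as a simple count of favourable ("plus") alleles per line. This is a minimal illustration that assumes every plus allele carries equal weight and that diploid genotypes are written as two-letter strings. The marker names, allele codes, line names and function names are all hypothetical.

```python
def summation_index(genotype, plus_alleles):
    """Count the plus alleles one line carries across QTL-linked markers.

    genotype:     dict mapping marker name -> two-letter diploid genotype
    plus_alleles: dict mapping marker name -> allele with a positive effect
    Equal weighting of all markers is an assumption of this sketch.
    """
    return sum(
        allele == plus_alleles[marker]
        for marker, alleles in genotype.items()
        if marker in plus_alleles
        for allele in alleles
    )


def select_lines(genotypes, plus_alleles, n_keep):
    """Rank lines by summation index (ties broken by name) and keep the best."""
    scores = {line: summation_index(g, plus_alleles)
              for line, g in genotypes.items()}
    ranked = sorted(scores, key=lambda line: (-scores[line], line))
    return ranked[:n_keep], scores


# Hypothetical marker data for four lines at three QTL-linked markers.
plus_alleles = {"M1": "A", "M2": "B", "M3": "A"}
genotypes = {
    "line1": {"M1": "AA", "M2": "BB", "M3": "AB"},
    "line2": {"M1": "AB", "M2": "AB", "M3": "BB"},
    "line3": {"M1": "AA", "M2": "BB", "M3": "AA"},
    "line4": {"M1": "BB", "M2": "AA", "M3": "BB"},
}

selected, scores = select_lines(genotypes, plus_alleles, n_keep=2)
print(scores)    # {'line1': 5, 'line2': 2, 'line3': 6, 'line4': 0}
print(selected)  # ['line3', 'line1']
```

In practice the count could be replaced by a weighted sum using estimated QTL effects; the unweighted form is used here only because it matches the idea of maximum accumulation of desired alleles.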

The success of marker-assisted selection has since been demonstrated in many crops. Significant allelic associations have been identified under recurrent selection in oat (De Koeyer et al., 2001) and in many qualitative cases such as disease resistance. The genetic markers associated with rice amylose content are now being used by the USDA Rice Quality lab in Beaumont, which screens some 8,000-10,000 breeding lines each year for all U.S. public rice breeding programs. Marker technology will provide a more accurate assessment of amylose content than the analytical methods previously used. Also in rice, markers associated with various genes that confer resistance to blast disease, caused by Pyricularia grisea, are being used to combine multiple blast resistance genes into new cultivars and to provide resistance to a broader array of the many races (biotypes) of blast that occur. These markers have also been useful in verifying the first incorporation of a new blast resistance gene (Pi-b) from a Chinese cultivar into adapted U.S. lines. This gene will be an important resource for breeders as they enhance the natural disease resistance of rice cultivars and reduce the need for some agricultural pesticides.

REFERENCES

Bink MCAM, Te Pas MFW, Harders FL and Janss LLG (2000) A transmission/disequilibrium test approach to screen for quantitative trait loci in two selected lines of Large White pigs. Genetical Research 75, 115-121.
Churchill GA and Doerge RW (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963-971.
Damerval C, Maurice A, Josse JM and de Vienne D (1994) Quantitative trait loci underlying gene product variation: A novel perspective for analyzing regulation of genome expression. Genetics 137, 289-301.
Darvasi A and Soller M (1995) Advanced intercross lines, an experimental population for fine genetic mapping. Genetics 141, 1199-1207.
De Koeyer DL, Phillips RL and Stuthman DD (2001) Allelic shifts and quantitative trait loci in a recurrent selection population of oat. Crop Science 41, 1228-1234.
Edwards MD, Stuber CW and Wendel JF (1987) Molecular-marker-facilitated investigations of quantitative-trait loci in maize. I. Numbers, genomic distribution, and types of gene action. Genetics 116, 113-125.
Graham J and Thompson EA (1998) Disequilibrium likelihoods for fine-scale mapping of a rare allele. American Journal of Human Genetics 63, 1517-1530.

Hastbacka J et al. (1992) Linkage disequilibrium mapping in isolated founder populations: diastrophic dysplasia in Finland. Nature Genetics 2, 204-211.
Holland JB, Moser HS, O'Donoughue LS and Lee M (1997) QTLs and epistasis associated with vernalization responses in oat. Crop Science 37, 1306-1316.
Lazzeroni L (1997) Empirical linkage-disequilibrium mapping. American Journal of Human Genetics 62, 159-170.
Li Z, Pinson SRM, Park WD, Paterson AH and Stansel JW (1997) Epistasis for three grain yield components in rice (Oryza sativa L.). Genetics 145, 453-465.
Lynch M and Walsh B (1997) Genetics and analysis of quantitative traits. Sinauer Associates, Inc., Sunderland, MA.
McIntyre LM, Martin ER, Simonsen KL and Kaplan NL (2000) Circumventing multiple testing: A multilocus Monte Carlo approach to testing for association. Genetic Epidemiology 19, 18-29.
Meuwissen THE and Goddard ME (2000) Fine mapping of quantitative trait loci using linkage disequilibria with closely linked marker loci. Genetics 155, 421-430.
Risch N and Merikangas K (1996) The future of genetic studies of complex human diseases. Science 273, 1516-1517.
Schaid DJ (1999) Case-parents design for gene-environment interaction. Genetic Epidemiology 16, 261-273.
Spielman RS and Ewens WJ (1996) The TDT and other family-based tests for linkage disequilibrium and association. American Journal of Human Genetics 59, 983-989.
Spielman RS, McGinnis RE and Ewens WJ (1993) Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). American Journal of Human Genetics 52, 506-516.
Templeton AR and Sing CF (1993) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. IV. Nested analyses with cladogram uncertainty and recombination. Genetics 134, 659-669.
Templeton AR, Boerwinkle E and Sing CF (1987) A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping. I. Basic theory and an analysis of alcohol dehydrogenase activity in Drosophila. Genetics 117, 343-351.