Marker-assisted selection in livestock case studies


Section III: Marker-assisted selection in livestock case studies


Chapter 10: Strategies, limitations and opportunities for marker-assisted selection in livestock
Jack C.M. Dekkers and Julius H.J. van der Werf

Marker-assisted selection: Current status and future perspectives in crops, livestock, forestry and fish

Summary

This chapter reviews the principles, opportunities and limitations for detection of quantitative trait loci (QTL) in livestock and for their use in genetic improvement programmes. Alternate strategies for QTL detection are discussed, as are methods for inclusion of marker and QTL information in genetic evaluation. Practical issues regarding implementation of marker-assisted selection (MAS) for selection in breed crosses and for selection within breeds are described, along with likely routes towards achieving that goal. Opportunities and challenges are also discussed for the use of molecular information for genetic improvement of livestock in developing countries.

Introduction

Since the 1970s, the discovery of technology that enables identification and genotyping of large numbers of genetic markers, and research that demonstrated how this technology could be used to identify genomic regions that control variation in quantitative traits and how the resulting QTL could be used to enhance selection, have raised high expectations for the application of gene- (GAS) or marker-assisted selection (MAS) in livestock. Yet, to date, the application of GAS or MAS in livestock has been limited (see e.g. review by Dekkers, 2004 and the case study chapters that follow). However, recent further advances in technology, combined with a substantial reduction in the cost of genotyping, have stimulated renewed interest in the large-scale application of MAS in livestock.

Successful application of MAS in breeding programmes requires advances in the following five areas:

Gene mapping: identification and mapping of genes and genetic polymorphisms.
Marker genotyping: genotyping of large numbers of individuals for large numbers of markers at a reasonable cost for both QTL detection and routine application for MAS.
QTL detection: detection and estimation of associations of identified genes and genetic markers with economic traits.
Genetic evaluation: integration of phenotypic and genotypic data in statistical methods to estimate breeding values of individuals in a breeding population.
MAS: development of breeding strategies and programmes for the use of molecular genetic information in selection and mating programmes.

This chapter outlines the main strategies for the application of MAS in livestock and identifies and discusses the limitations and opportunities for successful MAS in commercial breeding programmes. It concludes by discussing limitations and opportunities for applying MAS in developing countries.
Markers and linkage disequilibrium

Over the past decades, a substantial number of alternate types of genetic markers have become available to study the genetic architecture of traits and for their use in MAS, including restriction fragment length polymorphisms (RFLPs), microsatellites, amplified fragment length polymorphisms (AFLPs) and single nucleotide polymorphisms (SNPs). Detailed information on these markers can be found elsewhere in this publication. Although alternate marker types have their own advantages and disadvantages, depending on their abundance in the genome, degree of polymorphism, and ease and cost of genotyping, what is crucial for their use for both QTL detection and MAS is the extent of linkage disequilibrium (LD) that they have in the population with loci that contribute to genetic variation for the trait. Linkage disequilibrium relates to dependence of alleles at different loci and is central to both QTL detection and MAS. Thus, a thorough understanding of LD and of the factors that affect the presence and extent of LD in populations is essential for a discussion of both QTL detection and MAS.

Linkage disequilibrium

Consider a marker locus with alleles M and m and a QTL with alleles Q and q that is on the same chromosome as the marker, i.e. the marker and the QTL are linked. An individual that is heterozygous for both loci would have genotype MmQq. Alleles at the two loci are arranged in haplotypes on the two chromosomes of a homologous pair that each individual carries. An individual with genotype MmQq could have the following two haplotypes: MQ/mq, where the / separates the two homologous chromosomes. Alternatively, it could carry the haplotypes Mq/mQ. This alternative arrangement of linked alleles on homologous chromosomes is referred to as the marker-QTL linkage phase. The arrangement of alleles in haplotypes is important because progeny inherit one of the two haplotypes that a parent carries, barring recombination.

The presence of linkage equilibrium (LE) or disequilibrium relates to the relative frequencies of alternative haplotypes in the population. In a population that is in linkage equilibrium, alleles at two loci are randomly assorted into haplotypes. In other words, chromosomes or haplotypes that carry marker allele M are no more likely to carry QTL allele Q than chromosomes that carry marker allele m. In technical terms, the frequency of the MQ haplotype is equal to the product of the population allele frequency of M and the frequency of Q. Thus, if a marker and QTL are in linkage equilibrium, there is no value in knowing an individual's marker genotype because it provides no information on QTL genotype. If the marker and QTL are in linkage disequilibrium, however, there will be a difference in the probability of carrying Q between chromosomes that carry M and m marker alleles and, therefore, a difference in mean phenotype between marker genotypes would also be expected.

The main factors that create LD in a population are mutation, selection, drift (inbreeding), and migration or crossing. See Goddard and Meuwissen (2005) for further background on these topics.
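
The definition of linkage equilibrium given above (the MQ haplotype frequency equals the product of the M and Q allele frequencies) can be written as a disequilibrium coefficient D = P(MQ) - p(M) p(Q). The following is a minimal sketch of that calculation; the function names and the example frequencies are illustrative assumptions, not values from the chapter.

```python
def ld_coefficient(freq_mq_hap, freq_m, freq_q):
    """Disequilibrium D = P(MQ) - p(M) * p(Q); D = 0 means linkage equilibrium."""
    return freq_mq_hap - freq_m * freq_q


def prob_q_given_marker(freq_mq_hap, freq_m, freq_q):
    """Probability that a chromosome carries Q, given that it carries M or m."""
    p_q_given_big_m = freq_mq_hap / freq_m
    # Q haplotypes that do not carry M must carry m.
    p_q_given_small_m = (freq_q - freq_mq_hap) / (1.0 - freq_m)
    return p_q_given_big_m, p_q_given_small_m


# Hypothetical frequencies: p(M) = 0.5, p(Q) = 0.4.
d_equilibrium = ld_coefficient(0.20, 0.5, 0.4)      # P(MQ) equals 0.5 * 0.4
d_disequilibrium = ld_coefficient(0.30, 0.5, 0.4)   # excess of MQ haplotypes
p_with_M, p_with_m = prob_q_given_marker(0.30, 0.5, 0.4)
```

With D = 0 the two conditional probabilities are equal, so the marker carries no information about the QTL; with D different from zero they differ, which is what makes the marker useful.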
The main factor that breaks down LD is recombination, which can rearrange haplotypes that exist within a parent in every generation. Figure 1 shows the effect of recombination (r) on the decay of LD over generations. The rate of decay depends on the rate of recombination between the loci. For tightly linked loci, any LD that has been created will persist over many generations but, for loosely linked loci (r > 0.1), LD will decline rapidly over generations.

[Figure 1. Break-up of LD over generations: LD is continuously eroded by recombination. Disequilibrium is plotted against generation for r = 0.5, 0.2, 0.1, 0.05 and 0.01.]

Population-wide versus within-family LD

Although a marker and a linked QTL may be in LE across the population, LD will always exist within a family, even between loosely linked loci. Consider a double heterozygous sire with haplotypes MQ/mq (Figure 2). The genotype of this sire is identical to that of an F1 cross between inbred lines. This sire will produce four types of gametes: non-recombinants MQ and mq and recombinants Mq and mQ. As non-recombinants will have higher frequency, depending on the recombination rate between the marker and QTL, this sire will produce gametes that will be in LD. Furthermore, this LD will extend over a larger distance (Figure 1), because it has undergone only one generation of recombination. This specific type of LD, however, only exists within this family; progeny from another sire, e.g. an Mq/mQ sire, will also show LD, but the LD is in the opposite direction because of the different marker-QTL linkage phase in the sire (Figure 2). On the other hand, MQ/mQ and Mq/mq sire families will not be in LD because the QTL does not segregate in these families. When pooled across families these four types of LD will cancel each other out, resulting in linkage equilibrium across the population. Nevertheless, the within-family LD can be used to detect QTL and for MAS provided the differences in linkage phase are taken into account, as will be demonstrated later.

[Figure 2. Within-family marker-QTL LD: within-family LD among progeny of a sire; linkage phase can differ between families. Sire 1: MQ/mq; Sire 2: Mq/mQ; Sire 3: MQ/mQ; Sire 4: Mq/mq.]

QTL detection and types of markers for MAS

Application of molecular genetics for genetic improvement relies on the ability to genotype individuals for specific genetic loci. For these purposes, three types of observable genetic loci can be distinguished, as described by Dekkers (2004):

direct markers: loci for which the functional polymorphism can be genotyped;
LD-markers: loci in population-wide LD with the functional mutation;
LE-markers: loci in population-wide linkage equilibrium with the functional mutation but which can be used for QTL detection and MAS based on within-family LD.

For these alternate types of markers, different strategies are appropriate to detect QTL in livestock populations. These are summarized in Table 1 and will be described in more detail. Strategies for QTL detection in livestock differ from those used in plants because of the lack of inbred lines.

QTL detection using LD markers within crosses

Crossing two breeds that differ in allele and, therefore, haplotype frequencies, creates extensive LD in the crossbred population.
This LD extends over large distances because it has undergone only one generation of recombination in the F2 (Figure 1). Thus, although these markers may be in LE with QTL within the parental breeds, they will be in partial LD with the QTL in the crossbred population if the marker and QTL differ in frequency between the breeds. This population-wide LD enables detection of QTL that differ between the parental breeds based on a genome scan with only a limited number of markers spread over the genome (approximately every 15 to 20 cM). This approach has formed the basis for the extensive use of F2 or backcrosses between breeds or lines for QTL detection, in particular in pigs, poultry and beef cattle (see Andersson, 2001 for a review). The extensive LD enables detection of QTL that are some distance from the markers but also limits the accuracy (map resolution) with which the position of the QTL can be determined.

More extensive population-wide LD is also expected to exist in synthetic lines, i.e. lines that were created from a cross in recent history. These can be set up on an experimental basis through advanced intercross lines (Darvasi and Soller, 1995) or be available as commercial breeding lines. Depending on the number of generations since the cross, the extent of LD will have eroded over generations and will, therefore, span shorter distances than in F2 populations (Figure 1). This will require a more dense marker map to scan the genome with equivalent power as in an F2 but will enable more precise positioning of the QTL.

Table 1. Summary of strategies for QTL detection in livestock

Type of population | Type of markers | Genome coverage | Marker density | Type of LD used | Number of generations of recombination used for mapping | Extent of LD around QTL | Map resolution
Within crosses: F2/backcross | LD markers | Genome-wide | Sparse | Population-wide LD | 1 | Long | Poor
Within crosses: advanced intercross | LD markers | Genome-wide | Denser | Population-wide LD | >1 | Smaller | Better
Outbred population: half- or full-sib families | LE markers | Genome-wide | Sparse | Within-family LD | 1 | Long | Poor
Outbred population: extended pedigree | LE markers | Genome-wide | More dense | Within-family LD | >1 | Smaller | Better
Outbred population: non-pedigreed population sample | LD markers | Candidate gene regions | Few loci | Population-wide LD | >>1 | Small | High
Outbred population: non-pedigreed population sample | LD markers | Genome-wide | Dense | Population-wide LD | >>1 | Small | High

QTL detection using LE markers in outbred populations

As linkage phases between the marker and QTL can differ from family to family, use of within-family LD for QTL detection requires QTL effects to be fitted on a within-family basis, rather than across the population.
Similar to F2 or backcrosses, the extent of within-family LD is extensive and, thus, genome-wide coverage is provided by a limited number of markers but significant markers may be some distance from the QTL, resulting in poor map resolution. Thus, LE markers can be readily detected on a genome-wide basis using large half-sib families, requiring only sparse marker maps (approximately 15 to 20 cM spacing). Many examples of successful applications of this methodology for detection of QTL regions are available in the literature, in particular for dairy cattle, utilizing the large paternal half-sib structures that are available through extensive use of artificial insemination (see Weller, Chapter 12).

QTL detection using LE markers can also be applied to extended pedigrees by modelling the co-segregation of markers and QTL (Fernando and Grossman, 1989). These approaches use statistical models that are described further in the section on genetic evaluation using LE markers. Depending on the number of generations with phenotypes and marker genotypes that are included in the analysis, map resolution will be better than with analysis of half-sib families because multiple rounds of recombination are included in the data set.

QTL detection using LD markers in outbred populations

The amount and extent of LD that exists in the populations that are used for genetic improvement are the net result of all forces that create and break down LD and are, therefore, the result of the breeding and selection history of each population, along with random sampling. On this basis, populations that have been closed for many generations are expected to be in linkage equilibrium, except for closely linked loci.
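
The expectation that only closely linked loci retain LD after many generations can be illustrated with the standard decay relationship D_t = D_0 (1 - r)^t. The sketch below assumes a large random-mating population with no new LD being created by selection, drift or migration; the function names are illustrative, and the values of r are those shown in Figure 1.

```python
import math


def ld_after_generations(d0, r, generations):
    """Expected LD after t generations of recombination: D_t = D_0 * (1 - r)**t."""
    return d0 * (1.0 - r) ** generations


def generations_to_fraction(r, fraction):
    """Generations needed for LD to fall to the given fraction of its initial value."""
    return math.log(fraction) / math.log(1.0 - r)


# Fraction of the initial LD remaining after 10 generations,
# for the recombination rates shown in Figure 1.
remaining = {r: ld_after_generations(1.0, r, 10) for r in (0.5, 0.2, 0.1, 0.05, 0.01)}

# Generations until LD is halved for loosely and tightly linked loci.
half_life_loose = generations_to_fraction(0.5, 0.5)
half_life_tight = generations_to_fraction(0.01, 0.5)
```

Under these assumptions about 90 percent of the initial LD remains after 10 generations at r = 0.01, but only about 35 percent at r = 0.1, which is why sparse marker maps work in recent crosses while old closed populations need markers very close to the QTL.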

Thus, in those populations, only markers that are tightly linked to QTL may show an association with phenotype (Figure 1), and even then there is no guarantee because of the chance effects of random sampling. There are two strategies to find markers that are in population-wide LD with QTL (see Table 1):

evaluating markers that are in, or close to, genes that are thought to be associated with the trait of interest (candidate genes);
a genome scan using a high-density marker map, with a marker every 0.5 to 2 cM.

The success of both approaches obviously depends on the extent of LD in the population. Studies in human populations have generally found that LD extends over less than 1 cM. Thus, many markers are needed to obtain sufficient marker coverage in human populations to enable detection of QTL based on population-wide LD. Opportunities to utilize population-wide LD to detect QTL in livestock populations may be considerably greater because of the effects of selection and inbreeding. Indeed, Farnir et al. (2000) identified substantial LD in the Dutch Holstein population, which extended over 5 cM. Similar results have been observed in other livestock species (e.g. in poultry, Heifetz et al., 2005). The presence of extensive LD in livestock populations is advantageous for QTL detection, but disadvantageous for identifying the causative mutations of these QTL; with extensive LD, markers that are some distance from the causative mutation can show an association with phenotype.

The candidate gene approach utilizes knowledge from species that are rich in genome information (e.g. human, mouse), effects of mutations in other species, previously identified QTL regions, and/or knowledge of the physiological basis of traits, to identify genes that are thought to play a role in the physiology of the trait.
Following mapping and identification of polymorphisms within the gene, associations of genotype at the candidate gene with phenotype can be estimated (Rothschild and Plastow, 1999).

Whereas the candidate gene approach focuses on LD within chosen regions of the genome, recent advances in genome technology have enabled sequencing of entire genomes, including those of several livestock species; the genomes of the chicken and cattle have been sequenced and public sequencing of the genome of the pig is under way. In addition, sequencing has been used to identify large numbers of positions in the genome that include SNPs, i.e. DNA base positions that show variation. For example, in the chicken, over 2.8 million SNPs were identified by comparing the sequence of the Red Jungle Fowl with that of three domesticated breeds (International Chicken Polymorphism Map Consortium, 2004). This, combined with reducing costs of genotyping, now enables detection of QTL using LD-mapping with high-density marker maps.

QTL detection using combined LD and linkage analysis in outbred populations

As markers may not be in complete LD with the QTL, both population-wide associations of markers with QTL and cosegregation of markers and QTL within families can be used to detect QTL. Using these combined properties of being both LD and LE markers, methods have been developed to combine LD and linkage information. These methods are further explored under genetic evaluation models in what follows.
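
As an illustration of how a population-wide association between a candidate gene or LD marker and phenotype might be estimated, the sketch below regresses phenotype on the number of copies of one allele carried by each animal. This is a deliberately simplified assumption: a real analysis would also account for fixed effects and family structure, and the function name and data are hypothetical.

```python
def allele_substitution_effect(allele_counts, phenotypes):
    """Least-squares regression of phenotype on allele count (0, 1 or 2).

    Returns (intercept, slope); the slope estimates the average effect
    of substituting one allele for the other at the marker.
    """
    n = len(allele_counts)
    mean_x = sum(allele_counts) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in allele_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(allele_counts, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: six animals genotyped at one marker.
counts = [0, 0, 1, 1, 2, 2]
phenos = [10.0, 10.2, 10.6, 10.4, 11.1, 10.9]
intercept, slope = allele_substitution_effect(counts, phenos)
```

A slope that differs clearly from zero suggests population-wide LD between the marker and a QTL, although with extensive LD the causative mutation may lie some distance from the marker tested.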

Incorporating marker information in genetic evaluation programmes

The value of genotypic information for predicting the genetic merit of animals is dependent on the predictive ability of the marker genotypes. The three types of molecular loci described previously differ not only in methods of detection but also in methods of their incorporation in genetic evaluation procedures. Whereas direct and, to a lesser degree, LD markers, allow selection on genotype across the population, use of LE markers must allow for different linkage phases between markers and QTL from family to family, i.e. LE markers are family specific and family specific information must be derived. As discussed later in this chapter, this makes LE markers a lot less attractive for use in breeding programmes. In this section, the different types of models that have been proposed for genetic evaluation based on marker information are described and this is followed by a brief description of some practical issues regarding implementation of such methods and the likely routes towards achieving that goal.

Modelling QTL effects in genetic evaluation

By using QTL information in genetic evaluation, in principle, part of the assumed polygenic variation is substituted by a separate effect due to a genetic polymorphism at a known locus. This has the immediate effect of having a much better handle on the Mendelian sampling process, as phenotypic co-variance can be evaluated based on specific genetic similarity rather than on an average relationship. For example, on average two full sibs share 50 percent of their alleles, but at a specific locus it is now possible to know whether these full sibs carry exactly the same complete genotype (both paternal and maternal alleles are in common), or actually have a completely different genotype.
The actual degree of similarity of full sibs at a QTL can thus vary between 0 and 1. This additional information helps to better evaluate the genetic merit due to specific QTL, and to better predict offspring that do not yet have phenotypic measurements. A number of different approaches have been described to accommodate marker information in genetic evaluation. Roughly, these methods can be distinguished through their modelling of the QTL effect and through the type of genetic marker information used. The QTL effect can be modelled as random or fixed, while the molecular information comes from LE, LD or direct markers. With a fixed QTL model, regression on genotype probabilities would be used in genetic evaluation to account for the effect of QTL polymorphisms. In the simplest additive QTL model, suitable for estimating breeding values, simple regressions could be included on the probability of carrying the favourable mutation. Regression can be on known genotypes (class variables), or probabilities can be derived for ungenotyped animals in a general complex pedigree (Kinghorn, 1999). A fixed QTL model is sensible if few alleles are known to be segregating, and where dominance and/or epistasis are important. The model also assumes effects being the same across families. The effects of various genotypes could be fitted separately, giving power to account for dominance and epistasis in case of multiple QTL. For selection purposes, a fixed QTL effect, if additive, would be added to the polygenic estimated breeding values (EBVs), similar to breed effects in across-breed evaluations. The advantage of a fixed QTL model is the limited number of effects that need to be fitted.
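
The simplest additive fixed QTL model described above can be sketched as follows. The example assumes a bi-allelic QTL, an already estimated allele effect and genotype probabilities supplied from elsewhere (e.g. from a segregation analysis for ungenotyped animals); all names and numbers are hypothetical.

```python
def expected_allele_count(prob_homozygous_favourable, prob_heterozygous):
    """Expected number of favourable alleles: 2 * P(QQ) + 1 * P(Qq)."""
    return 2.0 * prob_homozygous_favourable + prob_heterozygous


def total_ebv(polygenic_ebv, allele_count, allele_effect):
    """Add the fixed additive QTL effect to the polygenic EBV."""
    return polygenic_ebv + allele_count * allele_effect


# A genotyped QQ animal: genotype known with certainty.
count_known = expected_allele_count(1.0, 0.0)

# An ungenotyped animal with P(QQ) = 0.25 and P(Qq) = 0.5.
count_inferred = expected_allele_count(0.25, 0.5)

# Hypothetical polygenic EBV of 1.2 and estimated allele effect of 0.5.
ebv = total_ebv(1.2, count_inferred, 0.5)
```

Regressing on the expected allele count lets genotyped and ungenotyped animals be handled in the same way, at the cost of assuming that the allele effect is the same across families.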

Alternatively, QTL effects could be modelled as random effects, with each individual having a different QTL effect. Co-variances are based on the probability of QTL alleles being identical by descent rather than on numerator relationships as in the usual animal model with polygenic effects. With full knowledge about segregation, this would effectively fit all founder alleles as different effects. The random QTL model was first described by Fernando and Grossman (1989), where for each animal both the paternal and the maternal allele were fitted. Without loss of information, these effects can be collapsed into one genotypic effect for each animal (Pong-Wong et al., 2001). The random QTL model makes no assumptions about number of alleles at a QTL and it automatically accommodates possible interaction effects of QTL with genetic background (families or lines). Therefore, the random QTL model is less reliant on assumptions about homogeneity of QTL effects.

The random QTL model is a natural extension to the usual mixed model and seems therefore a logical way to incorporate genotype information into an overall genetic evaluation system. These models result in EBVs for QTL effects along with a polygenic EBV. The total EBV is the simple sum of these estimates. One of the main computational limitations of this method, however, is the large number of equations that must be solved, which increases by two per animal for each QTL that is fitted. Thus, the number of QTL regions that can be incorporated is limited.

Genetic evaluation using direct markers

When the genotype of an actual functional mutation is available, no pedigree information is needed to predict the genotypic effect, as QTL genotypes are measured directly. When there is only a small number of alleles, the number of specific genotypes is limited.
In genetic evaluation, it would seem appropriate to treat the genotype effect as a fixed effect, i.e. the assumption is that genotype differences are the same in different families and herds or flocks. Such assumptions might be reasonable for a bi-allelic QTL model in a relatively homogeneous population. Alternatively, random QTL models could be used with different effects for different founder alleles, or even QTL by environment interactions. In both fixed and random QTL models, genotype probabilities can be derived for individuals with missing genotypes.

Genetic evaluation using LE markers

When the genotype test is not for the gene itself, but for a linked marker, QTL probabilities derived from marker genotypes will be affected by the recombination rate between marker and QTL and by the extent of LD between the QTL and marker across the population. If LD between the QTL and a linked marker only exists within families, marker effects or, at a minimum, marker-QTL linkage phase must be determined separately for each family. This requires marker genotypes and phenotypes on family members. If linkage between the marker and QTL is loose, phenotypic records must be from close relatives of the selection candidate because associations will erode quickly through recombination. With progeny data, marker-QTL effects or linkage phases can be determined based on simple statistical tests that contrast the mean phenotype of progeny that inherited alternate marker alleles from the common parent.

A more comprehensive approach is based on Fernando and Grossman's (1989) random QTL model, where marker information from complex pedigrees can be used to derive co-variances between QTL effects, yielding best linear unbiased prediction (BLUP) of breeding value for both polygenic and QTL effects. Random effects of paternal and maternal QTL alleles are added to the standard animal model with random polygenic breeding values. The variance/co-variance structure of the random QTL effects, also known as the gametic relationship matrix (GRM), is based on probabilities of identity by descent (IBD), and is now derived from co-segregation of markers and QTL within a family. Probabilities of IBD derived from pedigree and marker data link QTL allele effects that are expected to be equal or similar, therefore using data from relatives to estimate an individual's QTL effects. For example, if two paternal half-sibs i and j have inherited the same paternal allele for markers that flank the QTL (with recombination rate r), they are likely IBD for the paternal QTL allele and the correlation between the effects of their paternal QTL alleles will be (1-r)^2. The method is appealing, but computationally demanding for large-scale evaluations, especially when not all animals are genotyped and complex procedures must be applied to derive IBD probabilities.

Genetic evaluation using LD markers

Most QTL projects have moved towards fine mapping where the final result is a marker or marker haplotype in LD with the QTL, if not the direct mutation. A haplotype of marker alleles close enough to the putative QTL is likely to be in LD with QTL alleles. Such a marker test provides information about QTL genotype across families, and is in a sense not very different from a direct marker. The most convenient way to include genotypic information from marker haplotypes in genetic evaluation systems is through the random QTL model.
In their original paper, Fernando and Grossman (1989) derived IBD from genotype data on single markers and recombination rates between marker and QTL. However, the random QTL model is more versatile, and co-variances based on IBD probabilities can also use information beyond pedigree, based on LD. The latter can be derived from marker or haplotype similarity, e.g. based on a number of marker genotypes surrounding a putative QTL. Meuwissen and Goddard (2001) proposed using both linkage and LD information to derive IBD-based co-variances (termed LDL analysis). Lee and van der Werf (2005) showed that with denser markers, the value of linkage information, and therefore pedigree, reduces. Hence, when QTL positions become more accurately defined, genetic information from close markers (within a few cM) can be used increasingly to derive LD-based IBD probabilities, thereby defining co-variances between random QTL effects without the need for a family structure or information through pedigree.

Lee and van der Werf (2006) have shown that LD information results in a very dense GRM. Genetic evaluation, which is usually based on mixed model equations that are relatively sparse, is currently not feasible computationally for the LDL method for a large number of individuals and alternative models are needed. One approach is to model population-wide LD by simply including the marker genotype or haplotype as a fixed effect in the animal model evaluation, as suggested by Fernando (2004). An advantage of modelling population-wide LD effects as fixed rather than random is that fewer assumptions about population history are needed. A disadvantage is that estimates are not BLUPed, i.e. regressed towards a mean depending on the amount of information that is available to estimate their effects. This will be important if some of the genotype or haplotype effects cannot be estimated with substantial accuracy because the number of individuals with that genotype or haplotype is limited. Haplotype effects could also be fitted as random, but more development is needed in this area.

Whole genome approach for genetic evaluation using high-density LD markers

With more and more QTL being discovered, the polygenic component will slowly be replaced by multiple QTL effects, the inheritance of each being followed by marker brackets or more generally by information on haplotypes. Nejati-Javaremi, Smith and Gibson (1997) presented the concept of the total allelic relationship, where the co-variance between two individuals was derived from allelic identity by descent, or by state (based on molecular marker information), with each location weighted by the variance explained by that region. This approach contrasts with the average relationships derived from pedigree that are used in the numerator relationship matrix. Nejati-Javaremi, Smith and Gibson (1997) showed that using total allelic relationship resulted in a higher selection response than pedigree based relationships, because it more accurately accounts for the variation in the additive genetic relationships between individuals. Therefore, the gain of following inheritance at specific genome locations contributes to more accurate genetic evaluation, and is able to deal more specifically with within and between loci interactions and with specific modes of inheritance at different QTL. When large-scale marker genotyping becomes cheap and available to breeders at low cost, this approach could even be used for non-detected QTL and genetic evaluation could be based on a whole genome approach (Meuwissen, Hayes and Goddard, 2001).
In this approach, marker haplotypes are fitted as independent random effects for each region of the genome, e.g. each 1 cM region. In the work by Meuwissen, Hayes and Goddard (2001), variances associated with each haplotype were either assumed to be equal for each chromosomal region or estimated from the data using Bayesian procedures with alternate prior distributions. In essence, this procedure estimates breeding values for each haplotype, and EBVs of individuals are computed by simply summing EBVs for the haplotypes that they contain. Using this procedure, Meuwissen, Hayes and Goddard (2001) demonstrated through simulation that, for populations with an effective population size of 100 and a spacing of 1 or 2 cM between informative markers across the genome, sufficient LD was present to predict genetic values with substantial accuracy for several generations based on associations of marker haplotypes with phenotype on as few as 500 individuals.

It should be noted that, in the approach proposed by these authors, no polygenic effect is included since all regions of the genome are included in the model. It may, however, be useful to include a polygenic effect because LD between markers and QTL will not be complete for all regions. In addition, this model assumes that haplotype effects are independent within and across regions. Incorporating IBD probabilities to model co-variances between haplotypes within a region as in Meuwissen and Goddard (2000), and incorporating co-variances between adjacent regions caused by LD between regions, could lead to further improvements but would also lead to increasing computational demands.
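
The final step of the whole genome approach, summing estimated haplotype effects over all regions, can be sketched as below. The sketch assumes that haplotype effects have already been estimated; the haplotype labels, effect values and the choice to give unobserved haplotypes an effect of zero (the mean of a random effect) are illustrative assumptions rather than part of the published method.

```python
def genomic_ebv(haplotypes_carried, haplotype_effects):
    """Sum estimated haplotype effects over all genome regions.

    haplotypes_carried: one (paternal, maternal) haplotype pair per region.
    haplotype_effects: one dict per region mapping haplotype label to effect.
    """
    total = 0.0
    for region, pair in enumerate(haplotypes_carried):
        for haplotype in pair:
            # Assumption: a haplotype with no estimate contributes zero.
            total += haplotype_effects[region].get(haplotype, 0.0)
    return total


# Hypothetical estimated effects for two 1 cM regions.
effects = [
    {"A": 0.3, "B": -0.1},
    {"A": 0.0, "C": 0.2},
]

# An animal carrying A/B in the first region and C/C in the second.
animal = [("A", "B"), ("C", "C")]
animal_ebv = genomic_ebv(animal, effects)
```

Because the effects are tied to haplotypes rather than to families, a selection candidate without a phenotype or phenotyped relatives can still be evaluated as long as it has been genotyped.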

In general, for the purpose of increased genetic change of economically important quantitative traits, and in the context of well recorded and efficient breeding programmes, there is no need to have knowledge of functional mutations since nearby markers will have a high predictive value about genetic merit. Moreover, the benefit from the extra investment and time spent on finding functional mutations might be superseded by the genetic change that can be made in the breeding programme in the meantime.

Implementation of marker-assisted genetic evaluation

It is important to note that, for most of the gene marker tests currently on the market, integration with existing systems for genetic evaluation is not obvious. This is because the gene testing is either for a Mendelian characteristic, or it predicts phenotypic differences for traits that are not the same as those in current genetic evaluation. Moreover, breeders would not only be interested in more accurate EBVs based on gene markers, but they would also want to know the actual QTL genotypes for their breeding animals. This information on individual genotype will become less relevant if more gene tests become available and if testing becomes cheaper and more widespread. This might still take some years. Thus, as gene marker testing is gradually introduced, it is more likely to create additional selection criteria to consider and it will take some time before QTL information is seamlessly and optimally integrated in existing genetic evaluation programmes. In particular, if genetic evaluation is based on information from many different breeding units, such as in cattle or sheep, genotyping information will initially be available for only a small proportion of the breeding animals, possibly not justifying a total overhaul of the system for genetic evaluation.
Simple ad hoc procedures where QTL effects are estimated and presented separately as additional effects are initially a more likely route to implementation. Solutions for fixed QTL genotype effects, along with genotype probabilities as outputs of genetic evaluation, might be interesting to breeders and, compared with random QTL effects, may be more likely to be presented and used separately from polygenic EBVs. This would also be the case for genotypic information on Mendelian characters, where there is no polygenic component.

Incorporating MAS in selection programmes

Molecular information can be used to enhance both the process of integrating superior qualities of different breeds and within-breed selection. These strategies are further described below.

Between-breed selection

Crossing breeds results in extensive LD, which can be capitalized upon using MAS in a number of ways. If a large proportion of breed differences in the trait(s) of interest are due to a small number of genes, gene introgression strategies can be used. If a larger number of genes is involved, MAS within a synthetic line is the preferred method of improvement.

Marker-assisted introgression

Introgression of the desirable allele at a target gene from a donor to a recipient breed is accomplished by multiple backcrosses to the recipient, followed by one or more generations of intercrossing. The aim of the backcross generations is to produce individuals that carry one copy of the donor QTL allele but that are similar to the recipient breed for the rest of the genome. The aim of the intercrossing phase is to fix the donor allele at the QTL. Marker information can enhance the effectiveness of the backcrossing phase of gene introgression strategies by: (i) identifying carriers of the target gene(s) (foreground selection); and (ii) enhancing recovery of the recipient genetic background (background selection). The effectiveness of the intercrossing phase can also be enhanced through foreground selection on the target gene(s).

If the target gene cannot be genotyped directly, carrier individuals can be identified based on markers that flank the QTL at <10 cM, because of the extensive LD in crosses. The markers must have breed-specific alleles in order to identify line origin. For the introgression of multiple target genes, gene pyramiding strategies can be used during the backcrossing phase to reduce the number of individuals required (Hospital and Charcosset, 1997; Koudandé et al., 2000).

For background selection, markers are used that are spread over the genome at <20 cM intervals, such that most genes that affect the trait will be within 10 cM of a marker. When foreground and background selection are combined, selection will be for the donor breed segment around the target locus but for recipient breed segments in the rest of the genome. Foreground selection will result in selection for both the target locus and for donor breed loci that are linked to this locus, some of which could have an unfavourable effect on performance. To reduce this so-called linkage drag around the target locus, greater emphasis can be given in the molecular score used for background selection to markers that are in the neighbourhood of the target locus (apart from the flanking markers, which are used in foreground selection).
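Combined foreground and background selection in a backcross generation can be sketched as follows. This is a minimal illustration under stated assumptions, not a published algorithm: each candidate is described only by the breed of origin ("D" for donor, "R" for recipient) of the allele received from its crossbred parent, at two markers flanking the target gene and at a set of genome-wide background markers. The function names, the data layout and the unweighted background score are all hypothetical.

```python
def is_carrier(candidate, donor="D"):
    """Foreground selection: donor origin at both markers flanking the target gene."""
    return candidate["left"] == donor and candidate["right"] == donor


def background_score(candidate, recipient="R"):
    """Background selection: fraction of background markers of recipient origin."""
    origins = candidate["background"]
    return sum(origin == recipient for origin in origins) / len(origins)


def select_backcross(candidates, n_select):
    """Keep carriers only, then rank them on recovery of the recipient background."""
    carriers = [c for c in candidates if is_carrier(c)]
    carriers.sort(key=background_score, reverse=True)
    return carriers[:n_select]


def expected_carrier_fraction(n_target_genes):
    """Expected fraction of backcross offspring carrying the donor allele at all
    target genes, assuming unlinked genes and a parent heterozygous at each."""
    return 0.5 ** n_target_genes


# Hypothetical backcross candidates.
candidates = [
    {"id": "bc1", "left": "D", "right": "D", "background": ["R", "R", "D", "R"]},
    {"id": "bc2", "left": "D", "right": "R", "background": ["R", "R", "R", "R"]},
    {"id": "bc3", "left": "D", "right": "D", "background": ["R", "D", "D", "R"]},
    {"id": "bc4", "left": "D", "right": "D", "background": ["R", "R", "R", "R"]},
]
selected = select_backcross(candidates, n_select=2)
print([c["id"] for c in selected])
print(expected_carrier_fraction(3))
```

Candidate bc2 is discarded despite its perfect background because a recombination between the flanking markers makes its carrier status uncertain. To address linkage drag as described above, the unweighted score could be replaced by one that gives more weight to background markers close to the target locus. The halving of the expected carrier fraction with each additional unlinked target gene shows why population size requirements grow quickly when several genes are introgressed at once.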
Most studies have considered marker-assisted introgression (MAI) of a single QTL (e.g. Hospital and Charcosset, 1997) but often several QTL must be introgressed simultaneously. Koudandé et al. (2000) showed that large populations are needed to obtain sufficient individuals that are heterozygous for all QTL in the backcrossing phase. This would make MAI not feasible in livestock breeding programmes. In many cases, however, immediate fixation of introgressed QTL alleles may not be required. Instead, the objective of the backcrossing phase can be to enrich the recipient breed with the favourable donor QTL alleles at sufficiently high frequency for selection following backcrossing. The effectiveness of such strategies was demonstrated by Chaiwong et al. (2002).

Marker-assisted improvement of synthetic lines

In MAI studies it is usually assumed that the aim is to recover the recipient breed genotype, except for the donor QTL. An alternative objective could be to aim simply for individuals with highest merit. Selection would then be for QTL genotype as well as EBV, estimated across breeds or lines. This EBV selection would replace background selection, as recovery of the recipient genotype is achieved through selection on genetic merit rather than through selecting for breed of origin. This strategy would be more competitive if the original breeds overlap in merit, and indeed, as was shown by Dominik et al. (2007), background selection based on anonymous markers would be less profitable.

Strategies for using markers to select within a hybrid population were first proposed by Lande and Thompson (1990). These strategies capitalize on population-wide LD that initially exists in crosses between lines or breeds. Thus, marker-QTL associations identified in the F2 generation can be selected for several generations, until the QTL or markers are fixed or the disequilibrium disappears. Zhang and Smith (1992) evaluated the use of markers in such a situation with selection on BLUP EBV. Although both studies considered the ideal situation of a cross between inbred lines, there will be opportunities to utilize a limited number of markers to select for favourable QTL regions that are detected in crosses between breeds, thereby enhancing the development of superior synthetics.

Pyasatian, Fernando and Dekkers (2006) investigated use of the whole genome approach of Meuwissen, Hayes and Goddard (2001) for MAS in a cross by including all markers as random effects in the model for genetic evaluation. They showed that this resulted in substantially greater responses to selection than selection on identified QTL regions only. Due to the much greater LD, whole genome selection in a cross can be accomplished with a much smaller number of markers compared with the number required for whole genome selection in an outbred population.

Within-breed selection

The procedures described previously for incorporating markers in genetic evaluation result in estimates of breeding values associated with QTL, together with estimates of polygenic breeding values. Alternatively, if molecular data are not incorporated into genetic evaluations, as will be the case for more ad hoc approaches and for gene tests for Mendelian characteristics, separate selection criteria will be available that capture the molecular information. The following three selection strategies can then be distinguished (Dekkers, 2004): (i) selection on the QTL information alone; (ii) tandem selection, with selection on QTL followed by selection on polygenic EBV; and (iii) selection on the sum of the QTL and polygenic EBV.
Selection on QTL or marker information alone ignores information that is available on all other genes (polygenes) that affect the trait and is expected to result in the lowest response to selection unless all genes that affect the trait are included in the QTL EBV. This strategy does not, however, require additional phenotypes other than those that are needed to estimate marker effects, and can be attractive when phenotype is difficult or expensive to record (e.g. disease traits, meat quality, etc.). Selection on the sum of the QTL and polygenic EBV is expected to result in maximum response in the short term, but may be suboptimal in the longer term because of losses in polygenic response (Gibson, 1994). Indexes of QTL and polygenic EBV can be derived that maximize longer-term response (Dekkers and van Arendonk, 1998) or a combination of short- and longer-term responses (Dekkers and Chakraborty, 2001). However, if selection is on multiple QTL and emphasis is on maximizing shorter-term response, selection on the sum of QTL and polygenic EBV is expected to be close to optimal.

Optimizing selection on a number of EBVs, indexes and genotypes, while also considering inbreeding rate and other practical considerations, is not a trivial task. Kinghorn, Meszaros and Vagg (2002) have proposed a mate selection approach that could be used to handle such problems. It can be expected that, with more widespread use of genotypic information for a larger number of regions, specific knowledge about individual QTL will become less interesting and will simply contribute to prediction of whole EBV or whole genotype.
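The three within-breed strategies distinguished above can be contrasted in a small sketch. It is an illustration only: the candidate records, the field names `qtl_ebv` and `polygenic_ebv`, and the size of the first tandem stage are hypothetical, and the optimized indexes of Dekkers and van Arendonk (1998) are not implemented.

```python
def select(candidates, strategy, n_select, n_first_stage=None):
    """Return the selected candidates under one of three strategies.

    "qtl_only": rank on QTL EBV alone
    "tandem"  : keep the best n_first_stage on QTL EBV, then rank on polygenic EBV
    "sum"     : rank on QTL EBV + polygenic EBV
    """
    by_qtl = sorted(candidates, key=lambda c: c["qtl_ebv"], reverse=True)
    if strategy == "qtl_only":
        return by_qtl[:n_select]
    if strategy == "tandem":
        if n_first_stage is None:
            raise ValueError("tandem selection needs n_first_stage")
        stage_one = by_qtl[:n_first_stage]
        stage_one.sort(key=lambda c: c["polygenic_ebv"], reverse=True)
        return stage_one[:n_select]
    if strategy == "sum":
        total = lambda c: c["qtl_ebv"] + c["polygenic_ebv"]
        return sorted(candidates, key=total, reverse=True)[:n_select]
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical selection candidates.
candidates = [
    {"id": "a", "qtl_ebv": 2.0, "polygenic_ebv": -1.0},
    {"id": "b", "qtl_ebv": 1.0, "polygenic_ebv": 1.5},
    {"id": "c", "qtl_ebv": 0.0, "polygenic_ebv": 3.0},
    {"id": "d", "qtl_ebv": 1.8, "polygenic_ebv": 0.4},
    {"id": "e", "qtl_ebv": -1.0, "polygenic_ebv": 0.2},
]

qtl_only = [c["id"] for c in select(candidates, "qtl_only", 2)]
tandem = [c["id"] for c in select(candidates, "tandem", 2, n_first_stage=3)]
on_sum = [c["id"] for c in select(candidates, "sum", 2)]
print(qtl_only, tandem, on_sum)
```

In this toy data set the three strategies select different pairs of animals. Candidate c, which has the best polygenic EBV but no favourable QTL effect, is chosen only under selection on the sum, which shows how the strategies differ in the weight given to polygenes.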

Meuwissen and Goddard (1996) published a simulation study that looked at the main characteristics determining efficiency of MAS using LE markers. They found that MAS could increase the rate of genetic improvement by up to 64 percent by selecting on the sum of QTL and polygenic EBV. Their work also demonstrated that MAS is mainly useful for traits where phenotypic measurement is less valuable because of: (i) low heritability; (ii) sex-limited expression; (iii) availability only after sexual maturity; and (iv) necessity to sacrifice the animal (e.g. slaughter traits). Selection of animals based on (most probable) QTL genotype will allow earlier and more accurate selection, increasing the short- and medium-term selection response.

Most simulation studies have assumed complete marker genotype information but in practice only a limited number of individuals will be genotyped. However, in an advanced breeding programme with complete phenotype and pedigree information, marker and QTL genotype probabilities could be derived for un-genotyped animals and genotyping strategies could be optimized to achieve a high value for the investments made. Marshall, Henshall and van der Werf (2002) looked at strategies to minimize genotyping cost in a sheep breeding programme. Close to maximal gain could be achieved when genotyping was undertaken only for high ranking males and animals whose marker genotype probability could not be derived with enough certainty based on information on relatives. Marshall, van der Werf and Henshall (2004) also looked at progeny testing of sires to determine family-specific marker-QTL phase within a breeding nucleus. Again, testing of a limited number of males provided a lot of information about phase for several generations of breeding animals, as progeny tested sires have relationships with descendants.
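How genotype probabilities for an un-genotyped animal can be derived from relatives, and how they might guide the decision to genotype, can be sketched for the simplest case of a single biallelic locus and information on the two parents only. This is an assumption-laden illustration and not the segregation analysis used in the studies cited above, which also draws on phenotypes and on all other relatives. The function names, the 0.9 certainty threshold and the allele frequency are hypothetical.

```python
import math


def transmit_prob(genotype_probs):
    """Probability that a parent transmits allele Q, given (P(qq), P(Qq), P(QQ))."""
    p_qq, p_het, p_QQ = genotype_probs
    return 0.5 * p_het + p_QQ


def offspring_genotype_probs(sire_probs, dam_probs):
    """(P(qq), P(Qq), P(QQ)) of an offspring, treating parental genotypes as independent."""
    s = transmit_prob(sire_probs)
    d = transmit_prob(dam_probs)
    return ((1 - s) * (1 - d), s * (1 - d) + (1 - s) * d, s * d)


def needs_genotyping(genotype_probs, certainty=0.9):
    """Flag an animal for genotyping when no genotype is sufficiently probable."""
    return max(genotype_probs) < certainty


# Sire genotyped as heterozygous; dam un-genotyped, so Hardy-Weinberg
# probabilities at an assumed allele frequency of 0.3 are used instead.
freq_Q = 0.3
sire = (0.0, 1.0, 0.0)
dam = ((1 - freq_Q) ** 2, 2 * freq_Q * (1 - freq_Q), freq_Q ** 2)

lamb = offspring_genotype_probs(sire, dam)
print(lamb, needs_genotyping(lamb))

# Offspring of two genotyped QQ parents need no test: the genotype is certain.
certain = offspring_genotype_probs((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
print(certain, needs_genotyping(certain))
```

Under these assumptions, genotyping effort would be directed at animals such as the first lamb, whose genotype remains uncertain, while animals whose genotype follows from their parents are skipped.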
However, in breeding programmes for more extensive production systems (beef, sheep), pedigree recording is often incomplete and only a small proportion of animals are genotyped. Moreover, these genotyped animals are not necessarily the key breeding animals. The utility of linked markers will be even more limited if pedigree relationships cannot be used to resolve genotype probabilities and marker-QTL phase of un-genotyped individuals.

A second point of caution is that many studies on MAS have taken a single-trait approach and shown that genetic markers could have a large impact on responses for traits that are difficult to improve by phenotypic selection. However, within the context of a multitrait breeding objective, the overall impact of such markers on the breeding goal may be less because a greater response for one trait often comes at the expense of another. For example, genetic markers for carcass traits improve the ability to select (i.e. earlier, with higher accuracy) for such traits, but selection emphasis for other traits is reduced. Therefore, the overall effect of MAS on the breeding programme will generally be much smaller than predicted for single-trait MAS-favourable cases. The main effect of MAS would be to shift the selection response in favour of the marked traits, rather than to achieve much additional overall response. Hence, while it will be easier to select for carcass traits and disease resistance, further improvement for these traits will be at the expense of genetic change for production traits (growth, milk).

The impact of MAS on the rate of genetic gain may be limited in conventional breeding programmes (ranging up to perhaps 10 percent extra gain) unless the variation in profitability is dominated by traits that are hard to measure. However, new technologies often lead to other breeding programme designs being closer to optimal. Genotypic information has extra value in the case of early selection and where within-family variance can be exploited, which is particularly the case in programmes where reproductive technologies are used. Reproductive technologies usually lead to early selection and more emphasis on between-family selection. DNA marker technology and reproductive technologies are therefore highly synergistic and complementary (van der Werf and Marshall, 2005) and gene markers have much more value in such programmes. Gene marker information is also clearly valuable in introgression programmes, as demonstrated by simulation (Chaiwong et al., 2002; Dominik et al., 2006) as well as in practice (Nimbkar, Pardeshi and Ghalsasi, 2005). Yet, although these examples are favourable to the value of gene marker information, the added value of MAS still relies heavily on a high degree of trait and pedigree recording.

Opportunities for MAS in developing countries

Complete phenotypic and pedigree information is often only available in intensive breeding units. Therefore, in the context of low input production systems, some questions can be raised concerning the validity and practicality of the simulation studies described above, and it would be more difficult to realize the value of marker information. It would be harder and more expensive to determine the linkage phase in the case of using linked markers. Moreover, even if the genetic marker were a direct or LD marker, its effect on phenotype would have to be estimated for the population and the environment in which it is used.
This would require phenotypes and genotypes on a sample of a rather homogeneous population to avoid spurious associations that could result from unknown population stratification. Therefore, a gene marker for a QTL is likely to be most successful in an environment with intensive pedigree and performance recording. Nevertheless, in low input environments, direct and LD markers will be more useful than LE markers because the latter require routine recording of phenotypes and genotypes to estimate QTL effects within families. In addition to MAS within local breeds, several other strategies for breed improvement could be pursued in developing countries, including gene introgression and MAS within synthetic breeds. This would be most advantageous for introducing specific disease resistance alleles into breeds with improved production characteristics to make them more tolerant to the environments encountered in developing countries. Gene introgression is, however, a long and expensive process and only worthwhile for genes with large effects. MAS within synthetic breeds, e.g. a cross between local and improved temperate climate breeds, can allow development of a breed that is based on the best of both breeds (e.g. Zhang and Smith, 1992). Because of the extensive LD within the cross, a limited number of markers would be needed. Care should, however, be taken to avoid the impact of genotype x environment interactions if MAS is implemented in a more controlled environment.

References

Andersson, L. Genetic dissection of phenotypic diversity in farm animals. Nature Revs. Genet. 2:
Chaiwong, N., Dekkers, J.C.M., Fernando, R.L. & Rothschild, M.F. Introgressing multiple QTL in backcross breeding programs of limited size. Proc. 7th Wld. Congr. Genet. Appl. Livest. Prodn. Electronic Communication No. 22: 08. Montpellier, France.
Darvasi, A. & Soller, M. Advanced intercross lines: an experimental population for fine genetic mapping. Genetics 141:
Dekkers, J.C.M. & van Arendonk, J.A.M. Optimizing selection for quantitative traits with information on an identified locus in outbred populations. Genet. Res. 71:
Dekkers, J.C.M. Commercial application of marker- and gene-assisted selection in livestock: strategies and lessons. J. Anim. Sci. 82: E313-E328.
Dekkers, J.C.M. & Chakraborty, R. Potential gain from optimizing multi-generation selection on an identified quantitative trait locus. J. Anim. Sci. 79:
Dominik, S., Henshall, J., O'Grady, J. & Marshall, K.J. Factors influencing the efficiency of a marker assisted introgression program in merino sheep. Genet. Sel. Evol. In press.
Farnir, F., Coppieters, W., Arranz, J.-J., Berzi, P., Cambisano, N., Grisart, B., Karim, L., Marcq, F., Moreau, L., Mni, M., Nezer, C., Simon, P., Vanmanshoven, P., Wagenaar, D. & Georges, M. Extensive genome-wide linkage disequilibrium in cattle. Genome Res. 10:
Fernando, R.L. Incorporating molecular markers into genetic evaluation. Session G6.1. Proc. 55th Meeting of the European Association of Animal Production. 5-9 September 2004, Bled, Slovenia.
Fernando, R.L. & Grossman, M. Marker-assisted selection using best linear unbiased prediction. Genet. Sel. Evol. 21:
Gibson, J.P. Short-term gain at the expense of long-term response with selection of identified loci. Proc. 5th Wld. Cong. Genet. Appl. Livest. Prodn. CD-ROM Communication No. 21: University of Guelph, Canada.
Goddard, M.E. & Meuwissen, T.H.E. The use of linkage disequilibrium to map quantitative trait loci. Austr. J. Exp. Agric. 45:
Heifetz, E.M., Fulton, J.E., O'Sullivan, N., Zhao, H., Dekkers, J.C.M. & Soller, M. Extent and consistency across generations of linkage disequilibrium in commercial layer chicken breeding populations. Genetics 171:
Hospital, F. & Charcosset, A. Marker-assisted introgression of quantitative trait loci. Genetics 147:
International Chicken Polymorphism Map Consortium. A genetic variation map for chicken with 2.8 million single nucleotide polymorphisms. Nature 432:
Kinghorn, B.P. Use of segregation analysis to reduce genotyping costs. J. Anim. Breed. Genet. 116:
Kinghorn, B.P., Meszaros, S.A. & Vagg, R.D. Dynamic tactical decision systems for animal breeding. Proc. 7th Wld. Congr. Genet. Appl. Livest. Prodn. Communication No. [...]. Montpellier, France.
Koudandé, O.D., Iraqi, F., Thomson, P.C., Teale, A.J. & van Arendonk, J.A.M. Strategies to optimize marker-assisted introgression of multiple unlinked QTL. Mammal. Genome 11:

Lande, R. & Thompson, R. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124:
Lee, S.H. & van der Werf, J.H.J. The role of pedigree information in combined linkage disequilibrium and linkage mapping of quantitative trait loci in a general complex pedigree. Genetics 169:
Lee, S.H. & van der Werf, J.H.J. An efficient variance component approach implementing an average information REML suitable for combined LD and linkage mapping with a general complex pedigree. Genet. Sel. Evol. 38:
Marshall, K., Henshall, J. & van der Werf, J.H.J. Response from marker-assisted selection when various proportions of animals are marker typed: a multiple trait simulation study relevant to the sheep meat industry. Anim. Sci. 74:
Marshall, K.J., van der Werf, J.H.J. & Henshall, J. Exploring major gene-marker phase typing strategies in marker assisted selection schemes. Anim. Sci. 78:
Meuwissen, T.H.E. & Goddard, M.E. The use of marker haplotypes in animal breeding schemes. Genet. Sel. Evol. 28:
Meuwissen, T.H.E. & Goddard, M.E. Fine mapping of quantitative trait loci using linkage disequilibria with closely linked marker loci. Genetics 155:
Meuwissen, T.H.E. & Goddard, M.E. Prediction of identity by descent probabilities from marker haplotypes. Genet. Sel. Evol. 33:
Meuwissen, T.H.E., Hayes, B. & Goddard, M.E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:
Nejati-Javaremi, A., Smith, C. & Gibson, J.P. Effect of total allelic relationship on accuracy and response to selection. J. Anim. Sci. 75:
Nimbkar, C., Pardeshi, V. & Ghalsasi, P. Evaluation of the utility of the FecB gene to improve the productivity of Deccani sheep in Maharashtra, India. pp. [...]. In H.P.S. Makkar & G.J. Viljoen, eds. Applications of gene-based technologies for improving animal production and health in developing countries. Netherlands, Springer.
Pong-Wong, R., George, A.W., Woolliams, J.A. & Haley, C.S. A simple and rapid method for calculating identity-by-descent matrices using multiple markers. Genet. Sel. Evol. 33:
Pyasatian, N., Fernando, R.L. & Dekkers, J.C.M. Genomic selection for composite line development using low density marker maps. In Proc. 8th Wrld. Congr. Genet. Appl. Livest. Prodn., Paper [...]. Belo Horizonte, Brazil.
Rothschild, M.F. & Plastow, G.S. Advances in pig genomics and industry applications. AgBiotechNet. 1: 1-7.
van der Werf, J.H.J. & Marshall, K. Combining gene-based methods and reproductive technologies to enhance genetic improvement of livestock in developing countries, pp. [...]. In H.P.S. Makkar & G.J. Viljoen, eds. Applications of gene-based technologies for improving animal production and health in developing countries. Netherlands, Springer.
Zhang, W. & Smith, C. Computer simulation of marker-assisted selection utilizing linkage disequilibrium. Theor. Appl. Genet. 83:

Chapter 11
Marker-assisted selection in poultry
Dirk-Jan de Koning and Paul M. Hocking

Summary

Among livestock species, chicken has the most extensive genomics toolbox available for detection of quantitative trait loci (QTL) and marker-assisted selection (MAS). The uptake of MAS is therefore not limited by technical resources but mostly by the priorities and financial constraints of the few remaining poultry breeding companies. With the cost of genotyping decreasing rapidly, an increase in the use of direct associations between traits and single nucleotide polymorphisms (SNPs) in MAS can be predicted.

Current status of chicken breeding programmes

Poultry production has been the fastest growing livestock industry over the last decades, especially in middle- and low-income countries (Taha, 2003). In 2001, poultry production accounted for 70 million tonnes of poultry meat and 47 million tonnes of eggs (Arthur and Albers, 2003). Among poultry, chickens account for 85 percent of meat production and 96 percent of egg production (Bilgili, 2001; Arthur and Albers, 2003; Taha, 2003).

While chickens have been domesticated and selected for thousands of years, modern poultry breeding started during the 1950s. One of the most notable features is the diversification between chickens bred for meat production (broilers) and those bred for table egg production (layers). This is a result of the negative genetic correlation in chicken between growth and reproductive traits. Within breeds, there is a separation into male and female lines that are crossed to produce commercial hybrids. In broilers, male lines are selected for growth and carcass quality whereas in female lines less emphasis is placed on growth and more on reproductive traits such as egg production and hatchability. In table egg-laying chickens, male lines are selected for high egg production and high egg weight whereas in female lines selection may emphasize rate of lay with less attention to egg size. In both broiler and layer lines the primary selection goal is the improvement of feed efficiency and economic gain.

Significant heterosis for fitness traits in poultry is well established and all commercial poultry (chickens, turkeys and ducks) are hybrids that are produced in a selection and multiplication pyramid that is illustrated in Figure 1. Crossing male and female lines maximizes heterosis at the grandparent and parent levels of the hierarchy, and allows traits that have been genetically improved in different lines to be combined in the commercial birds.
The power of this structure to deliver large economic gains in chickens is a result of their high reproductive rate and short generation interval and is clearly illustrated by this example of an egg-laying improvement programme. Even greater numerical efficiency is possible in broilers: a single pen containing ten females and one male at the nucleus level might produce 150 great-grandparents after selection (line D of Figure 1); these will produce 50 female offspring each, or 7 500 grandparents, in a year and these grandparents will generate 375 000 female parent stock during the succeeding year. These hybrid parent females will each produce over 130 male and female offspring and generate nearly 50 million commercial broilers or [...] tonnes of meat. The figure illustrates the rapidity with which genetic improvement at the nucleus level can be disseminated to commercial flocks and the fact that relatively few pure-line birds are needed to produce very large numbers of commercial layers.

The existence of this breeding structure results in rapid transmission of genetic change to commercial flocks (about four years), including traits that might be improved by MAS. Conversely, undesirable genetic change can also be disseminated very quickly to a very large number of birds. In practice, far more birds are kept at the nucleus level than shown in Figure 1, where the numbers presented are purely for illustrative purposes.

Figure 1. Multiplication pyramid for a four-strain commercial hybrid layer
[Figure: columns for generation, male lines, female lines and time. Pure line selection of lines A, B, C and D (year 0); grandparents A x B and C x D (year 1); parents AB x CD (year 2); commercial hybrids ABCD, 32.2 million (years 3 to 4).]
The boxes in bold are the multiplication (crossing) phase. The top line of the boxes indicates the lowest level of the pure-line (nucleus) selection stage. The numbers in the boxes refer to the minimal numbers of birds in line A, and corresponding numbers in lines B, C and D, that are required to generate hybrid laying hens at the commercial level and the numbers that result from this process. The time scale is indicated on the right hand side of the figure assuming a generation interval of one year. Adapted from Bowman.

Status of functional genomics in chicken

Among the various livestock species, chicken has the most comprehensive genomic toolbox. The chicken genome consists of 39 pairs of chromosomes: eight cytologically distinct macrochromosomes, the sex chromosomes Z and W and 30 pairs of cytologically indistinguishable microchromosomes. Linkage maps were developed initially using three separate mapping populations (Bumstead and Palyga, 1992; Crittenden et al., 1993; Groenen et al., 1998) that were later merged to provide a consensus map with [...] markers (Groenen et al., 2000). A good overview of the consensus linkage map and the cytogenetic map can be found in the First Report of Chicken Genes and Chromosomes 2000 and its successor in 2005 (Schmid et al., 2000, 2005). All chicken maps can be viewed at [...]. More recently, the chicken genome became the first livestock genome to be sequenced, with a six-fold coverage (six full genome equivalents) (Hillier et al., 2004). The chicken genome sequence can be browsed via a number of Web sites, which are summarized at org/resources/databases.html.
The genome sequence effort was accompanied by partial sequencing of three distinct poultry breeds (a broiler, a layer and a Chinese Silky), to identify SNPs between and among these and the reference sequence of the Red Jungle Fowl. This resulted in an SNP map consisting of about 2.8 million SNPs (Wong et al., 2004). The chicken polymorphism database (ChickVD) can be browsed at [...] (Wang et al., 2005). The SNP map will facilitate the development of genome-wide SNP assays, containing between [...] and [...] SNPs per assay. For the study of gene expression, there are various complementary DNA (cDNA) microarrays available, varying from targeted arrays (immune, neuroendocrine, embryo) to whole genome generic arrays. Recently, a whole-genome Affymetrix chip was developed in collaboration with the chicken genomics community (www.affymetrix.com and org/resources/affymetrix-faq1.htm). Altogether, this provides a very comprehensive toolbox to study the functional genomics of chicken, whether this be an individual gene or the entire genome.

Current uptake of MAS in chicken

Implementation of MAS requires knowledge of marker-trait associations based on QTL and candidate gene studies, and ideally from studies of the underlying genetic mechanisms. There have been a large number of QTL studies in chicken covering a wide range of traits including growth, meat quality, egg production, disease resistance (both infectious diseases and production diseases) and behaviour. These studies have recently been reviewed (Hocking, 2005). A total of 27 papers reported 114 genome-wide significant QTL from experimental crosses largely involving White Leghorn and broiler lines. A summary of the QTL that have been detected is presented in Table 1. While the abundance of QTL would indicate ample opportunity for MAS in chicken, it must be noted that nearly all studies were carried out in experimental crosses and hence the results do not reflect QTL within selected populations. However, these results do provide a good starting point to search for QTL within commercial populations, as demonstrated for growth and carcass traits where many published QTL also explained variation within a broiler dam line (de Koning et al., 2003; de Koning et al., 2004).
To the authors' knowledge, there are no other QTL studies within commercial lines of poultry in the public domain. Of the QTL from experimental crosses, only a small number have been followed up by fine mapping analyses and the responsible gene mutation has only been described for some disease resistance QTL (Liu et al., 2001a, b; Liu et al., 2003).

Table 1. Quantitative traits and chromosomal locations in experimental chicken crosses
Trait | Chromosome (number of QTL) | Total QTL | Number of papers
Behaviour/fear | 1, 2(3), 3, 4(2), 7, 10, 27, E | [...] | [...]
Body fat | 1(2), 3, 5, 7(2), 15, [...] | [...] | [...]
Body weight | 1(7), 2(4), 3(4), 4(5), 5, 8(2), 11, 12, 13(2), 27(3), Z(2) | 32 | 9
Carcass quality | 1(2), 2, 3, 4(2), 5(2), 6(2), 7(3), 8(2), 9, 13(2), 27, Z(2) | 21 | 1
Disease resistance | 1(4), 2(2), 3(2), 4(2), 5(5), 6(2), 7, 8, 14, 18, 27, Z | [...] | [...]
Egg number | 8, Z(2) | 3 | 1
Egg quality | 2, 11, Z | 3 | 2
Egg weight | 1, 2, 3, 4(3), 14, 23, Z | 9 | 3
Feed intake | 1, 4 | 2 | 2
Sexual maturity | Z(2) | 2 | 2
Source: Hocking, 2005.

A good example of how QTL mapping combined with functional studies can identify functional variants is Marek's disease. Marek's disease (MD) is an infectious viral disease caused by a member of the herpes virus family and costs the poultry industry about US$1 000 million per annum. An F2 cross between resistant and susceptible lines was challenged experimentally and genotyped, providing the data for a QTL analysis that resulted in a total of seven QTL for susceptibility to MD (Vallejo et al., 1998; Yonash et al., 1999). Subsequently, the founder lines of the F2 cross were used for a micro-array study to identify genes that were differentially expressed between the two lines following artificial infection. Fifteen of these genes were mapped onto the chicken genome and two of them mapped to a QTL region for resistance to MD (Liu et al., 2001a). At the same time, protein interaction studies between a viral protein (SORF2) and a chicken splenic cDNA library revealed an interaction with the chicken growth hormone (GH) (Liu et al., 2001b). This led to the detection of a polymorphism in the GH gene that was associated with differences in the number of tumours between the susceptible and the resistant line (Liu et al., 2001b). GH coincided with a QTL for resistance and was differentially expressed between founder lines (Liu et al., 2001a).

Alongside the various genome scans for QTL, a large number of candidate gene studies have been carried out. The majority of studies summarized in Table 2 have been conducted on White Leghorn strains and have utilized restriction fragment length polymorphisms (RFLPs), SNPs or single strand conformation polymorphisms (SSCPs). These techniques require both that the gene is known and that the experimenter is able to sequence part of the gene to detect polymorphisms that distinguish the experimental lines. Candidate gene studies have been used in two ways.
First, candidate genes may be used merely as a marker for a trait (typically disease) based on prior knowledge and, second, and much less often, to search for the mutation within a gene that is associated with phenotypic variation in a trait. Currently, potential (candidate) genes for a QTL may be obtained from a knowledge of physiology (Dunn et al., 2004) or comparative linkage maps (i.e. locating genes that are in the location of the QTL based on common areas of the gene-rich genomes of different species, usually human and mouse). There are likely to be many more of the second type of candidate gene studies as information from large-scale gene expression and proteomic experiments begins to suggest novel gene candidates for traits of commercial and biological importance. It should also be noted that there is good evidence that genetic variation is not limited to nuclear DNA: associations between polymorphisms in mitochondrial genes and MD resistance, body weight and egg shell quality were reported by Li et al. (1998a, b).

Despite great enthusiasm among breeding companies for involvement in functional genomics research in poultry, there are very few applications of MAS in commercial poultry breeding. One existing example is the use of blood group markers to improve resistance to MD, where selection of haplotypes B21 and B12 based on conventional serological tests has been widely used (McKay, 1998). In discussions with the industry it is clear that most interest is in QTL or candidate genes for resistance to diseases like MD or ascites, a genetic condition associated with pulmonary hypertension that leads to mortality in fast-growing birds. There is also considerable interest among breeders of layer lines for egg quality, especially egg shell quality because of its importance for food

safety. For production traits such as growth and egg numbers, breeders make sufficient progress using traditional selection methods, and they expect little improvement from MAS for such traits unless markers can be used to increase the accuracy of selection.

Table 2
Association of candidate genes with quantitative traits in poultry
Trait | Chromosomes¹ | Gene symbols | References
Age at first egg | 1,2,3 | GH, NPY, ODC | Feng et al., 1997; Dunn et al., 2004; Parsanejad et al., 2004
Disease resistance (E. coli) | 16 | MHC1, MHC4, TAP2 | Yonash et al., 1999
Disease resistance (MD²) | 1,NK | GH, LY6E | Kuhnlein et al., 1997; Liu et al., 2001a,b and 2003
Disease resistance (Sal³) | 4,6,7,16,19,1,17,NK | TNC, PSAP, NRAMP1⁴, MHC1, CASP1, IAP1, TLR4, TLR5 | Hu et al., 1997; Lamont et al., 2002; Leveque et al., 2003; Liu and Lamont, 2003; Iqbal et al., 2005
Double yolked eggs | 10 | GNRHR | Dunn et al., 2004
Egg production | Z,1,20 | GHR, GH, PEPCK | Feng et al., 1997; Kuhnlein et al., 1997; Parsanejad et al., 2003
Egg weight | 1 | IGF1 | Nagaraja et al., 2000
Eggshell quality | 1,3,20 | IGF1, ODC, PEPCK | Nagaraja et al., 2000; Parsanejad et al., 2003, 2004
Body fat | 1,1,5,Z | GH, IGF1, TGFβ3, GHR | Feng et al., 1998; Fotouhi et al., 1993; Li et al., 2003; Zhou et al., 2005
Feed efficiency | 3,20 | ODC, PEPCK | Parsanejad et al., 2003 and 2004
Body weight/carcass quality | 1,3,5,Z,1,1,1 | IGF1, ODC, TGFβ3, GHR, APOA2, PIT1 | Feng et al., 1998; Li et al., 2003; Jiang et al., 2004; Parsanejad et al., 2004; Li et al., 2005; Zhou et al., 2005
Organ weight (spleen) | 3,5,32 | TGFβ2, TGFβ3, TGFβ4⁵ | Li et al., 2003
Skeletal traits | 1,3,5,32 | IGF1, TGFβ2, TGFβ3, TGFβ4⁵ | Li et al., 2003; Zhou et al., 2005
¹ NK = gene has not yet been assigned to a chromosome.
² Marek's disease.
³ Salmonellosis.
⁴ Now known as Slc11a1.
⁵ TGFβ4 in the paper is now known to be TGFβ1.
Nonetheless, among breeders of broiler stock there is interest in markers for traits that are difficult to measure, such as feed efficiency and meat quality, in addition to disease resistance.

Potential for MAS in chicken

The technical aspects and potential implications of implementing MAS in livestock are discussed in Chapter 10 and Dekkers (2004), and van der Beek and van Arendonk (1996) evaluated the technical aspects of MAS in poultry breeding. A review of the potential of MAS in poultry is provided by Muir (2003), but this includes many of the technical issues that are common across livestock species. This chapter therefore focuses on poultry-specific issues, and readers are referred to Chapter 10 or Muir (2003) for a more comprehensive overview of applications and limitations of MAS. Muir (2003) identified two cases where MAS could increase the selection intensity in poultry breeding: (i) traits that are measured later in life or are costly to measure (such as egg production and feed efficiency for broiler breeders); and (ii) selection within full-sib families for sex-limited traits (e.g. male chicks for egg production). Accuracy of selection can also be improved via MAS when selecting between full-sib families for sex-limited traits and traits that cannot be measured directly on one or both

sexes and/or have a low heritability (e.g. egg production, disease resistance, carcass quality and welfare traits). Limiting factors for application of MAS (Muir, 2003) include biological factors (reproductive capacity) and many theoretical considerations related to the effectiveness of MAS (e.g. diverting selection pressure from polygenes to a single marked gene), which are generally applicable to MAS in livestock (Dekkers, 2004; Chapter 10). One of the concerns of Muir (2003) is the expected lack of major QTL for traits that have been under selection for many generations, an expectation based on simulation results. However, recent QTL studies within commercial lines of pigs (Evans et al., 2003; Nagamine et al., 2003) and poultry (de Koning et al., 2003, 2004) have demonstrated that many sizeable QTL are still segregating in commercial populations despite decades of selection.

There is strong academic interest in chicken genomics outside agriculture from, among others, developmental biologists and evolutionary geneticists, and this has contributed greatly to the development of the current functional genomics toolbox available for chicken. Among livestock species, chickens are best placed to pioneer new approaches where QTL studies are complemented by gene expression studies (Liu et al., 2001a) or where they become fully integrated within genetical genomics (de Koning, Carlborg and Haley, 2005; de Koning and Haley, 2005).

If poultry breeders decide to embrace MAS, one of the main questions is whether they are prepared to restructure their breeding programmes around MAS or whether they will implement MAS around their current breeding strategies.
Adopting the terminology of Dekkers (2004), there are three levels of MAS: gene-assisted selection (GAS), where the functional mutation and its effects are known; linkage disequilibrium MAS (LD-MAS), where a marker (or marker haplotype) is in population-wide disequilibrium with a QTL; and linkage equilibrium MAS (LE-MAS), where markers are in linkage equilibrium with the QTL at the population level, but linkage disequilibrium exists within families. A fourth type of MAS that was recently proposed is genome-wide MAS (GW-MAS), where dense markers (i.e. SNPs) across the genome are used to predict the genetic merit of an individual without targeting any individual QTL or measuring (expensive) phenotypes in every generation (Meuwissen, Hayes and Goddard, 2001).

Integrating MAS with current evaluations is most straightforward for GAS and LD-MAS because the QTL effect can be included in routine evaluations as a fixed effect (Chapter 10). LE-MAS, on the other hand, requires extensive genotyping and fairly complicated statistical procedures (Wang, Fernando and Grossman, 1998), while GW-MAS reduces the genome to a black box but does not require selection of QTL using arbitrary thresholds. Furthermore, the dense marker information required for GW-MAS may dispense with often faulty pedigree records because all pedigree information is encoded in the genome-wide genotypes.

In terms of quantitative genetic theory, there are ongoing developments in the tools required to detect and evaluate QTL in arbitrary pedigrees, moving away from strictly additive-dominance models to epistasis and parent-of-origin effects (Liu, Jansen and Lin, 2002; Shete and Amos, 2002). At the same time, the technology to analyse more than SNPs in a single assay is available, and a cost of as little as US$0.02 per genotype is likely for chicken

SNPs in the near future. However, the fine mapping and characterization of identified QTL remain costly and time-consuming processes and are often restricted to the most promising QTL, resulting in hundreds of QTL that will never make it past the stage of mapping to a 30 cM confidence interval.

While current research and developments in poultry functional genomics are relevant to all four possible applications of MAS to livestock, poultry breeders need to decide at what level they want to exploit molecular information and for which traits. The emerging picture is that breeders are more comfortable with known gene mutations, as this provides an easy route to implementation as well as knowledge about the underlying biology. Furthermore, there is concern that the marker-trait linkage will break down within relatively few generations of selection in large commercial flocks. While candidate gene studies would provide the quickest route to implementation, fine mapping and characterization of QTL (e.g. using expression studies) may reveal gene variants that are not obvious candidate genes for quantitative traits.

Potential for MAS in poultry in developing countries

Owing to the relatively low value of single animals, the high reproductive rate in poultry and good portability of eggs or day-old hatchlings, the concentration of resources is very high in the poultry breeding industry and all poultry breeding is privately owned. Fifty years ago there were many primary breeders in each and every industrialized country, but not so long ago there were only 20 breeding companies worldwide. Today, three groups of primary breeders dominate the international layer market. Equally, in the chicken meat industry, there are four major players in broiler breeding worldwide (Flock and Preisinger, 2002).
The concentration process is probably now complete, and the present players are sufficient to meet the global supply for million eggs as final products. A similar trend is expected in the pig industry, where international breeding companies of hybrid products are increasing their market share (Preisinger, 2004).

For large-scale farming of broilers and layers in developing countries there are additional challenges with regard to heat stress and potential disease pressure. With increasing poultry production in developing countries, breeding companies may give priority to using breeding and molecular tools to address these additional challenges. Chickens are very efficient in converting grain into valuable meat and egg protein, and smallholder chicken production can be valuable for sustaining the livelihoods of farmers in the developing world. However, this type of poultry production would require robust dual-purpose (meat and egg) birds rather than specialized broiler and layer lines. It is unlikely that the commercial breeders will develop such lines, but there may be scope for national or international research organizations to do so. Any MAS would have to be done at the institutional level where the line is developed and would necessitate prior knowledge of trait-marker associations at the farm level. The implementation of whole genome SNP approaches to farm level recording might facilitate progress in this area, but the challenges, both practical and theoretical, are formidable.

Concluding remarks

Among livestock species, chickens have by far the most comprehensive genomic toolbox. However, uptake of MAS will

depend strongly on whether the industry wishes to supplement its current selection programme with a known gene variant or whether it is prepared to restructure breeding programmes around MAS. Compared with, for instance, the dairy cattle industry, the poultry breeding community may be slower to embrace emerging complex approaches to MAS. This is somewhat surprising because the closed structure of the poultry breeding pyramid offers much better protection of intellectual property than the dairy cattle industry where semen from highest ranking bulls is available for all. On the other hand, the fact that blood groups have been used to select for resistance to MD suggests that poultry breeders have some experience and skills in this type of selection. Poultry breeding companies contribute significantly to poultry genomics research but may not be fully convinced about the economic feasibility of MAS. To implement MAS successfully, a company must tackle the problems of identifying the traits to select and their economic significance, the lack of current knowledge of the genes or markers associated with these traits, and their association with other economic selection criteria. The current toolbox provides the means to answer some of these questions but there are obvious concerns about human and capital resources and the potential loss of gains in other traits in a competitive market. Coupled with these reservations must be the very evident success of current breeding programmes in achieving many desirable commercial goals.

Acknowledgements

The Roslin Institute is supported by a core strategic grant from the Biotechnology and Biological Sciences Research Council (BBSRC). The authors gratefully acknowledge information on breeding objectives from representatives of the poultry breeding industry.

References

Arthur, J.A.
& Albers, G.A.A Industrial perspective on problems and issues associated with poultry breeding. In W. M. Muir & S. E. Aggrey, eds. Poultry genetics, breeding and biotechnology, pp Wallingford, UK, CABI Publishing. Bilgili, S.F Poultry products and processing in the international market place. Proc. Internat. Anim. Agric. & Food Sci. Conf. Indianapolis, IN, USA (available at Bowman, J.C An introduction to animal breeding. The Institute of Biology s Studies in Biology No 46. London, Edward Arnold. Bumstead, N. & Palyga, J A preliminary linkage map of the chicken genome. Genomics 13: Crittenden, L.B., Provencher, L., Sanatangello, L., Levin, I., Abplanalp, H., Briles, R.W., Briles, W.E. & Dodgson, J.B Characterization of a Red Jungle Fowl by White Leghorn backcross reference population for molecular mapping of the chicken genome. Poultry Sci. 72: de Koning, D.J., Carlborg, O. & Haley. C.S The genetic dissection of immune response using gene-expression studies and genome mapping. Vet. Immun. & Immunopath. 105: de Koning, D.J. & Haley, C.S Genetical genomics in humans and model organisms. Trends Genetics 21: de Koning, D.J., Haley, C.S., Windsor, D., Hocking, P.M., Griffin, H., Morris, A., Vincent, J. & Burt, D.W Segregation of QTL for production traits in commercial meat-type chickens. Genetic. Res. 83:

de Koning, D.J., Windsor, D., Hocking, P.M., Burt, D.W., Law, A., Haley, C.S., Morris, A., Vincent, J. & Griffin, H Quantitative trait locus detection in commercial broiler lines using candidate regions. J. Anim. Sci. 81: Dekkers, J.C.M Commercial application of marker- and gene-assisted selection in livestock: strategies and lessons. J. Anim. Sc. 82 E-Suppl: E313 E328. Dunn, I.C., Miao, Y.W., Morris, A., Romanov, M.N., Wilson, P.W. & Waddington, D A study of association between genetic markers in candidate genes and reproductive traits in one generation of a commercial broiler breeder hen population. Heredity 92: Evans, G.J., Giuffra, E., Sanchez, A., Kerje, S., Davalos, G., Vidal, O., Illan, S., Noguera, J.L., Varona, L., Velander, I., Southwood, O.I., de Koning, D.J., Haley, C.S., Plastow, G.S. & Andersson, L Identification of quantitative trait loci for production traits in commercial pig populations. Genetics 164: Feng, X.P., Kuhnlein, U., Aggrey, S.E., Gavora, J.S. & Zadworny, D Trait association of genetic markers in the growth hormone and the growth hormone receptor gene in a White Leghorn strain. Poultry Science 76: Feng, X.P., Kuhnlein, U., Fairfull, R.W., Aggrey, S.E., Yao, J. & Zadworny, D A genetic marker in the growth hormone receptor gene associated with body weight in chickens. J. Heredity 89: Flock, D.K. & Preisinger, R Breeding plans for poultry with emphasis on sustainability. 7th World Congress on Genetics Applied to Livestock Production, August, 2002, Montpellier, France, Session 24: Sustainable breeding plans in developed countries. Communication Fotouhi, N., Karatzas, C.N., Kuhnlein, U. & Zadworny, D Identification of growth-hormone DNA polymorphisms which respond to divergent selection for abdominal fat-content in chickens. Theor. Appl. Genet. 
85: Groenen, M.A., Cheng, H.H., Bumstead, N., Benkel, B.F., Briles, W.E., Burke, T., Burt, D.W., Crittenden, L.B., Dodgson, J., Hillel, J., Lamont, S., de Leon, A.P., Soller, M., Takahashi, H. & Vignal, A A consensus linkage map of the chicken genome. Genome Research 10: Groenen, M.A., Crooijmans, R.P., Veenendaal, A., Cheng, H.H., Siwek, M. & van der Poel, J.J A comprehensive microsatellite linkage map of the chicken genome. Genomics 49: Hillier, L.W., Miller, W., Birney, E., Warren, W., Hardison, R.C., Ponting, C.P., Bork, P., Burt, D.W., Groenen, M.A., Delany, M.E., Dodgson,.J.B., Chinwalla, A.T., Cliften, P.F., Clifton, S.W., Delehaunty, K.D., Fronick, C., Fulton, R.S., Graves, T.A., Kremitzki, C., Layman, D., Magrini, V., McPherson, J.D., Miner, T.L., Minx, P., Nash, W.E., Nhan, M.N., Nelson, J.O., Oddy, L.G., Pohl, C.S., Randall-Maher, J., Smith, S.M., Wallis, J.W., Yang, S.P., Romanov, M.N., Rondelli, C.M., Paton, B., Smith, J., Morrice, D., Daniels, L., Tempest, H.G., Robertson, L., Masabanda, J.S., Griffin, D.K., Vignal, A., Fillon, V., Jacobbson, L., Kerje, S., Andersson, L., Crooijmans, R.P., Aerts, J., van der Poel, J.J., Ellegren, H.. Caldwell, R.B., Hubbard, S.J., Grafham, D.V., Kierzek, A.M., McLaren, S.R., Overton, I.M., Arakawa, H., Beattie, K.J., Bezzubov, Y., Boardman, P.E., Bonfield, J.K., Croning, M.D., Davies, R.M., Francis, M.D., Humphray, S.J., Scott, C.E., Taylor, R.G., Tickle, C., Brown, W.R., Rogers, J., Buerstedde, J.M., Wilson, S.A., Stubbs, L., Ovcharenko, I., Gordon, L., Lucas, S., Miller, M.M., Inoko, H., Shiina, T., Kaufman, J., Salomonsen, J., Skjoedt, K., Wong, G.K., Wang, J., Liu, B., Wang, J., Yu, J., Yang, H., Nefedov, M., Koriabine, M., Dejong, P.J., Goodstadt, L., Webber, C., Dickens, N.J., Letunic, I., Suyama, M., Torrents, D., Von, M.C., Zdobnov, E.M., Makova, K., Nekrutenko, A.,

Elnitski, L., Eswara, P., King, D.C., Yang, S., Tyekucheva, S., Radakrishnan, A., Harris, R.S., Chiaromonte, F., Taylor, J., He, J., Rijnkels, M., Griffiths-Jones, S., Ureta-Vidal A., Hoffman, M.M., Severin, J., Searle, S.M., Law, A.S., Speed, D., Waddington, D., Cheng, Z., Tuzun, E., Eichler, E., Bao, Z., Flicek, P., Shteynberg, D.D., Brent, M.R., Bye, J.M., Huckle, E.J., Chatterji, S., Dewey, C., Pachter, L., Kouranov, A., Mourelatos, Z., Hatzigeorgiou, A.G., Paterson, A.H., Ivarie, R., Brandstrom, M., Axelsson, E., Backstrom, N., Berlin, S.,Webster, M.T., Pourquie, O., Reymond, A., Ucla, C., Antonarakis, S.E., Long, M.J.J., Emerson, J.J Betran, E., Dupanloup, I., Kaessmann, H., Hinrichs, A.S., Bejerano, G., Furey, T.S., Harte, R.A., Raney, B., Siepel, A., Kent, W.J., Haussler, D., Eyras, E., Castelo, R., Abril, J.F., Castellano, S., Camara, F., Parra, G., Guigo, R., Bourque, G., Tesler, G., Pevzner, P.A., Smit, A., Fulton, L.A., Mardis, E.R. & Wilson, R.K Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432: Hocking, P.M Review of QTL results in chicken. Wld. Poultry Sci. J. 61: Hu, J.X., Bumstead, N., Barrow, P., Sebastein, G., Olien, L., Morgan, K. & Malo, D Resistance to salmonellosis in the chicken is linked to NRAMP1 and TNC. Genome Res. 7: Iqbal, M., Philbin, V.J., Withanage, G.S.K., Wigley, P., Beal, R.K., Goodchild, M.J., Barrow, P., McConnell, I., Maskell, D.J., Young, J., Bumstead, N., Boyd, Y. & Smith A.L Identification and functional characterization of chicken Toll-like receptor 5 reveals a fundamental role in the biology of infection with Salmonella enterica serovar typhimurium. Infect. & Immunity 73: Jiang, R., Li, J.,Qu, Li, H. & Yang, N A new single nucleotide polymorphism in the chicken pituitary- specific transcription factor (POULF1) gene associated with growth rate. 
Anim. Genet. 35: Kuhnlein, U., Ni, L., Weigend, S., Gavora, J.S., Fairfull, W. & Zadworny, D DNA polymorphisms in the chicken growth hormone gene: response to selection for disease resistance and association with egg production. Anim. Genet. 28: Lamont, S.J., Kaiser, M.G. & Liu, W Candidate genes for resistance to Salmonella enteritidis colonization in chickens as detected in a novel genetic cross. Vet. Immun. & Immunopath. 87: Leveque, G., Forgetta, V., Morroll, S., Smith, A.L., Bumstead, N., Barrow, P., Loredo-Osti, J.C., Morgan, K. & Malo, D Allelic variation in TLR4 is linked to susceptibility to Salmonella enterica serovar typhimurium infection in chickens. Infect. & Immunity 71: Li, S., Aggrey, S.E., Zadworny, D., Fairfull, W. & Kuhnlein, U. 1998a. Evidence for a genetic variation in the mitochondrial genome affecting traits in White Leghorn chickens. J. Hered. 89: Li, S., Zadworny, D., Aggrey, S.E. & Kuhnlein, U. 1998b. Mitochondrial pepck: A highly polymorphic gene with alleles co- selected with Marek s disease resistance in chickens. Anim. Genet. 29: Li, H., Deeb, N., Zhou, H., Mitchell, A.D., Ashwell, C.M. & Lamont, S.J Chicken quantitative trait loci for growth and body composition associated with transforming growth factor-beta genes. Poult. Sci. 82: Li, H., Deeb, N., Zhou, H., Ashwell, C.M. & Lamont, S.J Chicken quantitative trait loci for growth and body composition associated with the very low density apolipoprotein-ii gene. Poult. Sci. 84:

Liu, W. & Lamont, S.J Candidate gene approach: Potential association of caspase-1, inhibitor of apoptosis protein-1, and Prosaposin gene polymorphisms with response to Salmonella enteritidis challenge or vaccination in young chicks. Anim. Biotech. 14: Liu, H.C., Cheng, H.H., Tirunagaru, V., Sofer, L. & Burnside, J. 2001a. A strategy to identify positional candidate genes conferring Marek's disease resistance by integrating DNA microarrays and genetic mapping. Anim. Genet. 32: Liu, H.C., Kung, H.J., Fulton, J.E., Morgan, R.W. & Cheng, H.H. 2001b. Growth hormone interacts with the Marek's disease virus SORF2 protein and is associated with disease resistance in chicken. Proc. Natl. Acad. Sci. USA 98: Liu, H.C., Niikura, M., Fulton, J.E. & Cheng, H.H Identification of chicken lymphocyte antigen 6 complex, locus E (LY6E, alias SCA2) as a putative Marek's disease resistance gene via a virus-host protein interaction screen. Cytogenet. & Genome Res. 102: Liu, Y., Jansen, G.B. & Lin, C.Y The covariance between relatives conditional on genetic markers. Genet. Select. Evoln. 34: McKay, J.C A poultry breeder's approach to avian neoplasia. Avian Pathology 27: S74-S77. Meuwissen, T.H.E., Hayes, B.J. & Goddard, M.E Prediction of total genetic value using genome-wide dense marker maps. Genetics 157: Muir, W.M Incorporating molecular information in breeding programmes: applications and limitations. In W.M. Muir & S.E. Aggrey, eds. Poultry Genetics, Breeding and Biotechnology, pp Wallingford, UK, CABI Publishing. Nagamine, Y., Haley, C.S., Sewalem, A. & Visscher, P.M Quantitative trait loci variation for growth and obesity between and within lines of pigs (Sus scrofa). Genetics 164: Nagaraja, S.C., Aggrey, S.E., Yao, J., Zadworny, D., Fairfull R.W. & Kuhnlein, U Trait association of a genetic marker near the IGF-I gene in egg-laying chickens. J. Heredity 91: Parsanejad, R., Torkamanzehi, A., Zadworny, D. 
& Kuhnlein, U Alleles of cytosolic phosphoenolpyruvate carboxykinase (PEPCK): trait association and interaction with mitochondrial PEPCK in a strain of white leghorn chickens. Poultry Sci. 82: Parsanejad, R., Praslickova, D., Zadworny, D. & Kuhnlein, U Ornithine decarboxylase: haplotype structure and trait associations in White Leghorn chickens. Poultry Sci. 83: Preisinger, R Internationale Tendenzen der Tierzüchtung und die Rolle der Zuchtunternehmen. Züchtungskunde 76(6): Schmid, M., Nanda, I., Guttenbach, M., Steinlein, C., Hoehn, M., Schartl, M., Haaf, T., Weigend, S., Fries, R., Buerstedde, J.M., Wimmers, K., Burt, D.W., Smith, J., A Hara, S., Law, A., Griffin, D.K., Bumstead, N., Kaufman, J., Thomson, P.A., Burke, T., Groenen, M.A., Crooijmans, R.P., Vignal, A., Fillon, V., Morisson, M., Pitel, F.,Tixier-Boichard, M.T., Ladjali-Mohammedi, K., Hillel, J., Maki-Tanila, A., Cheng, H.H., Delany, M.E., Burnside, J. & Mizuno, S First report on chicken genes and chromosomes Cytogenet. & Cell Genet. 90: Schmid, M., Nanda, I., Hoehn, H., Schartl, M., Haaf, T., Buerstedde, J.M., Arakawa, H., Caldwell, R.B., Weigend, S., Burt, D.W., Smith, J., Griffin, D.K., Masabanda, J.S., Groenen, M.A., Crooijmans, R.P., Vignal, A., Fillon, V., Morisson, M., Pitel, F., Vignoles, M., Garrigues, A., Gellin, J., Rodionov, A.V., Galkina, S.A., Lukina, N.A., Ben-Ari, G., Blum, S., Hillel, J., Twito, T., Lavi, U., David, L., Feldman, M.W., Delany, M.E., Conley, C.A., Fowler, V.M., Hedges, S.B., Godbout, R., Katyal, S., Smith, C., Hudson, Q., Sinclair, A. & Mizuno, S Second report on chicken genes and chromosomes Cytogen.& Genome Res. 109:

Shete, S. & Amos, C.I Testing for genetic linkage in families by a variance components approach in the presence of genomic imprinting. Am. J. Human Genet. 70: Taha, F.A The poultry sector in middle income countries and its feed requirements: the case of Egypt. Outlook Report No. WRS03-02, pp Washington, DC, Economic Research Service, USDA. Vallejo, R.L., Bacon, L.D., Liu, H.C., Witter, R.L., Groenen, M.A., Hillel, J. & Cheng, H.H Genetic mapping of quantitative trait loci affecting susceptibility to Marek's disease virus induced tumors in F2 intercross chickens. Genetics 148: van der Beek, S. & van Arendonk, J.A.M Marker-assisted selection in an outbred poultry nucleus. Anim. Sc. 62: Wang, T., Fernando, R.L. & Grossman, M Genetic evaluation by best linear unbiased prediction using marker and trait information in a multibreed population. Genetics 148: Wang, J., He, X., Ruan, J., Dai, M., Chen, J., Zhang, Y., Hu, Y., Ye, C., Li, S., Cong, L., Fang, L., Liu, B., Li, S., Wang, J., Burt, D.W., Wong, G.K., Yu, J., Yang, H. & Wang, J ChickVD: a sequence variation database for the chicken genome. Nucl. Acid Res. 33: D438-D441. 
Wong, G.K., Liu, B., Wang, J., Zhang, Y., Yang, X., Zhang, Z., Meng, Q., Zhou, J., Li, D., Zhang, J., Ni, P., Li, S., Ran, L., Li, H., Zhang, J., Li, R., Li, S., Zheng, H., Lin, W., Li, G., Wang, X., Zhao, W., Li, J., Ye, C., Dai, M., Ruane, J., Zhou, Y., Li, Y., He, X., Zhang, Y., Wang, J., Huang, X., Tong, W., Chen, J., Ye, J., Chen, C., Wei, N., Li, G., Dong, L., Lan, F., Sun, Y., Zhang, Z., Yang, Z., Yu, Y., Huang, Y., He, D., Xi, Y., Wei, D., Qi, Q., Li, W., Shi, J., Wang, M., Xie, F., Wang, J., Zhang, X., Wang, P., Zhao, Y., Li, N., Yang, N., Dong, W., Hu, S., Zeng, C., Zheng, W., Hao, B., Hillier, L.W., Yang, S.P., Warren, W.C., Wilson, R.K., Brandstrom, M., Ellegren, H., Crooijmans, R.P., van der Poel, J.J., Bovenhuis, H., Groenen, M.A., Ovcharenko, I., Gordon, L., Stubbs, L., Lucas, S., Glavina, T., Aerts, A., Kaiser, P., Rothwell, L., Young, J.R., Rogers, S., Walker, B.A., van, H.A., Kaufman, J., Bumstead, N., Lamont, S.J., Zhou, H., Hocking, P.M., Morrice, D., de Koning, D.J., Law, A., Bartley, N., Burt, D.W., Hunt, H., Cheng, H.H., Gunnarsson, U., Wahlberg, P., Andersson, L., Kindlund, E., Tammi, M.T., Andersson, B., Webber, C., Ponting, C.P., Overton, I.M., Boardman, P.E., Tang, H., Hubbard, S.J., Wilson, S.A., Yu, J., Wang, J. & Yang, H A genetic variation map for chicken with 2.8 million singlenucleotide polymorphisms. Nature 432: Yonash, N., Bacon, L.D. Witter, R.L. & Cheng. H.H High resolution mapping and identification of new quantitative trait loci (QTL) affecting susceptibility to Marek s disease. Anim. Genet. 30: Zhou, H., Mitchell, A.D., McMurtry, J.P., Ashwell, C.M. & Lamont, S.J Insulin-like growth factor-1 gene polymorphism associations with growth, body composition, skeleton integrity, and metabolic traits in chickens. Poult. Sci. 84:

Chapter 12 Marker-assisted selection in dairy cattle
Joel Ira Weller

Summary

Considering the long generation interval, the high value of each individual, the very limited female fertility and the fact that nearly all economic traits are expressed only in females, it would seem that cattle should be a nearly ideal species for application of marker-assisted selection (MAS). As genetic gains are cumulative and eternal, application of new technologies that increase rates of genetic gain can be profitable even if the nominal annual costs are several times the value of the nominal additional annual genetic gain. Complete genome scans for quantitative trait loci (QTL) based on the granddaughter design have been completed for most commercial dairy cattle populations, and significant across-study effects for economic traits have been found on chromosomes 1, 3, 6, 9, 10, 14 and 20. Quantitative trait loci associated with trypanotolerance have been detected in a cross between the African N'Dama and the Boran breeds as the first step in the introgression of these genes into breeds susceptible to trypanosomosis. In dairy cattle, the actual DNA polymorphism has been determined twice, for QTL on BTA 6 and BTA 14. In both cases the polymorphism caused a non-conservative amino acid change, and both QTL chiefly affect fat and protein concentration. Most theoretical studies have estimated the expected gains that can be obtained by MAS to be in the range of a 5 to 20 percent increase in the rates of genetic gain obtained by traditional selection programmes. Applied MAS programmes have commenced for French and German Holsteins. In both programmes genetic evaluations including QTL effects are computed by variants of marker-assisted best linear unbiased prediction (MA-BLUP).

Introduction

Compared with other agricultural species, dairy cattle are unique in terms of the value of each animal, their long generation interval and the very limited fertility of females. Thus, unlike plant and poultry breeding, most dairy cattle breeding programmes are based on selection within the commercial population. Similarly, detection of quantitative trait loci (QTL) and marker-assisted selection (MAS) programmes are generally based on analysis of existing populations. The specific requirements of dairy cattle breeding have led to the generation of very large data banks in most developed countries, which are available for analysis.

In this chapter, dairy cattle breeding programmes in the developed and developing countries are reviewed and compared. The important issues in the application of MAS are then outlined. These include economic considerations based on phenotypic selection, the current status of cattle marker maps, methods to detect QTL and to estimate QTL effects and location suitable for dairy cattle, the current state of QTL detection in dairy cattle, methods to incorporate information from genetic markers in genetic evaluation systems, methods to identify the actual polymorphisms responsible for observed QTL and description of the reported results, methods and theory for MAS in dairy cattle, the current status of MAS and, finally, the future prospects for MAS in dairy cattle.

Dairy cattle breeding programmes in developed countries

In most developed countries, dairy cattle breeding programmes are based on the progeny test (PT) design. The PT is the design of choice for moderate to large dairy cattle populations, including the United States Holstein population, which includes over ten million animals. An example of the Israeli PT design is given in Figure 1.
Figure 1. The Israeli Holstein breeding programme. (Diagram labels: 50 candidate bulls; 100 elite cows; 200 elite cows; 3 foreign bulls; 4 local bulls; cows; cows; 20 bulls; daughters; records on daughters; 5 bulls are selected; 45 bulls are culled.)

This population consists of approximately cows, of which 90 percent are milk recorded. Approximately 20 bulls are used for general service. Each year about 300 elite cows are selected as bull dams. These are mated to the two to four best local bulls and an equal number of foreign bulls to produce approximately 50 bull calves for progeny testing. At the age of one year, the bull calves reach sexual maturity, and approximately semen samples are collected from each young bull. These bulls are mated to approximately first-parity cows to produce about daughters, or 100 daughters per young bull. Gestation length for cattle is nine months. Thus the young bulls are approximately two years old when their daughters are born, and are close to four when their daughters calve and begin their first lactation. At the completion of their daughters' first lactations, most of the young bulls are culled. Only four to five are returned to general service, and a similar number of the old proven sires are culled. By this time the selected bulls are approximately five years old. Various studies have shown that rates of genetic gain by a PT scheme are about 0.1 to 0.2 genetic standard deviations of the selection index per year (Nicholas and Smith, 1983; Israel and Weller, 2000).

The PT was devised to take advantage of the nearly unlimited fertility of males. However, compared with breeding schemes for other species, the PT has several major weaknesses. First, for a PT system to be effective, the population must include at least several tens of thousands of animals with recording on production traits and paternity. Inaccurate recording can significantly reduce rates of genetic gain (Israel and Weller, 2000). Second, generation intervals, especially along the sire-to-dam and sire-to-sire paths, are much longer than the biological requirements.
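The link between generation interval and annual genetic gain can be illustrated with the standard four-path formula of Rendel and Robertson (1950), which the chapter does not state explicitly: annual gain, in genetic standard deviations, is the sum over the four selection paths of selection intensity times accuracy, divided by the sum of the four generation intervals. The sketch below is a minimal illustration only. The `Path` class, the `annual_gain` function and every intensity, accuracy and interval value are hypothetical and are not taken from the Israeli programme.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Path:
    """One of the four selection paths (all values below are hypothetical)."""
    name: str
    intensity: float  # standardized selection intensity, i
    accuracy: float   # accuracy of the genetic evaluation, r
    interval: float   # generation interval in years, L


def annual_gain(paths):
    """Four-path formula: sum(i * r) / sum(L), in genetic SD per year."""
    total_response = sum(p.intensity * p.accuracy for p in paths)
    total_interval = sum(p.interval for p in paths)
    return total_response / total_interval


# Hypothetical progeny-test scheme: long sire paths, no dam-to-dam selection.
progeny_test = [
    Path("sire-to-sire", intensity=2.0, accuracy=0.9, interval=7.0),
    Path("sire-to-dam", intensity=1.4, accuracy=0.9, interval=6.0),
    Path("dam-to-sire", intensity=2.0, accuracy=0.6, interval=5.0),
    Path("dam-to-dam", intensity=0.0, accuracy=0.5, interval=5.0),
]
pt_gain = annual_gain(progeny_test)

# Same scheme with both sire paths shortened by two years (accuracy held fixed,
# which is optimistic: shorter intervals usually come with lower accuracy).
shorter = [
    replace(p, interval=p.interval - 2.0) if p.name.startswith("sire") else p
    for p in progeny_test
]
shorter_gain = annual_gain(shorter)

print(f"progeny test: {pt_gain:.3f} SD per year")
print(f"shorter sire intervals: {shorter_gain:.3f} SD per year")
```

With these illustrative inputs the progeny-test figure falls inside the 0.1 to 0.2 range quoted above, and shortening the sire paths raises the annual gain because the same response is spread over fewer years.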
The increase in generation interval reduces genetic gain per year. As artificial insemination (AI) institutes generally pay a premium price for male calves of elite cows, these cows are often given preferential treatment in order to increase their genetic evaluations (Powell and Norman, 1988). The small number of bulls actually used for general service, and the even smaller number of bulls used as bull sires, tends to reduce the effective population size, which increases inbreeding and decreases genetic variance in the population. The effective population size of the United States Holstein population with ten million cows has been estimated at about 100 (Farnir et al., 2000). Finally, there is virtually no selection along the dam-to-dam path. Generally, percent of healthy female calves produced are used as replacements.

Various studies have suggested that selection intensities along the dam-to-dam path could be increased by application of multiple ovulation and embryo transfer (MOET) and sexed semen. Costs of both technologies are still too high for application to the entire population, as shown below. To overcome this problem for MOET, Nicholas and Smith (1983) proposed a nucleus breeding scheme. In nucleus schemes, the selection population consists of several hundred individuals, and bulls are not progeny tested. Instead, bulls are selected based on the genetic evaluations of their dams and sisters, which shortens the generation interval on the sire-to-dam and sire-to-sire paths, but reduces the reliabilities of the genetic evaluations. Dams of bulls and cows are selected based chiefly on their own production records, and MOET is applied to increase the number of progeny per dam. As the selection population consists of only several hundred individuals, MOET costs

are manageable if costs are spread over the entire national dairy industry. Rates of genetic gain within the nucleus are thus higher than can be obtained by a national PT design. This gain is transferred to the general population through the use of bulls from the nucleus population. In addition to the greater overall rate of genetic gain, the nucleus scheme has the advantage that it is necessary to collect data on a much smaller population, which should reduce costs and increase accuracy. The disadvantages of MOET are that overall costs and rates of increase of inbreeding will be greater unless steps are taken to reduce inbreeding. However, these steps will also slightly decrease rates of genetic gain. In practice, no country has replaced its standard PT scheme with a nucleus breeding programme.

Dairy cattle breeding in developing countries

The genus Bos includes five to seven species, of which Bos taurus and Bos indicus are the most widespread and economically important. B. taurus is the main dairy cattle species, and is found generally in temperate climates. Several tropical and subtropical cattle breeds are the result of crosses between B. taurus and B. indicus, which interbreed freely. In the tropics, cows need at least some degree of tolerance to environmental stress due to poor nutrition, heat and disease challenge to sustain relatively high production levels (Cunningham, 1989). Tropical breeds are adapted to these stresses but have low milk yield, whereas more productive temperate breeds cannot withstand the harsh tropical conditions, to the point of not being able to sustain their numbers (de Vaccaro, 1990). Furthermore, most tropical countries are developing countries, which lack systematic large-scale milk and pedigree recording. A number of studies have been conducted on crosses between imported and local breeds in the tropics. Generally, the F1 B. taurus × B.
indicus crosses are economically superior to either of the purebred strains (FAO, 1987). The heterosis effect of the F1 cross is due to genes for disease resistance from the local parent, and genes for milk production from the imported strain (Smith, 1988; Cunningham, 1989). However, this heterosis is lost in future generations if the F1 is backcrossed to either parental strain. Madalena (1993) presented an F1 continuous replacement scheme to capitalize on its superiority. Recently, Kosgey, Kahi and van Arendonk (2005) proposed a closed adult nucleus MOET scheme to increase milk production in tropical crossbred cattle.

Economic considerations in applying MAS to dairy cattle

For any new technique to be economically viable, overall gains must be greater than overall costs. This also applies to using MAS within a dairy cattle breeding programme. However, unlike investment in new equipment, genetic gains never wear out, i.e. breeding is unique in that genetic gains are cumulative and eternal. Thus, as shown by Weller (1994, 2001), investments in MAS or other techniques that enhance breeding programmes are economically viable even if nominal costs are greater than nominal gains. For example, consider an ongoing breeding programme with a constant rate of genetic gain per year. Assume that the annual rate of genetic gain has a nominal economic value of V. The cumulative discounted returns to year T, R_v, will be a function of the nominal annual returns, the discount rate, d, the profit horizon, T, and the number of years from the beginning of the programme until

first returns are realized, t. R_v is computed as follows (Hill, 1971):

$$R_v = V\left[\frac{r_d^{\,t} - r_d^{\,T+1}}{(1 - r_d)^2} - \frac{(T - t + 1)\, r_d^{\,T+1}}{1 - r_d}\right] \qquad \{1\}$$

where r_d = 1/(1 + d). For example, with d = 0.08, T = 20 years, and t = 5 years, R_v = 32.58V. That is, the cumulative returns are equal to nearly 33 times the nominal annual returns. For an infinite profit horizon, Equation {1} reduces to:

$$R_v = \frac{V r_d^{\,t}}{(1 - r_d)^2} = \frac{V}{d^2 (1 + d)^{t-2}} \qquad \{2\}$$

and R_v = V.

The value of nominal annual genetic gain will now be compared with the annual costs of a breeding programme, assuming a fixed nominal cost per year. Costs, unlike genetic gain, only have an effect in the year they occur. Assuming that annual costs are equal during the length of the breeding programme, and that first costs occur in the year after the base year, the net present value of the total costs of the breeding programme, C_T, is computed as follows:

$$C_T = \frac{C_c\, r_d \left(1 - r_d^{\,T}\right)}{1 - r_d} \qquad \{3\}$$

where C_c = annual costs of the breeding programme. Using the same values for T and d, C_T = 9.82C_c. Thus, with a profit horizon of 20 years, cumulative profit is positive if V > 0.31C_c. For an infinite profit horizon, C_T = 12.5C_c, and profit will be positive if V > 0.1C_c. Therefore, a breeding programme can be profitable even if the nominal annual costs are several times the value of the nominal annual genetic gain.

For example, consider the United States of America dairy cattle population, which consists of about ten million cows. Annual genetic gain is about 100 kg milk per year. The value of a 1 kg gain in milk production has been estimated at US$0.1 (Weller, 1994).
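The discounting in Equations {1} to {3} can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' own calculation: the function names are hypothetical, and the year-by-year check assumes returns of V in year t, 2V in year t + 1, and so on up to year T, which is one reading of Equation {1}. Under that reading the closed form gives about 44.9V for d = 0.08, T = 20 and t = 5, so the 32.58V quoted in the text may rest on a different timing convention. The cost figures (9.82 and 12.5) and the infinite-horizon break-even ratio of about 0.1 are reproduced.

```python
def discount_factor(d):
    """r_d = 1 / (1 + d)."""
    return 1.0 / (1.0 + d)


def cumulative_returns(V, d, T, t):
    """Closed form of Equation {1}: discounted returns up to year T."""
    r = discount_factor(d)
    return V * ((r**t - r**(T + 1)) / (1 - r) ** 2
                - (T - t + 1) * r**(T + 1) / (1 - r))


def cumulative_returns_by_sum(V, d, T, t):
    """Same quantity by direct summation (assumed timing: V in year t, 2V in t + 1, ...)."""
    r = discount_factor(d)
    return sum((year - t + 1) * V * r**year for year in range(t, T + 1))


def infinite_returns(V, d, t):
    """Equation {2}: limit of Equation {1} as T grows without bound."""
    return V / (d**2 * (1 + d) ** (t - 2))


def cumulative_costs(C_c, d, T):
    """Equation {3}: discounted costs, first cost one year after the base year."""
    r = discount_factor(d)
    return C_c * r * (1 - r**T) / (1 - r)


def infinite_costs(C_c, d):
    """Limit of Equation {3} as T grows without bound: C_c / d."""
    return C_c / d


d, T, t = 0.08, 20, 5
print(round(cumulative_costs(1.0, d, T), 2))   # about 9.82
print(round(infinite_costs(1.0, d), 2))        # 12.5
print(round(infinite_returns(1.0, d, t), 2))   # about 124.04
# Break-even ratio V / C_c for an infinite horizon, about 0.1
print(round(infinite_costs(1.0, d) / infinite_returns(1.0, d, t), 2))
```

Keeping the direct summation next to the closed form makes it easy to test a different timing assumption: change the summation, and the closed form must change with it.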
Thus, the nominal annual value of a 10 percent increase in the rate of genetic gain (10 kg per year) is:

V = (10 kg per cow per year) × (US$0.1 per kg) × (10 000 000 cows) = US$10 000 000 per year {4}

The cumulative value with a profit horizon of 20 years and an 8 percent discount rate would be US$326 million, and break-even annual costs for a technology that increases annual genetic gain by 10 percent are US$32 million per year. Thus, it would be profitable to spend quite a lot for a relatively small genetic gain.

The value of genetic gain to a specific breeding enterprise will generally be less than the gain to the general economy. This is because most of the gains obtained by breeding will be passed on to the consumers. Brascamp, van Arendonk and Groen (1993) considered the economic value of MAS based on changes in returns from semen sales for a breeding organization operating in a competitive market. In this case, a breeding firm that adopts a MAS programme can increase its returns either by increasing its market share or by increasing the mean price of a semen dose. Although the value of genetic gain will be less, relatively small changes in genetic merit can result in large changes in market share.

Current status of marker maps in cattle

Cattle have 29 pairs of autosomes and one pair of sex chromosomes. All the autosomes are acrocentric, and map units are

scored from the centromere. Chromosomes are denoted with the prefix BTA (B. taurus). Similar to other mammals, the bovine DNA includes base pairs (bp), and the map length is approximately cM. The human genome is estimated to encode protein-coding genes (International Human Genome Sequencing Consortium, 2004), and it can be assumed that the number of genes in other mammals, including cattle, should be quite similar. Thus, a single map unit, on average, includes approximately eight genes and one million bp.

As in other animal species, microsatellites are still the marker of choice for map construction due to their prevalence and high polymorphism. Although single nucleotide polymorphisms (SNPs) are much more prevalent, genetic maps based on SNPs are still in the future. More than SNPs have been identified in humans, but only several thousand have been validated in cattle ( ca/hosted/bovine%20genomics/), and rates of polymorphism are generally unknown. With the completion of the sixfold coverage of the bovine genome by the Bovine Genome Sequencing Project at Baylor College of Medicine ( bcm.tmc.edu/projects/bovine/), many more SNPs will be identified.

Several genetic maps are available on the internet. The map of the United States Meat Animal Research Center (MARC) ( usda.gov/) includes thousands of markers, chiefly microsatellites. The ArkDB database system, hosted at Roslin Institute, includes data from several published maps. The Commonwealth Scientific and Industrial Research Organization (CSIRO) livestock industries cattle genome marker map is built upon data provided by the University of Sydney's comparative location database (www.livestockgenomics.csiro.au/perl/gbrowse.cgi/cattlemap/). This map combined all publicly available maps into a single integrated map that currently includes markers.
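The per-map-unit figures quoted in this section (about eight genes and one million bp per map unit) follow from simple division. The sketch below uses round values that are assumptions chosen for illustration: a genome of about 3 × 10^9 bp, a map of about 3 000 cM and about 24 000 protein-coding genes. The constant names and the `interval_content` helper are hypothetical.

```python
# Assumed round figures, for illustration only.
GENOME_BP = 3_000_000_000   # assumed genome size in base pairs
MAP_LENGTH_CM = 3_000       # assumed genetic map length in centimorgans
GENE_COUNT = 24_000         # assumed number of protein-coding genes

bp_per_cm = GENOME_BP / MAP_LENGTH_CM
genes_per_cm = GENE_COUNT / MAP_LENGTH_CM


def interval_content(width_cm):
    """Average bp and gene content of a map interval of the given width in cM."""
    return width_cm * bp_per_cm, width_cm * genes_per_cm


print(f"{bp_per_cm:,.0f} bp per cM, {genes_per_cm:.0f} genes per cM")

# A QTL located only to within a 20 cM interval would, on average, span:
bp, genes = interval_content(20)
print(f"{bp:,.0f} bp and about {genes:.0f} genes")
```

These are genome-wide averages. Recombination rate and gene density vary along chromosomes, so the content of any particular interval can differ considerably.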
Methods of QTL detection suitable for commercial dairy cattle populations

Detection of QTL requires generation of linkage disequilibrium (LD) between the genetic markers and QTL. In plants, this is generally accomplished by crosses between inbred lines but, for the reasons noted in the introduction, this is not a viable option for dairy cattle in developed countries, in which all analyses must be based on the existing population. Detection of QTL in developing countries is considered below.

For advanced commercial populations, the daughter and granddaughter designs, which make use of the existence of large half-sib families, are most appropriate for QTL analysis (Weller, Kashi and Soller, 1990). These designs are presented in Figures 2 and 3. Both designs are similar to the backcross design for crosses between inbred lines in that only the alleles of one parent are followed in the progeny. Thus, similar to the backcross design, dominance cannot be estimated. These designs differ from crosses between inbred lines in that several families are analysed in which the linkage phase between QTL and genetic markers may differ. In addition, any specific QTL will be heterozygous in only a fraction of the families included in the analysis. Thus, QTL effects must be estimated within families, and these designs are therefore less powerful per individual genotyped than designs based on crosses between inbred lines. The granddaughter design has the advantage of greater statistical power per

individual genotyped. As each genotype is associated with multiple phenotypic records, the power per individual genotyped in the granddaughter design can be four-fold the power of the daughter design (Weller, Kashi and Soller, 1990). The disadvantage of this design is that the appropriate data structure (hundreds of progeny-tested bulls, sons of a limited number of sires) is found only in the largest dairy cattle populations. Both daughter and granddaughter designs are less powerful per individual genotyped than designs based on analysis of inbred lines. Furthermore, the half-sib designs have the disadvantage that progeny with the same genotype as the sire are uninformative, because the progeny could have received either paternal allele.

Figure 2. The daughter design. (Diagram: sire haplotypes M A and m a; daughter groups M?/A? and m?/a?.) Only a single family is shown, although in practice several families will be analysed jointly. The sire is assumed to be heterozygous for a QTL and a linked genetic marker. The two alleles of the marker locus are denoted M and m, and the two alleles of the QTL are denoted A and a. Alleles of maternal origin are denoted by question marks.

Additional experimental designs have also been proposed. One of the disadvantages of the granddaughter design is that the number of progeny-tested sons of most sires is too low to obtain reasonable power to detect QTL of moderate effects. Coppieters et al. (1999) proposed the great-granddaughter design, in which power is increased by also genotyping progeny-tested grandsons of the grandsire. Inclusion of the grandsons is complicated by the fact that there is another generation of meiosis between the grandsire and his grandson. A significant drawback of all the designs considered above is that they give no indication of the number of QTL alleles segregating in the population or their rela-