遗传学报 Acta Genetica Sinica, December 2006, 33(12): 1073-1080 ISSN 0379-4172

Comparison of Different Foreground and Background Selection Methods in Marker-Assisted Introgression

BAI Jun-Yan 1,2, ZHANG Qin 2,①, JIA Xiao-Ping 3

1. College of Animal Science and Technology, Henan Science and Technology University, Luoyang 471003, China; 2. Key Laboratory of Animal Genetics and Breeding of the Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100094, China; 3. College of Biological Science, China Agricultural University, Beijing 100094, China

Abstract: Three methods of foreground selection and four methods of background selection were compared in terms of the efficiency of marker-assisted introgression of a QTL allele from a donor line into a recipient line and in terms of the recovery of the recipient genetic background. The results showed that, for the introgression of a donor QTL allele, direct selection on the QTL itself (when the QTL genotype can be directly identified) ensures that the allele is successfully introgressed and rapidly fixed. When direct selection on the QTL is not feasible, indirect selection using two closely linked flanking markers can be used instead and gives similar results. For the recovery of the recipient genetic background, if the goal is to recover the whole genetic background of the recipient, genomic similarity selection or marker index selection is the best choice: only three generations of backcrossing were required to recover over 98% of the recipient genome. If, however, the goal is to recover certain background traits of the recipient, MBLUP selection gives the best results: it achieved over 99% recovery of the recipient QTL alleles for the background traits after three generations of backcrossing and also gave the best genetic improvement of these traits.
Key words: marker-assisted introgression; foreground selection; background selection; MBLUP

Received: 2006-01-05; Accepted: 2006-04-05
This work was supported by the National Key Basic Research Program (No. 006CB10104), the National Natural Science Foundation of China (No. 30430500) and the Project of Talent Scientific Research Fund of Henan Science and Technology University (No.05-156).
① Corresponding author. E-mail: qzhang@cau.edu.cn

Marker-assisted introgression (MAI) is one of the major applications of molecular information in animal breeding [1-3]. Its aim is to introgress one or more favorable genes from one line (the donor) into another (the recipient) while altering the genetic background of the recipient as little as possible. It consists of three phases: first, a cross between the donor and recipient lines to produce the F1; second, repeated backcrosses of the F1 individuals and their subsequent progeny to the recipient to recover its genetic background; and third, intercrossing of the backcross progeny to fix the introgressed genes in the population. During the backcross and intercross phases, foreground selection and background selection are both required. Foreground selection is the selection, with the help of marker information, of individuals carrying the donor genes, to ensure that the genes are not lost during the backcross phase and are fixed rapidly during the intercross phase. Background selection is the selection, again with the help of marker information, for the genetic background of the recipient, to accelerate its recovery. Many theoretical and simulation studies [4-14] have shown that MAI is an effective method for gene introgression, and it has been applied successfully in animal breeding practice. Examples include the introgression of the halothane-negative allele into a Piétrain line that had a
high frequency of the halothane-positive allele [15], the introgression of the naked-neck gene from a rural low-body-weight breed into a commercial broiler line [16], and the introgression of the Booroola gene FecB into dairy sheep breeds [17]. Many factors affect the efficiency of MAI, among which the methods of foreground and background selection are critical. In this study, different foreground and background selection methods were compared by computer simulation. The marker-assisted BLUP (MBLUP) method [18], originally proposed for marker-assisted selection within a breed (line), was used here for the first time for background selection in an MAI system.

1 Materials and Methods

1.1 Data simulation

1.1.1 QTLs and markers

Three traits were considered. Each trait was affected by polygenes and by one QTL with two alleles, Q (favorable) and q (unfavorable), with no dominance between them. In the donor line, all individuals had genotype QQ at the QTL for the first trait, referred to as the foreground trait, and qq at the QTLs for the other two traits. In the recipient line, all individuals had qq at the QTL for the first trait and QQ at the QTLs for the other two traits, referred to as the background traits. The Q allele of the foreground-trait QTL was introgressed from the donor line into the recipient line. The genome consisted of 10 chromosomes, each 100 cM long. Six markers were evenly distributed on each chromosome, 20 cM apart. Each marker had two alleles, one fixed in the donor line and the other fixed in the recipient line. The three QTLs were located on three different chromosomes at positions 55 cM, 45 cM, and 65 cM, respectively.
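As a quick check of the map described above, six evenly spaced markers on a 100 cM chromosome imply a spacing of 20 cM, so a QTL at 55 cM lies 15 cM from the marker at 40 cM and 5 cM from the marker at 60 cM. The sketch below computes these distances and converts them to recombination fractions. The function names are illustrative, and the use of Haldane's mapping function is an assumption; the paper does not state which mapping function was used.

```python
import math

CHROMOSOME_LENGTH_CM = 100.0   # chromosome length in centimorgans (from the text)
MARKERS_PER_CHROMOSOME = 6     # evenly spaced, both chromosome ends included


def marker_positions(length_cm=CHROMOSOME_LENGTH_CM,
                     n_markers=MARKERS_PER_CHROMOSOME):
    """Return the positions (cM) of evenly spaced markers, ends included."""
    spacing = length_cm / (n_markers - 1)
    return [i * spacing for i in range(n_markers)]


def flanking_markers(qtl_cm, positions):
    """Return the (left, right) marker positions that bracket the QTL."""
    left = max(p for p in positions if p <= qtl_cm)
    right = min(p for p in positions if p >= qtl_cm)
    return left, right


def haldane_recombination(distance_cm):
    """Recombination fraction for a map distance in cM.

    Uses Haldane's mapping function (an assumption, not stated in the paper).
    """
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


if __name__ == "__main__":
    positions = marker_positions()
    for qtl in (55.0, 45.0, 65.0):  # QTL positions given in the text
        left, right = flanking_markers(qtl, positions)
        r_left = haldane_recombination(qtl - left)
        r_right = haldane_recombination(right - qtl)
        print(f"QTL at {qtl} cM: markers at {left} and {right} cM, "
              f"r = {r_left:.4f} and {r_right:.4f}")
```

Each of the three QTLs ends up 5 cM from its nearest marker and 15 cM from the marker on its other side, which is the configuration used by the indirect foreground selection methods.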
It was assumed that the QTL had genotypic values of 4.5 for QQ and −4.5 for qq for the foreground trait, 5.5 for QQ and −5.5 for qq for the first background trait, and 3.5 for QQ and −3.5 for qq for the second background trait.

1.1.2 Phenotypic values

The phenotypic values were generated from the model $y = v + u + e$, where $v$ is the QTL genotypic value, $u$ is the polygenic effect, and $e$ is the random residual effect. For individuals in the donor and recipient lines, $u$ was sampled from a normal distribution, $u \sim N(0, \sigma_u^2)$. For all progeny, $u = 0.5u_s + 0.5u_d + m$, where $u_s$ and $u_d$ are the paternal and maternal polygenic effects, respectively, and $m$ is the deviation due to Mendelian sampling, which follows the normal distribution $m \sim N\left(0, (\sigma_u^2/2)\,(1 - (F_s + F_d)/2)\right)$, with $F_s$ and $F_d$ being the inbreeding coefficients of the sire and dam, respectively. In all generations, $e$ was generated from $N(0, \sigma_e^2)$.

1.1.3 Breeding system and parameter settings

The donor line consisted of 10 males, and the recipient line consisted of 10 males and 200 females. One half of the recipient females were mated to the donor males to produce F1 individuals, and the other half were mated to males of their own population to maintain the recipient line. During the backcross phase, 10 males were selected from the progeny population and mated to 100 females randomly chosen from the recipient population to produce the next generation. At the same time, the 10 males and the remaining 100 females of the recipient population were randomly mated with each other to produce the next generation of the recipient line. During the intercross phase, 10 males and 100 females were selected in each generation and mated with each other to produce the next generation. Half-sib and full-sib matings were avoided throughout.
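The phenotype model of Section 1.1.2 can be sketched in a few lines. This is a minimal illustration, not the authors' simulation program: the function names, the variance values in the usage lines, and the choice of Python's `random` module are all assumptions introduced here.

```python
import random


def founder_polygenic_effect(sigma2_u, rng):
    """Sample u ~ N(0, sigma_u^2) for an individual of the donor or recipient line."""
    return rng.gauss(0.0, sigma2_u ** 0.5)


def mendelian_sampling_variance(sigma2_u, f_sire, f_dam):
    """Variance of the Mendelian sampling term m:
    (sigma_u^2 / 2) * (1 - (F_s + F_d) / 2)."""
    return 0.5 * sigma2_u * (1.0 - 0.5 * (f_sire + f_dam))


def progeny_polygenic_effect(u_sire, u_dam, sigma2_u, f_sire, f_dam, rng):
    """u = 0.5 * u_s + 0.5 * u_d + m, with m drawn from N(0, var_m)."""
    var_m = mendelian_sampling_variance(sigma2_u, f_sire, f_dam)
    return 0.5 * u_sire + 0.5 * u_dam + rng.gauss(0.0, var_m ** 0.5)


def phenotype(qtl_value, u, sigma2_e, rng):
    """y = v + u + e, with e ~ N(0, sigma_e^2)."""
    return qtl_value + u + rng.gauss(0.0, sigma2_e ** 0.5)


if __name__ == "__main__":
    rng = random.Random(1)
    sigma2_u, sigma2_e = 10.0, 10.0   # hypothetical variances, illustration only
    qtl_value = 0.0                   # hypothetical QTL genotypic value
    u_sire = founder_polygenic_effect(sigma2_u, rng)
    u_dam = founder_polygenic_effect(sigma2_u, rng)
    u_child = progeny_polygenic_effect(u_sire, u_dam, sigma2_u, 0.0, 0.0, rng)
    print(phenotype(qtl_value, u_child, sigma2_e, rng))
```

With non-inbred parents the Mendelian sampling variance is half the polygenic variance, and it shrinks as the parents become more inbred, which is why the inbreeding coefficients enter the formula.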
Each mating produced 10 offspring, and the sex of each offspring was assigned at random with a probability of 0.5 for either sex. Generations did not overlap. The phenotypic variances for the foreground trait and the two background traits were set to be 0, 50, and 10, respectively. The heritabilities of the three traits
were set to 0.50, 0.30, and 0.40, respectively. Selection in the backcross and intercross phases was carried out first on the donor QTL allele (foreground) and then on the genetic background of the recipient. Three methods of foreground selection and four methods of background selection (described below) were used. Thirty replicates were simulated for each method.

1.2 Selection methods

1.2.1 Foreground selection

(1) Direct selection: The genotype of the foreground QTL could be observed directly, and heterozygous individuals were selected.
(2) Indirect selection with one linked marker: The genotype of the foreground QTL was unknown and was inferred from the nearest marker (5 cM from the QTL). Individuals heterozygous at this marker were selected.
(3) Indirect selection with two flanking markers: The genotype of the foreground QTL was inferred from the two nearest flanking markers (5 cM and 15 cM from the QTL, respectively). Individuals heterozygous at both markers were selected.

1.2.2 Background selection

(1) Random selection: Individuals were selected at random, i.e., no selection was made on the genetic background of the recipient.
(2) Genomic similarity selection: Selection was based on the ratio of the number of markers homozygous for the recipient allele to the total number of markers [6].
(3) Marker index selection: Selection was based on an index combining phenotypic and marker information [7].
$$I_{mp} = b_m I_m + b_p P \qquad (1)$$

where $I_m = \sum_{i=1}^{m} a_i v_i$ is the marker score; $a_i$ is the weight of the $i$th marker, which is 0.5 for the markers at both ends of a chromosome and 1 for all other markers; $v_i$ is the merit of the $i$th marker, which is 1 if both alleles of the marker are from the recipient line and 0 otherwise; $P$ is the trait phenotypic value; and $b_m$ and $b_p$ are the weights given to the marker score and the phenotypic value, defined as $b_m \propto (1 - h^2)$ and $b_p \propto h^2(1 - p)$, respectively, with $h^2$ being the trait heritability and $p$ the proportion of the genetic variance contributed by the QTL.

(4) MBLUP selection: Selection was based on the estimated aggregate breeding value of the two background traits. Breeding values were first estimated separately for each background trait from phenotypic and marker information by marker-assisted BLUP (MBLUP) [18] and then summed into an index with equal weights. The model for MBLUP was

$$y = Wv + Zu + e \qquad (2)$$

where $y$ is the vector of trait phenotypic values; $v$ is the vector of random QTL allelic effects with mean 0 and variance-covariance matrix $G\sigma_v^2$ ($G$ is the gametic relationship matrix at the QTL and $\sigma_v^2$ is the QTL allelic variance); $u$ is the vector of random polygenic effects with mean 0 and variance-covariance matrix $A\sigma_u^2$ ($A$ is the additive genetic relationship matrix and $\sigma_u^2$ is the additive genetic variance); $e$ is the vector of residual effects with mean 0 and variance-covariance matrix $I\sigma_e^2$ ($I$ is an identity matrix); and $W$ and $Z$ are the incidence matrices of $v$ and $u$, respectively. The estimates of $v$ and $u$ are obtained by solving the mixed model equations

$$\begin{bmatrix} Z'Z + k_1 A^{-1} & Z'W \\ W'Z & W'W + k_2 G^{-1} \end{bmatrix} \begin{bmatrix} \hat{u} \\ \hat{v} \end{bmatrix} = \begin{bmatrix} Z'y \\ W'y \end{bmatrix} \qquad (3)$$

where $k_1 = \sigma_e^2/\sigma_u^2$ and $k_2 = \sigma_e^2/\sigma_v^2$. The method described by Wang et al. [19] and Liu et al. [20] was used to calculate $G^{-1}$, and the equations
were solved by iteration on data with a convergence criterion of 10D-9.

2 Results

2.1 Frequency of the favorable allele of the foreground QTL

Fig. 1 shows the change in frequency of the favorable allele of the foreground QTL over the backcross and intercross generations. Only the results under random background selection are shown, because this frequency does not depend on the background selection method. During the backcross phase, the frequency of the favorable allele remained, as expected, at 0.5 in all five backcross generations under direct foreground selection, but it decreased to 0.4 and 0.0 in the final backcross generation under flanking marker selection and one-marker selection, respectively. During the intercross phase the frequency of the favorable allele increased rapidly: after two intercross generations it reached 1.00, 0.94, and 0.74 under direct selection, flanking marker selection, and one-marker selection, respectively, and after five intercross generations it was 1.00, 0.97, and 0.75. These results indicate that indirect selection on the foreground QTL using linked markers cannot guarantee that the correct individuals are selected for backcrossing or intercrossing, because recombination may occur between the markers and the QTL, particularly when only one marker is used. Therefore, if direct selection on the foreground QTL is not feasible, two flanking markers should be used for indirect selection rather than a single marker.

2.2 Recovery of the recipient genetic background

Recovery of the recipient genetic background can be measured as the percentage of recipient marker alleles in the population. Fig. 2 shows the recovery of the recipient genetic background over the backcross and intercross generations under the different background selection methods.
Only the results under flanking marker selection for the foreground are shown, because the recovery of the recipient genetic background does not depend on the foreground selection method. After two backcross generations, the recovery of the recipient genetic background reached 86.64%, 87.34%, 92.88%, and 92.65% under random selection, MBLUP selection, genomic similarity selection, and index selection, respectively. After four backcross generations, it had increased to 96.4%, 96.38%, 99.99%, and 99.54%, respectively.

Fig. 1 Change in frequency of the favorable allele of the foreground QTL
Generations 1 to 5: backcross phase; Generations 6 to 10: intercross phase.

Fig. 2 Recovery of the recipient genetic background, measured as the percentage of the recipient marker alleles in the population
Generations 1 to 5: backcross phase; Generations 6 to 10: intercross phase. NO: random selection; MBLUP: marker-assisted best linear unbiased prediction; GS: genomic similarity selection; MIS: marker index selection.
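For reference, the recovery measure used above and the neutral expectation it can be compared with are sketched below. Without background selection, each backcross to the recipient is expected to halve the remaining donor share, so backcross generation t carries on average 1 − (1/2)^(t+1) of the recipient genome. This gives 87.5% after two backcrosses and about 96.9% after four, close to the values reported for random selection. The genotype coding (number of recipient alleles per marker: 0, 1 or 2) and the function names are assumptions made for this illustration.

```python
def expected_recipient_fraction(backcross_generation):
    """Expected recipient genome share in backcross generation t, no selection.

    The F1 (t = 0) carries 1/2; each backcross halves the donor share.
    """
    return 1.0 - 0.5 ** (backcross_generation + 1)


def recipient_allele_percentage(population):
    """Percentage of recipient marker alleles in a population.

    `population` is a list of individuals; each individual is a list holding
    the number of recipient alleles (0, 1 or 2) at every marker (assumed coding).
    """
    total_alleles = sum(2 * len(individual) for individual in population)
    recipient_alleles = sum(sum(individual) for individual in population)
    return 100.0 * recipient_alleles / total_alleles


def genomic_similarity(individual):
    """Share of markers homozygous for the recipient allele in one individual."""
    return sum(1 for g in individual if g == 2) / len(individual)


if __name__ == "__main__":
    for t in range(1, 6):
        print(f"BC{t}: expected recipient share = "
              f"{100.0 * expected_recipient_fraction(t):.2f}%")

    # Toy population of two individuals genotyped at four markers.
    toy_population = [[2, 2, 1, 2], [2, 1, 1, 2]]
    print(recipient_allele_percentage(toy_population))
    print([genomic_similarity(ind) for ind in toy_population])
```

The last function is the per-individual score on which genomic similarity selection ranks candidates, whereas the population percentage is the quantity plotted as recovery.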
During the intercross phase, the percentage of the recipient genetic background remained constant under all four background selection methods.

2.3 Frequency of the favorable allele of the background QTLs

The changes in the frequencies of the favorable alleles of the two background QTLs under the different background selection methods are shown in Fig. 3 (foreground selection was by flanking markers). For the first background QTL, after two backcross generations the frequencies of the favorable allele were 0.87, 0.9, 0.95, and 0.96 under random selection, MBLUP selection, genomic similarity selection, and index selection, respectively; after five backcross generations they had increased to 0.98, 1.00, 1.00, and 1.00, respectively. Similar changes were observed for the second background QTL. During the intercross phase, the allele frequencies of the two background QTLs remained unchanged.

Fig. 3 Background QTL frequencies with respect to different background selections
Generations 1 to 5: backcross phase; Generations 6 to 10: intercross phase. NO: random selection; MBLUP: marker-assisted best linear unbiased prediction; GS: genomic similarity selection; MIS: marker index selection.

Fig. 4 Genetic trends of background traits with respect to different background selections
Generations 1 to 5: backcross phase; Generations 6 to 10: intercross phase. NO: random selection; MBLUP: marker-assisted best linear unbiased prediction; GS: genomic similarity selection; MIS: marker index selection.

2.4 Genetic trends for background traits

The genetic trends of the two background traits, measured as the change in mean breeding value across generations, are shown in Fig. 4 for the different background selection methods (foreground selection was by flanking markers). The mean breeding values under index selection and MBLUP selection increased continuously during both the backcross and intercross phases, with MBLUP showing the larger increase, whereas
with respect to random selection and genomic similarity selection, the trends showed only a slight increase during the backcross phase and remained unchanged during the intercross phase.

3 Discussion

In this study, different foreground and background selection methods that could be used in a marker-assisted introgression system were compared in terms of the efficiency of introgressing a QTL allele from a donor line into a recipient line, the recovery of the entire genetic background of the recipient, and the recovery of certain traits of the recipient.

For the introgression of a donor QTL allele, direct selection on the QTL itself (when the QTL genotype can be directly identified) ensures a stable donor QTL allele frequency of 0.5 during the backcross phase and rapid fixation of the allele after two generations of intercrossing. When direct selection on the QTL is not feasible, indirect selection using two closely linked flanking markers can give similar results. However, indirect selection using only one linked marker results in a decrease of the allele frequency during the backcross phase and fails to fix the allele during the intercross phase (Fig. 1), because recombination between the marker allele and the QTL allele can lead to false foreground selection.

For the recovery of the recipient genetic background, if the goal is to recover the whole genetic background of the recipient, genomic similarity selection [6] or marker index selection [7] is the best choice. Only three generations of backcrossing were required to recover over 98% of the recipient genome (Fig. 2). If, however, the goal is to recover certain background traits of the recipient, MBLUP selection gives the best results: it not only achieved over 99% recovery of the recipient QTL alleles for the background traits after three generations of backcrossing (Fig.
3) but also resulted in the best genetic improvement of these traits (Fig. 4).

The recovery of the recipient genetic background is a major issue in marker-assisted introgression. Genomic similarity selection and marker index selection can ensure the recovery of the entire genetic background of the recipient. However, they require information on markers distributed over the whole genome. Genotyping all these markers may be very expensive, so these methods may not be feasible in animal breeding practice. In some cases, the major concern is not the whole genetic background but certain economically important traits of the recipient. In these cases MBLUP may be a good choice: it uses only the markers linked to the QTLs affecting the traits of interest and is therefore much less expensive. Moreover, MBLUP can lead to the desired genetic improvement of these traits.

The efficiency of MAI is affected by many factors in addition to the methods of foreground and background selection. These include population size, marker allele frequencies in the donor and recipient populations (the marker alleles may not be fixed, as was assumed in this study), and the number of marker alleles (which may be more than the two assumed in this study). Different situations may require different selection strategies. Furthermore, in some cases it may be desirable to introgress favorable alleles at two or more QTLs from one donor line or from different donor lines. Although the principles described in this article for the introgression of one QTL allele still apply, some modification of the selection strategies will be required. Further studies are needed to address these situations.

References:
[1] Hillel J, Schaap T, Haberfeld A, Jeffreys A J, Plotzky Y, Cahaner A, Lavi U. DNA fingerprints applied to gene introgression in breeding programs. Genetics, 1990, 124: 783-789.
[2] Young N D, Tanksley S D.
RFLP analysis of the size of chromosomal segments retained around the Tm-2 locus of tomato during backcross breeding. Theor Appl Genet, 1989, 77: 353-359.
[3] Dekkers J C M. Commercial application of marker- and gene-assisted selection in livestock: Strategies and lessons. J Anim Sci, 2004, 82(E. Suppl.): E313-E328.
[4] Hospital F, Chevalet C, Mulsant P. Using markers in gene introgression breeding programs. Genetics, 1992, 132: 1199-1210.
[5] Groen A F, Timmermans M M J. The use of genetic markers to increase the efficiency of introgression--a simulation study. Proceedings of the XIX World's Poultry Congress, 1992, 1: 523-527.
[6] Groen A F, Smith C. A stochastic simulation study on the efficiency of marker-assisted introgression in livestock. Journal of Animal Breeding and Genetics, 1995, 112: 161-170.
[7] Visscher P M, Haley C S, Thompson R. Marker-assisted introgression in backcross breeding programs. Genetics, 1996, 144: 1923-1932.
[8] Hospital F, Charcosset A. Marker-assisted introgression of quantitative trait loci. Genetics, 1997, 147: 1469-1485.
[9] Frisch M, Bohn M, Melchinger A E. Comparison of selection strategies for marker-assisted backcrossing of a gene. Crop Sci, 1999, 39: 1295-1301.
[10] Frisch M, Melchinger A E. Marker-assisted backcrossing for simultaneous introgression of two genes. Crop Sci, 2001, 41: 1716-1725.
[11] Ribaut J M, Jiang C, Hoisington D. Simulation experiments on efficiencies of gene introgression by backcrossing. Crop Sci, 2002, 42: 557-565.
[12] Chaiwong N, Dekkers J C M, Fernando R L, Rothschild M F. Introgressing multiple QTL in backcross breeding programs of limited size. In: Proc 7th World Cong Genet Appl Livest Prod. Montpellier, France, 2002, 19-3.
[13] Lande R, Thompson R. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics, 1990, 124: 743-756.
[14] Zhou Y C, Wu W R, Qi J M. Approximate estimation of minimal sample size required for marker-assisted backcross breeding. Acta Genetica Sinica, 2003, 30(7): 625-630.
[15] Hanset R, Dasnoi C, Scalais S, Michaux C, Grobet L. Effets de l'introgression dans le génome Piétrain de l'allèle normal aux locus de sensibilité à l'halothane.
Genet Select Evol, 1995, 27: 77-88.
[16] Yancovich A, Levin I, Cahaner A, Hillel J. Introgression of the avian naked neck gene assisted by DNA fingerprints. Anim Genet, 1996, 27: 149-155.
[17] Gootwine E, Yossefi S, Zenou A, Bor A. Marker assisted selection for FecB carriers in Booroola Awassi crosses. Proc 6th World Cong Genet Appl Livest Prod. Armidale, Australia, 1998, 24: 161-164.
[18] Fernando R L, Grossman M. Marker assisted selection using best linear unbiased prediction. Genet Select Evol, 1989, 21: 467-477.
[19] Wang T, Fernando R L, van der Beek S, Grossman M, van Arendonk J A M. Covariance between relatives for a marked quantitative trait locus. Genet Select Evol, 1995, 27: 251-274.
[20] Liu H Y, Zhang Q, Zhang Y. Relative efficiency of marker assisted selection when marker and QTL are incompletely linked. Chinese Science Bulletin, 2001, 46(24): 2058-2063.
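Comparison of Different Foreground and Background Selection Methods in Marker-Assisted Introgression

BAI Jun-Yan 1,2, ZHANG Qin 2, JIA Xiao-Ping 3

1. College of Animal Science and Technology, Henan Science and Technology University, Luoyang 471003; 2. College of Animal Science and Technology, China Agricultural University, Key Laboratory of Animal Genetics and Breeding of the Ministry of Agriculture, Beijing 100094; 3. College of Biological Science, China Agricultural University, Beijing 100094

Abstract: Marker-assisted introgression is an important application of molecular genetic information in animal breeding. Its aim is to introgress, with the help of marker information, one or more favorable genes from one breed (the donor) into another breed (the recipient) while retaining the original genetic background of the recipient population as far as possible. The process consists of three phases. The first is crossing: the donor and the recipient are crossed to produce F1 individuals. The second is backcrossing: the F1 individuals and the progeny of each subsequent generation are repeatedly backcrossed to the recipient so that the recipient genetic background is recovered. The third is intercrossing: the individuals obtained after repeated backcrossing are mated with each other to obtain individuals homozygous for the donor gene, so that the gene is fixed in the population. In both the backcross and intercross phases, the individuals to be mated must be selected, and this selection comprises foreground selection and background selection. Foreground selection is selection on the donor gene: individuals carrying the donor gene are chosen for mating so that the gene is not lost during backcrossing and is fixed as quickly as possible during intercrossing. Background selection is selection on the recipient genetic background: individuals carrying a higher proportion of the recipient genome are chosen for mating to accelerate the recovery of the recipient background. In this study, different foreground and background selection methods were compared by computer simulation. The three foreground selection methods were direct selection on the donor gene (assuming that the gene can be genotyped directly), indirect selection using a single linked marker, and indirect selection using two flanking markers. The four background selection methods were random selection, genomic similarity selection, index selection, and marker-assisted BLUP (MBLUP) selection. The results showed that, for foreground selection, direct selection on the donor gene kept the gene at a stable frequency (0.5) in every backcross generation and fixed it rapidly in the intercross phase (2 generations). Indirect selection with two flanking markers gave similar results, but selection with only a single linked marker caused the frequency of the donor gene to decline during backcrossing and prevented its fixation during intercrossing. For background selection, if the ultimate goal is to recover the recipient genetic background completely, genomic similarity selection or marker index selection is the best method: both recovered more than 98% of the recipient background after 3 backcross generations, whereas random selection or MBLUP selection needed at least 5 backcross generations to reach that level. If the ultimate goal is only to recover certain favorable traits of the recipient, MBLUP selection is the recommended method: it raised the frequency of the recipient genes affecting these traits above 99% after 3 backcross generations and also gave the largest genetic progress in these traits over the whole introgression process. Marker index selection gave similar results, but MBLUP is much cheaper by comparison and more feasible in practice.

Key words: marker-assisted introgression; foreground selection; background selection; MBLUP

About the author: BAI Jun-Yan (1975-), female, PhD, associate professor; research interests: animal molecular quantitative genetics and animal breeding. E-mail: junyanbai@163.com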