Construction of dense linkage maps on the fly using early generation wheat breeding populations

Size: px
Start display at page:

Download "Construction of dense linkage maps on the fly using early generation wheat breeding populations"

Transcription

1 DOI /s Construction of dense linkage maps on the fly using early generation wheat breeding populations J. T. Eckard J. L. Gonzalez-Hernandez S. Chao P. St Amand G. Bai Received: 4 March 2014 / Accepted: 15 May 2014 Ó Springer Science+Business Media Dordrecht 2014 Abstract In plant species, construction of framework linkage maps to facilitate quantitative trait loci mapping and molecular breeding has been confined to experimental mapping populations. However, development and evaluation of these populations is detached from breeding efforts for cultivar development. In this study, we demonstrate that dense and reliable linkage maps can be constructed using extant breeding populations derived from a large number of crosses, thus eliminating the need for extraneous population development. Using 565 segregating F 1 progeny from 28 four-way cross breeding populations, a linkage map of the hexaploid wheat genome consisting of 3,785 single nucleotide polymorphism (SNP) loci and 22 simple sequence repeat loci was developed. Map estimation was facilitated by application of mapping algorithms for general pedigrees Electronic supplementary material The online version of this article (doi: /s ) contains supplementary material, which is available to authorized users. J. T. Eckard J. L. Gonzalez-Hernandez (&) Department of Plant Science, South Dakota State University, Brookings, SD, USA jose.gonzalez@sdstate.edu S. Chao USDA, Agricultural Research Service, Fargo, ND, USA P. St Amand G. Bai USDA, Agricultural Research Service, Manhattan, KS, USA implemented in the software package CRI-MAP. The developed linkage maps showed high rank-order concordance with a SNP consensus map developed from seven mapping studies. Therefore, the linkage mapping methodology presented here represents a resource efficient approach for plant breeding programs that enables development of dense linkage maps on the fly to support molecular breeding efforts. Keywords Linkage mapping Consensus map Pedigree analysis Wheat breeding High-throughput genotyping Introduction Genetic linkage maps, consisting of linked marker loci ordered along chromosomes, provide the essential framework for identifying genomic regions involved in trait expression and detection of marker-trait associations to enable molecular breeding. For plant species, linkage mapping has largely been confined to an experimental paradigm, in which a purpose-built population derived from a cross between two inbred lines is used for map construction. These experimental mapping populations have been attractive tools for genetic mapping due to their simplicity of development, power for quantitative trait loci (QTL) detection and applicability of available mapping algorithms and user-friendly software implementations (Doerge

2 2002). Despite these attractive attributes, experimental mapping populations have important limitations for the development of genetic linkage maps owing to their narrow genetic base and detachment from applied breeding efforts (Crepieux et al. 2004). The increasing availability of high-throughput marker genotyping platforms (Kilian et al. 2003; Akbari et al. 2006; Akhunov et al. 2011; Allen et al. 2011; Deschamps et al. 2012) provides a means for the development of dense linkage maps (Bowers et al. 2012). However, the narrow genetic base of conventional mapping populations means that only a small fraction of these marker loci are polymorphic and thus informative for mapping in any given population. Development of a linkage map with dense marker coverage has therefore required integration of maps from several experimental populations to form a consensus map (Wu et al. 2008). However, the process of developing and genotyping several large populations that have no direct contribution to cultivar development is inefficient from the perspective of an applied breeding program. Furthermore, conventional mapping algorithms can only facilitate joint likelihood estimation of the linkage map if each of the constituent populations is of the same structure (Wu et al. 2008). Therefore, consensus mapping is computationally inefficient since it often relies on interpolation of disparate estimates from each constituent population. To overcome the limitations of conventional mapping populations, researchers have proposed using broad-based populations derived from multi-parent crosses. For example, the maize nested association mapping (NAM) population consists of 5,000 recombinant inbred lines derived by crossing 25 diverse founders to a single common founder (Yu et al. 2008). Early generation four-way cross populations have been used for genetic mapping in cotton and wheat, facilitating the development of higher density genetic maps relative to biparental populations (Trebbi et al. 2008; Qin et al. 2008). This concept of developing nway intercross mapping populations has been extended to recombinant inbred lines, with such populations referred to multi-parent advanced generation intercross (MAGIC) populations (Cavanagh et al. 2008). A MAGIC population of over 1,500 RILs from a four-way cross among elite wheat cultivars was used to map 1,162 simple sequence repeat (SSR), single nucleotide polymorphism (SNP) and DArT loci (Huang et al. 2012) and later used to map 4,300 SNPs (Cavanagh et al. 2013). MAGIC populations have also been developed for rice using eight-way crosses among elite indica and japonica lines (Bandillo et al. 2013). These rice populations enabled the identification 17,387 polymorphic SNP loci using genotype-bysequencing methods (Bandillo et al. 2013). The preceding results indicate that multi-parent mapping populations, also referred to as second generation (Rakshit et al. 2012) or next generation (Morrell et al. 2012) mapping populations, provide a powerful resource for the construction of dense linkage maps. A major motivation for utilizing these multi-parent mapping populations is that they more closely resemble the broad genetic base and multiallelic/multi-genic inheritance of breeding populations and thus provide more direct inference for QTL mapping applications (Holland 2007). However, considerable time and resources are required for the development and evaluation of these complex experimental populations, which detracts from applied breeding efforts for cultivar development. For example, development of an n-way cross for deriving a MAGIC population requires n/2 generations of intercrossing to generate the base population, followed by 6 7 generations of inbreeding to develop the RILs (Rakshit et al. 2012). Plant breeders are continually developing a large number of segregating populations through carefully planned crosses among numerous elite parents. Primary segregating populations from these crosses consist of relatively small sibships that have yet to be subjected to intense selection by breeders. Collectively, these early generation breeding populations represent a substantial pool of informative genetic recombinations that can be used for the development of dense genetic maps. Linkage maps developed using a large number of early generation breeding populations should therefore provide comparable marker density and reliability to those developed using consensus mapping or multi-parent mapping populations. Thus, utilizing existing populations available in plant breeding programs for the purpose of genetic linkage mapping should provide an efficient alternative to development of experimental mapping populations. In combination with high-throughput and nondestructive genotyping technologies, this strategy would allow breeders to develop dense linkage maps on the fly to support their molecular breeding efforts.

3 Development of linkage maps using plant breeding populations requires mapping algorithms that can handle partially informative marker data, ambiguous linkage phases and simultaneous analysis of numerous small sibships of arbitrary population structure. Due to the prevailing experimental paradigm, such generalized mapping algorithms are not incorporated into software used for linkage mapping in plant populations (Cheema and Dicks 2009). However, these mapping algorithms are commonly used for linkage analysis in humans and animals populations, where purpose-built populations are not available. Therefore, mapping algorithms used for multi-point linkage analysis in general pedigrees (Lander and Green 1987; Weaver et al. 1992) can be adopted for the purpose of constructing linkage maps in plant breeding populations. In this study, we apply multi-point linkage analysis of general pedigrees to develop a dense genetic linkage map from a 9,000 SNP array using early generation wheat breeding populations. Our objectives are to (1) confirm that dense linkage maps can be obtained using primary segregating populations from breeding programs and (2) assess the accuracy of such derived linkage maps by evaluating their concordance with a recently released SNP consensus map of the wheat genome. Materials and methods Plant materials Segregating F 1 populations were developed from 28 four-way crosses among 10 winter wheat founder lines (Table 1). These wheat breeding populations were developed for the purpose of pyramiding resistance loci for Fusarium head blight. Founders included two backcross derived lines Wesley-Fhb1-BC06 and Wesley-Fhb1-BC56 (Wesley/2*ND2928), an experimental line AL (Alsen/NE00403//NE ), hard winter wheat cultivars Lyman (KS93U134/Arapahoe), Overland (Millennium sib//seward/archer), NE06545 (KS B-I5-1/Alliance), NI08708 (CO980829/Wesley), McGill (NE92458/Ike), and soft winter wheat cultivars Ernie (Pike/MO9965) and Freedom (GR876/OH217). A total of 565 four-way F 1 plants were derived from the 28 four-way crosses, with an average of 20 four-way F 1 plants per cross (Table 1). Founder lines and four-way F 1 plants were vernalized and then transplanted as individual plants in inch pots in a greenhouse. DNA extraction Approximately 2 g of healthy leaf tissue was collected from each founder and four-way F 1 plant. For founder lines, tissue samples from multiple plants were pooled into a single sample. Leaf tissue was transferred to liquid nitrogen immediately upon collection and subsequently stored at -80 C to prevent degradation. DNA was isolated from the leaf tissue using a midiprep phenol/chloroform extraction protocol adapted from Karakousis and Langridge (2003). Briefly, leaf tissue was flash frozen in liquid nitrogen and then ground to a fine powder using a mortar and pestle. Ground leaf tissue was then mixed with 5 ml of DNA extraction buffer (1 % n-lauroylsarcosine, 100 mm Tris-base, 100 mm NaCl, 10 mm EDTA, 2 % polyvinyl polypyrrolidone, ph 8.5) and 5 ml of phenol/chloroform/isoamyl alcohol 25:24:1 saturated with 10 mm Tris (ph 8.0) for nucleic acid separation. After mixing and centrifugation, the supernatant was transferred to a 10:1 solution of isopropanol and sodium acetate for overnight precipitation of nucleic acids. Precipitated nucleic acid was pelletized by centrifugation and washed with 70 % molecular grade ethanol to remove salts. After drying, the pellet was suspended in 10 mm Tris buffer (ph 8.0) containing 40 lg/ml of RNase A. Genotyping Founder lines and all 565 four-way F 1 plants were genotyped at 26 polymorphic simple SSR marker loci. SSR genotyping was conducted at the USDA-ARS Hard Winter Wheat Genetics Research Unit, Manhattan, KS. PCR was conducted in 14 ll PCR, consisting of 40 ng of template DNA, 0.1 lm of each primer, 0.2 mm of each dntp, 19 ammonium sulfide PCR buffer, 2.5 mm MgCl 2 and 0.6 unit of Taq polymerase. A touchdown PCR program described by Zhang et al. (2012) was used for amplification. Differentially labeled primers were used for fluorescence detection of SSR amplicons from multiplex PCR as described by Zhang et al. (2012). PCR products were separated and detected using an ABI Prism 3730 Genetic Analyzer, and allele calls were made from resulting fluorescence

4 Table 1 Summary of breeding populations used for linkage map development, including the pedigrees, number of progeny and number of plants used for SNP genotyping Population Pedigree Four-way F1 plants SNP genotyped 01 Wesley-Fhb1-BC56/NE06545//Ernie/Overland Ernie/Wesley-Fhb1-BC06//Ernie/NE Ernie/Wesley-Fhb1-BC06//Lyman/AL Ernie/Wesley-Fhb1-BC56//Ernie/Lyman Ernie/Wesley-Fhb1-BC56//NI08708/Lyman Ernie/Lyman//Ernie/Wesley-Fhb1-BC Ernie/Overland//Freedom/Wesley-Fhb1-BC Ernie/Overland//Overland/Wesley-Fhb1-BC Ernie/Overland//NI08708/Wesley-Fhb1-BC Ernie/NE06545//McGill/Wesley-Fhb1-BC Ernie/McGill//Lyman/Wesley-Fhb1-BC Freedom/Wesley-Fhb1-BC06//Ernie/Overland Freedom/Wesley-Fhb1-BC06//Lyman/AL Freedom/Wesley-Fhb1-BC06//Overland/Wesley-Fhb1-BC Freedom/Wesley-Fhb1-BC56//Ernie/NE Freedom/Ernie//Overland/Wesley-Fhb1-BC Freedom/Ernie//NI08708/Wesley-Fhb1-BC Freedom/Overland//Lyman/AL Freedom/NI08708//Wesley-Fhb1-BC56/NE AL /Overland//Lyman/Wesley-Fhb1-BC AL /Overland//NI08708/Lyman Lyman/Wesley-Fhb1-BC56//Ernie/Lyman Lyman/Wesley-Fhb1-BC56//NI08708/Lyman Overland/Wesley-Fhb1-BC56//Ernie/Lyman Overland/Wesley-Fhb1-BC56//Ernie/NE Overland/McGill//Lyman/Wesley-Fhb1-BC NI08708/Wesley-Fhb1-BC06//Ernie/NE NI08708/Lyman//Overland/Wesley-Fhb1-BC Total population peaks using GeneMarker version 1.6 (SoftGenetics, LLC). SSR marker genotypes were scored after visual assessment of fluorescence profiles to correct erroneous and ambiguous allele calls. A subset of 18 populations consisting of 372 fourway F 1 plants (Table 1) were genotyped for approximately 9,000 SNP marker loci. SNP genotyping was conducted using an Infinium 9,000 SNP iselect Beadchip assay developed for wheat (Cavanagh et al. 2013). The assay was performed using the Illumina BeadStation and iscan instruments at the USDA-ARS Biosciences Research Laboratory, Fargo, ND. GenomeStudio version (Illumina) was used for cluster analysis and SNP genotype calling. The minimum GenTrain score (a measure of the reliability of SNP calling based on cluster distribution) was reduced to 5 in GenomeStudio to facilitate delineation of compressed but unambiguous SNP clusters. Genotype clusters were then visually assessed for each SNP and manually revised to improve genotype calling. SNP loci represented by more than three genotypic clusters, and those SNP loci with[20 % deviation from the expected heterozygote frequency under Mendelian segregation were excluded from the analysis to avoid complications of polyploid inheritance. Mendelian inheritance errors for both SSR and SNP loci were detected using the prepare function of

5 CRI-MAP version (Green et al. 1990). Genetic impurities in the founders were diagnosed as outliers (i.e., samples with a low GenTrain score) within the respective homozygote cluster and by unexpected segregation patterns. For those cases where genotyping errors could not be rectified, including cases where a founder line conferred more than one allele at a locus within the same population, the genotypic data were replaced with missing values. Linkage mapping Linkage analysis was performed using the software package CRI-MAP version (Green et al. 1990). CRI-MAP provides an interactive environment for multi-point maximum-likelihood estimation of linkage maps in general pedigrees. For the purpose of constructing the required pedigree data, each four-way cross was considered as a separate pedigree. The assumption of independence among pedigrees was made without any loss of generality, since the founder lines were phase known. The sex designation of founders and single-cross F 1 hybrids was generally assigned corresponding to the actual crossing scheme, while four-way F 1 plants were arbitrarily designated as females. All linkage analysis in CRI-MAP was conducted on an IBM M2 server with 24 processor cores and 128 GB of RAM. Maximum-likelihood estimates of pairwise recombination fractions among marker loci were obtained using the twopoint option of CRI-MAP. Pairwise recombination fractions and associated LOD scores were exported to JoinMap version 4.0 (van Ooijen 2006) for identification of linkage groups. Marker loci were hierarchically clustered in JoinMap based on independence test LOD scores (van Ooijen 2006). Linkage groups were designated as the hierarchical nodes beyond which no significant disaggregation occurred. This point was reached at LOD scores ranging from 1 to 35.0 for different linkage groups. Cross linkage statistics computed by JoinMap were used to combine fragmented linkage groups and unassigned marker loci. Chromosomal assignment of the linkage groups was enabled by cross referencing loci from each linkage group with available SSR and expressed sequence tag (EST) mapping data. Pairwise recombination fractions were then used to cluster marker loci with recombination fractions \01 into genetic bins. From each genetic bin, the locus with the greatest number of informative, phaseknown meioses was identified and considered to be the primary locus representing the genetic bin. These primary loci were used to estimate a linkage map of uniquely ordered loci, prior to incorporating the remaining loci to derive the final linkage map. This strategy reduced the number of possible marker orders that had to be initially interrogated, thus increasing the efficiency of the mapping algorithm. Linkage maps were constructed using the CRI- MAP build option. For each chromosome, several pairs of highly informative primary loci with a recombination fraction of were selected to initialize the map order. A framework linkage map was then constructed from each selected pair of loci by sequentially incorporating the remaining primary loci in decreasing order of informativeness. For this first round of map development, only those loci that mapped to an interval with a likelihood ratio 1,000:1 compared to all other intervals were retained in the map. The resulting set of linkage maps was compared, and the most complete map with the highest likelihood was retained for further development. The stringent likelihood threshold and interrogation of multiple initial map orders provided a reliable framework of highly informative markers for subsequent rounds of map development. Successive rounds of map development were performed to incorporate the primary loci that could not be uniquely ordered in the first round. The likelihood threshold for incorporation of loci was reduced between each round of map development. Specifically, the successive rounds of map development were conducted using likelihood thresholds of 100:1, 10:1 and 2:1. Between each round of map development, the CRI-MAP flips4 option was used to test the 24 possible permutations for each set of 4 consecutive loci in the current map order. Any local rearrangements that increased the likelihood were used to revise the current map order prior next round of map development. Any loci that mapped to end of the chromosome with a distance of[30 cm to the nearest locus were removed from the map and were assumed to either be misclassified to the chromosome or belong to unlinked regions of the same chromosome. After determining the most likely order of the primary loci, the CRI-MAP chrompic function was used to detect loci and individuals resulting in unlikely recombination patterns. Double recombination events

6 between a set of 3 consecutive loci were considered to be the result of genotyping errors, and these data were replaced with missing values. Double recombinations between loci separated by uninformative regions were retained. Unlikely crossover events that were prevalent at a locus within a specific pedigree were considered to be the result of genotyping errors on the founder lines, and the data for the entire pedigree were replaced with missing values for the locus in question when the errors could not be rectified. Finally, the remaining loci within genetic bins were incorporated into the linkage map. Unmapped loci having a zero estimated recombination fraction with a primary locus were incorporated as a haplotyped system, with the primary locus. CRI-MAP only considers the primary locus in each haplotyped system when evaluating marker orders, whereas all loci were used in likelihood calculations (Green et al. 1990). Genetic distances were not forced to zero between markers within haplotyped systems. Unmapped loci with nonzero estimated recombination fractions with the primary locus were incorporated into the map order to the right of the primary locus, in decreasing order of informativeness. The CRI-MAP flips4 option was then iteratively used to permutate the local marker orders until no higher likelihood map order could be obtained. The final linkage maps were charted using MapChart version 2.2 (Voorrips 2002). from the TCAP consensus map to visually compare patterns of recombination and locus ordering, as well as diagnose causes of poor concordance. Results Of the approximately 9,000 SNPs assayed, 3,977 were polymorphic and produced clusters that facilitated reliable scoring of SNP genotypes. Additionally, 22 of the 26 SSR loci amplified a product that could be reliably scored, resulting in a total of 3,999 informative loci for subsequent linkage analysis. The average number of informative meioses per SNP locus was 320 and ranged from 20 to 604 (Fig. 1). Hierarchical clustering of these loci in JoinMap resulted in the identification of 31 linkage groups. Cross linkage statistics provided by JoinMap combined with previous SSR and EST mapping data enabled the combination of these linkage groups and unassigned marker loci into 21 groups, putatively representing the 21 wheat chromosomes. Evaluation of recombination fractions identified a total of 1,269 unique genetic bins, with an average bin size of 5 loci. Therefore, 67 % of the interrogated loci were found to be colocalized, resulting from tight linkage among the marker loci as well as markers that interrogated the same locus. Comparatively, only 38 % of these SNP Concordance analysis A consensus map of the wheat genome has been developed for the iselect 9,000 SNP assay through the Triticeae Coordinated Agricultural Project (TCAP) as described by Cavanagh et al. (2013). The TCAP consensus map incorporates a four-way MAGIC population and 6 biparental mapping populations with a combined population size of 2,486 fixed lines. The TCAP consensus map was used as the standard for evaluating the accuracy of linkage maps developed in this study. For each chromosome, Spearman s rankorder correlation coefficient was computed as a measure of concordance for locus ordering between the TCAP consensus map and the linkage map developed in this study. For each linkage map, relative genetic distances were computed as the locus position in cm divided by the total map length in cm. The relative genetic distances estimated from the breeding populations in this study were plotted against those Fig. 1 Distribution of the number of informative meioses for polymorphic SNP loci. Primary loci are the most informative loci from each genetic bin used for initial map development, whereas secondary loci are the remaining loci in the genetic bins

7 Table 2 Summary of the estimated genetic maps for each chromosome Chromosomes Loci mapped Centimorgans Recombination Total Genetic bins Total Mean interval Informative meioses Observed crossovers Singletons (\20 cm) 1A ,086 1, B , D , A ,299 1, B , D , A ,134 1, B ,886 1, D , A , B , D , A ,824 1, B ,417 1, D , A , B , D a , A ,964 1, B ,273 1, D a , Overall 3,875 1,269 3, ,143 18, a The chromosome was represented by multiple linkage groups markers were colocalized on the TCAP consensus map. Primary loci from each of the genetic bins provided approximately 437,000 of the uniquely informative data points, from which over 18,000 recombination events could be observed (Table 2). Linkage maps estimated from the breeding populations are summarized in Table 2 and depicted in Fig. 2a g. The estimated linkage maps included 3,875 loci and covered a total genetic distance of 3,080 cm, with an average interval of 2.5 cm between genetic bins. Marker coverage was relatively poor for the D genome. After curating the data to remove doublecrossover events among consecutive marker trios, a total of 184 singletons remained within partially informative regions of 20 cm or less. This number of observed singletons represents a double-crossover rate of 4 % within a span of 20 cm, which is consistent with the expected maximum recombination rate (0.2 2 = 4). Therefore, the majority singletons remaining in the data were assumed to result from genuine recombination events. Poor marker coverage on the D genome resulted in multiple linkage groups per chromosome for both the TCAP consensus map and the map estimated in this study. Therefore, only the A and B genomes were used for analysis of concordance with the consensus map. Linkage maps of the A and B genomes developed from the breeding populations in this study exhibited high concordance with the TCAP consensus maps (Fig. 3a c). Excluding chromosomes 2B and 6B, the average rank-order correlation with the consensus maps was 0.98, indicating a high level of agreement regarding the locus ordering between the two sets of linkage maps. Chromosome 6B had the lowest rank-order correlation with the consensus map (0.52), due to a large centromeric inversion of the locus order (Fig. 3b). The linkage map for chromosome 2B had a region of highly suppressed centromeric recombination compared to

8 A 1A 1B 1D 4643, , 1602, , 8622, , 1375, , 7385, 6110, 1566, (9) , 1934, 4579, 5509, (5) , 1387, , 4163, 4644, , , 2656, 710, 7421, (8) , 6887, 7879, 7922, (7) , 1991, , 1582, 1583, 7115, (8) 1811, 2057, 2247, 2289, (48) 1078, 2630, 3473, 356, (22) 3144, , 2982, 3532, 3533, (24) 7021, 4326, , 4126, 268, 5839, (8) , 3134, 3891, 4116, (7) 1594, 3475, 3613, 3820, (11) 7956, 2584, 5138, 6609, (7) 6708, 5174, , 163, 6553, 7871, (13) 2540, 4511, , 2488, 2490, 2484, (9) , 7145, 3859, 5493, (6) 7144, , , 4537, , 578, , 3434, 3435, 3804, (5) 3060, , , 3146, 2404, 2405, (5) 1619, 6253, 735, 2818, (20) , 4978, , 4021, , , 8051, 2035, 5318, (10) , 4944, 5407, 1710, (5) 4518, , , 6916, , 4121, 4, 4120, (5) , , , , , , 63, , 3443, 7480, 1578, (9) , 2578, , 7280, 6703, 4389, (8) 131, 15, 6290, 353, (6) , 5348, 8081, 8082, (9) 128, 7721, 2517, 3502, (26) 7017, 4556, , 491, 3945, 269, (14) 5076, , , 4680, 5962, 5229, (26) , , , 7037, 255, 2308, (8) 5186, , , , 3095, 5448, 3096, (5) , 7141, , 5447, , , , , , , 4694, 545, , 6511, 2928, , , 6500, , 7533, , 5019, 5018, 5020, (6) , 830, , , 2021, 5235, 2341, (6) 6186, 1736, 3014, 3058, (26) 4886, 7838, , 4938, 7675, 3465, (9) 4960, 2043, 4024, 5514, (21) , 7023, 8485, 5541, (5) , , 3548, 3547, 3549 Fig. 2 a Estimated genetic maps for homeologous group 1 chromosomes. Genetic bins are followed by the number of constituent markers in parentheses if[4. See supplemental data for a complete list. b Estimated genetic maps for homeologous group 2 chromosomes. Genetic bins are followed by the number of constituent markers in parentheses if [4. See supplemental data for a complete list. c Estimated genetic maps for homeologous group 3 chromosomes. Genetic bins are followed by the number of constituent markers in parentheses if[4. See supplemental data for a complete list. d Estimated genetic maps for homeologous group 4 chromosomes. Genetic bins are followed by the number of constituent markers in parentheses if [4. See supplemental data for a complete list. e Estimated genetic maps for homeologous group 5 chromosomes. Genetic bins are followed by the number of constituent markers in parentheses if [4. See supplemental data for a complete list. f Estimated genetic maps for homeologous group 6 chromosomes. Genetic bins are followed by the number of constituent markers in parentheses if [4. See supplemental data for a complete list. g Estimated genetic maps for homeologous group 7 chromosomes. Genetic bins are followed by the number of constituent markers in parentheses if[4. See supplemental data for a complete list

9 B 2A 2B 2D , , 6922, , 2425, 2428 barc , , 3469, , 8274, 422, 423, (5) gwm , 2059, wmc , , 6477, 6727 wmc , 2067, 2731, 6565, (6) 6564, , 5893, 5023, , , 5824, , 2531, 3569, 4026, (8) 7248, 1597, 4410, 5585, (10) 5378, 534, 2758, , 2733, 3368, 5092, (8) 3839, , 1174, 488, , 5574, 5640, 6090, (7) , 200, , , , 3988, 6503, 7540, (18) , 2157, , , , , 8041, 5685, , 4336, 7727, , , , 5007, 7761, , , , , , 7143, , 1350, 1347, , 6963, , , 5879, 3687, 7327, (6) , , , , , 4808, 7656, ,3589, ,2846, , 2110, 2088, 2111, (6) wmc , 2440, 2442, ,2441, , 2115, 5410, 4284, (7) 1931, 5737, 1929, , 4420, 4421, 6739, (8) , 1210 gwm , ,1204,429, , 5811, 7076, 4388, (5) 6209,2766,535, , , 5506, 2237, 5397, (6) 3924, , 3908, 7661, 1763, (5) 4720, , , 1102, 1127, 1128, (25) 7951, 8244, 6875, 4983, (12) 1981, 4303, 561, 3452, (5) 2253, 2236, 2184, 5254, (15) , 671, 3236, 6076, (103) , , 2678, 3395, 5850, (8) 7909, , 4256, 4956, 652, (29) , 4357, 4890, 4948, (8) 3148, 1707, 5051, 1291, (6) ,3233, , 5675, 3937, , 6778, 1599, 2052, (9) 7113,7112, , 3982, 8504, , 1668, 4118, 3773, (7) , 7335, 5694, , , , , , , 7273, 1975, , , , 5637, , 1083, 5205, 5630, (15) , 5072, 4229, 6840 Fig. 2 continued

10 C 3A 5443, , 5427, 4927, 5430, (8) , 2993, , , , 8106, , 5642, , 4335, 4333, , 8526, 6334, 1812, (9) , 2095, , 5444, , , 2156, 1922, 4912, (5) , , , 6923, 7010, 3069, (5) , 3156, 4397, 462, (13) , 234, , 2750, 2886, 3930, (13) 3836, 6170, , , 133, 743, , 7541, 2801, 5994, (14) 3250, , 1701, 1536, 1700, (22) 5286, 7159, 5285, 1678, (11) , 4926, 4851, 1260, (5) , 5596, 8455, 1892, (6) , , 6997, , 3524, , 5213, 5982, 5212, (5) 524, 799, 523, 2029, (5) , 1207, , , , , 7835, , 7999, 2396, 6716, (5) , 1457, 602, B gwm389, barc , 758 gwm , 289, , , 280, 4625, 5618, (5) , 3425, , 2663, 4054, , 3117, 3716, 347, (5) , 3150, 3149, , , 3731, 3732, , 4838, 3390, , 2408, , 2119, 7748, 1416, (7) 4040, 4310, 7860, 4039, (5) 4755, 7294, 6165, 6243, (14) 380, 376, 5351, 729, (6) 210, 377, , 91 gwm285, , 6814, 6655, , 1607, 7247, 4193, (10) 6793, 2329, 6677, 4452, (15) , 5613, , 3710, 3711, 5101, (10) , 5677, , 628, , 4653, 5710, 4218, (8) , barc164, 4156, , 6222, 1148, , 3306, , 1467, 7225, 1196, (8) 6223, 5805, 6158, , , , 2399, 2510, , 2361, 3833, , 7870, , 1333, , 3244, 3834, , 3402, , 2721, 2661, , 7889, 2620, 3455, (11) , , , 5941, 6375, 6299, (6) 2951, 3592, 1095, 3591, (6) 1731, 3170, 6002, 4498, (10) 3332, , 5014, 5015, 4778, (5) barc , 785, , , , D , 7468, 5695, 1321, (5) , , 4683, , 1715, 7274, 8578, (7) 153, 1624, 4877, 3012, (7) 5230 Fig. 2 continued the consensus map, which resulted in shuffling of locus orders and a reduced rank-order correlation with the consensus map (0.72). The majority of the linkage maps estimated from the breeding populations showed regions with reduced recombination relative to the TCAP consensus maps. These lower estimates of recombination were typically centromeric, which resulted in s-shaped trends when plotted against the consensus map positions (Fig. 3a c). However, the differences in estimated

11 D 4A 4B 4D , 4858, 7365, 4084, (5) , 7364, 1836, 1835, (5) , 4030, , , , 2298, , wmc710, , , , 3188, 5152, 8168, (5) , , 2384, , , , 6883, , 3728, , 2581, , , , 2585, 2901, 3191, (9) , 1728, 1992, 4252, (5) 4261, 4260, 3902, 4480, (6) , 8389, gwm495, 2963, 3400, 3874, (7) , , , 8163, 3241, , 4348, , , 2532, 7437, 7168, (8) 2666, 3846, 4070, 4330, (11) , 907, 3074, 7752, (5) , 3041, 3042, 3038, (5) 3040, , , 7202, 3697, , , 5408, 6397, 2595, (5) wmc , 3779, 3781, , Fig. 2 continued recombination frequencies did not greatly affect the relative ordering of the loci. Discussion The application of mapping algorithms developed for general pedigrees to existing breeding populations facilitated the development of a 3,875 locus linkage map of the wheat genome without the need for extraneous population development. High-throughput SNP genotyping of 18 crosses comprised of a total of 372 four-way F 1 individuals enabled the mapping of over 43 % of the interrogated loci. These results indicate that a collection of breeding populations, derived from crosses among numerous parents, provide a highly polymorphic and informative genetic resource for the development of linkage maps. Comparatively, Cavanagh et al. (2013) were able to map roughly 46 % of the SNP loci from the same 9,000 SNP assay using a four-way cross MAGIC population consisting of 1,579 recombinant inbred lines. Biparental mapping populations used for consensus map individually enabled mapping of

12 E A 2642, , 2558, , , 2838, , , , , , , 2113, 2856, 2858, (10) , 3702, 3704, 3703, (7) 674, 675, 7014, , , , 6681, 1439, , , 5623, , 4391, 4390, , 1343, 7833, , , 5668, , , , , , , 4670, 4668, 4667, (5) 4914, 1685, , 3283, 3645, , , 2445, 6747, 7436, (6) , 8308, , 3201, 5726, 6899, (21) , 850, , 2926 gwm156, 122, , 5529, , 5034, , , 5539, 5689, 6, (7) , , 7960, , 7961, 7690, 7925, (7) 7130, 7129, , 2463, 2814, 3212, (37) , 3100, 7565, 1301 barc56, 5431, 3445, 1465, (6) 7109, 5496, , 4454, 2120, , , 3190, 8154, , cfa , , 3811, 2378, , 5615, , 4325, 7801, 6463, (6) , , , , 3567, 5924, , 2142, 480, 2143, (7) B 6579, 2682, 2681, 6578, (5) 4903, 2931, 5802, 4329, (5) , 3360, , , 22, , , , 5454, 4634, 7708, (5) , 3972, 6393, , 4762, 2659, , 6779, 1433, 6147, (5) , 7965, , , 1441, 1443, , 7668, 7963, , 7989, 4820, 5552, (5) , 2565, 3226, 3432, (9) , 8375, 4211, 7735, (9) , 6894, , 3641, , , , 2335, 2336, 8187, (6) 7485, 7514, 4641, , , 4632, 3964, 337, (10) , 5283, 2453, 2454, (8) 6383, 3025, 1374, 822, (7) 2536, 2934, 2432, , , 7562, 2180, , 3436, 5485, 5486, (8) 6992, 7, 7944, 4533, (5) , 301, 1584, 5289, (5) 2791, 5845, 6980, , 7127, 6689, 5279, 1777, (7) , 1705, 1706, 2071, (20) 5764, 2742, , 7613, 3682, , 6429, 1461, 279, (17) , 6521, 6908, 5494, (8) 4377, 4378, 4379, 1176, (9) 1342, 1965, , , 4281, 4282, 5108, (5) 6773, 7227, , 3706, 4862, , 4414, 8005, 2596, (6) 2609, , 7209, 7211, 1143, (7) 4355, , 332, , 1709, , 3606, , 4416, 4415, 4790, (7) 1719, 7183, 7254, D , 5012, 4550, , 6190, , 6872, , 6060, 6061, , , 7915, , , 1428, 1427, 1429 Fig. 2 continued

13 F 6A 6B 6D , 7913, , 6711, , , 680, 6630, , 6013, , 2017, , 6806, , 3700, 3866, 4738, (11) , 3527, 8110, 6820, (9) 6928, 6560, 6559, , 2186, 2187, 2421, (15) 1423, 1285, 1606, 2033, (7) , 3463, 6962, 3483, (7) , , 1514, 4929, 653, (34) , 3024, 2241, 3023, (18) 259, , 4950, 20, 19, (10) , , 5704, 504, , , 8438, 8222, , , 2639, 7894, , , 5655, 6537, 6316, (10) , 5767, 6305, , 5747, 2603, 5768, (6) , 7497, 3205, 7495, (6) 2795, 4699, 3247, 3203, (5) , 8477, , , , 1901, 5857, 4610, (6) 4290, 7725, , 4760, , , 2439, 2888, 5888, (12) , 1905, 5056, , , , 2417, 4824, 1655, (16) , 4485, 5625, 1017, (11) 2039, 6660, 4383, , , 5045, 5044, , 3869, , 7954, 2927, 5197, (13) , , 4202, 5170, , 7487, 6904, 657, (6) 4501, 4502, 4503, 755, (13) , 5346, 387, 4440, (5) 1151, 1743, 4086, 6855, (13) , , 3878, 3963, 2653, (5) , 5266, , 6466, 6467, , 3677, 5042, 7974, (11) , 283, 3636, 1472, (20) 1838, 5966, , 7574, , 5242, 617, 1840, (62) 6101, , , , 3971, , , 7810, 3301, 7618, (6) 7084, , 4436, , 4564, 3801, , 221, 3327, 5148, (8) 1629, 3221, , , 2331, 7056, 3464, (5) , , , 1484, , 8441, 823, 3947, (6) 3225, 4246, 7098, D2 6D , 1926, 6274, , , 5354, 4455, , 600, 4315, , 2965, 2966, 3291, (7) , , 1317, 1643, , 2338, , 1895, 1896, Fig. 2 continued % of the SNP loci with population sizes ranging from 96 to 250 fixed lines (Cavanagh et al. 2013). Given the time and resources required for the development of such purpose-built mapping populations, it seems clear that exploiting existing segregating populations in breeding programs provides a cost-efficient alternative for development of dense linkage maps. Furthermore, it should be noted that the breeding populations used in this study were highly interrelated by common founders and the two Wesley-Fhb1 backcross lines were nearly isogenic. A collection of breeding populations derived from a larger number of founders or from a more diverse set of founders would therefore be informative for more loci and thus allow for more extensive linkage mapping than possible in this study.

14 G 7A 7B 7D , 6519, , , , , , , , , 2521, 7610, 2522, (5) 1323, 3851, 6822, , , , , 2513, 7301, 7201, (5) 3760, , , , 3831, 6472, , , 2506, , 6331, , 275, 7205, 3674, (6) , 4032, 7282, 3903, (7) 3754, 7731, 2820, 363, (5) , 6569, 3318, 4168, (22) 5220, 3, 4082, 1871, (28) 1554, 759, 127, 5867, (14) , , 5844, 6208, 208, (6) 4637, , 4063, 1834, 1832, (6) 4639, 7293, 8098, , 3062, , , 2082, , 6629, , 6866, 7430, 8113, (6) , 3668, 8073, , 635, 476, 1946, (36) 1110, 1111, 810, , 7140, 4109, 7549, (5) 593, , 2009, 2012, 2011, (5) 6562, , 8077, , 5489, , 4910, , 7409, , , , 7407, , , 1363, , , 4991, 4994, , , , 2403, , 866, 1223, , , 4176, 1271, 4433, (7) 4137, 4173, 1518, 4177, (11) 737, 795, 179, 6833, (6) 6576, 7005, , , gwm , 2832, , 6901, 1314, 4967, (15) 2894, , 3508, 3506, 3507, (7) , , 306, 3437, 3807, (8) 8021, , , 6414, 4873, , 5661, 6788, 5210, (9) 2353, 4727, 6212, 3114, (22) 1642, 2271 gwm , 8300 gwm297, 3163, 632, 698, (5) gwm , 3112, , , 3986, 6400, 6857, (7) 1420, 1963, 4190, 4191, (16) 4857, , , 4701, 8570, 1345, (6) 1339, 3928, 3423, 7329, (10) , 7403, , 4750, 6246, 4749, (5) , 5564, , 3387, , 3603, 2825, 7780, (5) , , , , D , 1257, 5557, , Fig. 2 continued Correct ordering of marker loci during linkage mapping is a critical factor determining the accuracy and power of subsequent QTL mapping applications (Collard et al. 2009). Therefore, an assortment of breeding populations must not only enable the mapping of a large number of marker loci, but also the accurate ordering of those loci from large genotypic data sets. Evaluation of rank-order correlations