SNP genotyping and linkage map construction of the C417 mapping population

STSM Scientific Report

COST STSM Reference Number: COST-STSM-FA
Reference Code: COST-STSM-ECOST-STSM-FA

Short term scientific mission (STSM) supported by and serving the aims of COST action FA1104: "Sustainable production of high-quality cherries for the European market"

19/01/14 to 01/02/14

Dr Emma Skipper
East Malling Research, New Road, East Malling, Kent, ME19 6BJ, UK

Dr Daniel Sargent (local trainer and supervisor)
Host institution: Fondazione Edmund Mach - Istituto Agrario di San Michele all'Adige, Trento (IT)

1. Purpose of the STSM

The main purpose of the STSM was to construct a genetic linkage map for the C417 (Colney x C210-7) sweet cherry mapping population to enable the identification of QTL associated with fruit quality traits. The objectives of the mission were:

- To be trained in the use of GenomeStudio Genotyping Module software to analyse the SNP data generated.
- To construct a genetic linkage map for the population using JoinMap software.
- To perform QTL analysis for fruit quality traits using MapQTL software.

2. Description of the work carried out during the STSM

2.1 SNP genotyping

Prior to the mission, DNA was extracted from the parental lines and 138 individuals from the C417 mapping population, and sent to IASMA for genotyping with the Illumina Infinium 6K SNP array for cherry. The cherry array comprises 5,696 SNPs, of which 76% and 24% target the sweet and sour cherry genomes respectively (Peace et al., 2012). SNP genotypes were determined using GenomeStudio Genotyping Module software (v1.0, Illumina). All SNPs were manually inspected for appropriateness of clustering, cluster separation, number of clusters and presence of null alleles. Each SNP was also evaluated for genotyping errors and deviation from the expected Mendelian segregation ratios. SNP genotypes were classified as homozygous, heterozygous, failed or difficult to score due to ambiguous clustering. SNPs were classified as 'failed' if they had a high proportion of 'no calls'. Monomorphic SNPs, failed SNPs and SNPs with ambiguous clustering that could not be manually rescored were excluded from the linkage analysis.

2.2 Linkage analysis

Linkage analysis was carried out in JoinMap 4.0 (Van Ooijen, 2006), treating the progeny as a cross-pollinated (CP) population. A genotype matrix was created for all markers, which were coded as heterozygous in one parent (<lm x ll>, <nn x np>) or in both parents (<hk x hk>, <ef x eg>).
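As an illustration of the segregation coding described above, the following is a minimal sketch of how parental genotype calls for a biallelic SNP could be assigned to a segregation class. It is not the procedure used in the mission, where classes were assigned after manual inspection of clusters. The function name, the marker names and the genotype calls are hypothetical, and the sketch assumes that the first parent listed is the one heterozygous in the <lm x ll> class.

```python
from collections import Counter


def segregation_type(parent1: str, parent2: str) -> str:
    """Assign a CP segregation class to a biallelic SNP.

    parent1 and parent2 are two-letter genotype calls such as "AA", "AB"
    or "BB" (allele order within a call is ignored).
    Assumption: parent1 is the first parent of the cross.
    """
    het1 = len(set(parent1)) == 2  # True if parent 1 is heterozygous
    het2 = len(set(parent2)) == 2  # True if parent 2 is heterozygous
    if het1 and het2:
        return "hk x hk"   # heterozygous in both parents
    if het1:
        return "lm x ll"   # heterozygous in the first parent only
    if het2:
        return "nn x np"   # heterozygous in the second parent only
    return "non-segregating"  # both parents homozygous


# Hypothetical parental calls for four illustrative markers.
parental_calls = {
    "snp_1": ("AB", "AA"),
    "snp_2": ("BB", "AB"),
    "snp_3": ("AB", "AB"),
    "snp_4": ("AA", "AA"),
}

classes = {name: segregation_type(p1, p2)
           for name, (p1, p2) in parental_calls.items()}
class_counts = Counter(classes.values())
print(classes)
print(class_counts)
```

The <ef x eg> class is not produced here because it requires three alleles, which a plain biallelic call cannot represent.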
Markers were assigned to groups, based on recombination frequency, using the independence LOD test. Eight groups were identified, corresponding to the haploid chromosome number of cherry (2n = 2x = 16). Linkage maps were constructed for each group in two steps, using the regression mapping algorithm in JoinMap 4.0 with a minimum LOD threshold of three. First, maps were constructed for each parent from the markers heterozygous in one parent only (<lm x ll>, <nn x np>). Integrated parental maps were then constructed by adding the markers that were heterozygous in both parents (<hk x hk>, <ef x eg>). Recombination frequencies were converted into map distances using Kosambi's mapping function. Linkage groups were oriented, and marker order was checked, by comparison with the peach physical genome v1.0 (Verde et al., 2013). MapChart 2.2 (Voorrips, 2002) was used to draw the linkage maps and make group-wise comparisons between maps.
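The conversion from recombination frequency to map distance mentioned above can be sketched as follows. This is a stand-alone illustration of Kosambi's mapping function, d = (1/4) ln((1 + 2r) / (1 - 2r)) with d in Morgans, and not a reproduction of JoinMap's internal calculation; the function names are chosen here for illustration.

```python
import math


def kosambi_cm(r: float) -> float:
    """Convert a recombination frequency r (0 <= r < 0.5) to centiMorgans."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must satisfy 0 <= r < 0.5")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse function: convert a distance in cM back to r."""
    # r = 0.5 * tanh(2d), with d expressed in Morgans.
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


for r in (0.01, 0.10, 0.30):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For small r the distance in cM is close to 100 x r, and it grows faster than r as r approaches 0.5, which reflects the function's allowance for interference between crossovers.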

3. Description of the main results obtained

3.1 Marker polymorphism

In total, 826 (14.5%) of the 5,696 SNP markers on the cherry array were found to segregate in the population; the remaining SNPs were classified as monomorphic (81.8%) or failed (3.7%) (Table 1).

Table 1. Number of monomorphic, polymorphic and failed SNPs for the population.
Monomorphic | Failed | Polymorphic | Total

Of the SNPs segregating in the population, 71.0% (586/826) were heterozygous in one parent (<lm x ll>, <nn x np>) and 26.6% (220/826) were heterozygous in both parents (<hk x hk>, <ef x eg>). A further 2.4% (20/826) of markers had ambiguous clustering (AC), could not be manually rescored and were removed prior to linkage analysis (Figure 1).

[Figure 1: pie charts. Segregation classes: lm x ll, nn x np, hk x hk, ef x eg, AC (segment values 0.2%, 2.4%, 26.4%, 28.5% and 42.5%). Linkage outcome: Mapped 94%, Unlinked 6%.]

Figure 1. Percentages of heterozygous SNPs corresponding to different segregation classes and the percentage of markers that were mapped or remained unlinked after linkage analysis.

3.2 Linkage map construction

Linkage analysis was carried out in JoinMap 4.0 (Van Ooijen, 2006) on the 806 polymorphic markers in the population. Initially, 797 (99%) markers were grouped into eight linkage groups, and a total of 748 (94%) markers were mapped (Figure 1; Table 2). The parent maps, constructed from markers segregating as heterozygous in one parent (<lm x ll>, <nn x np>), contained eight linkage groups and covered (Colney) and cM (C210-7) respectively. The number of markers in each linkage group ranged from 14 (Colney, LG1) to 79 (C210-7, LG6) and the average marker distance was 2.6 and 2.1 cM respectively (Table 3).

Table 2. The number and class of SNP markers mapped in the population.
LG | Marker class: lm x ll | nn x np | hk x hk | Total
(rows: one per linkage group, followed by Total)

Segregation distortion (P < 0.01) of markers was observed in both parent maps: 5.2% (12/230) in Colney and 17.4% (59/339) in C210-7. Skewed markers that deviated from the expected Mendelian ratio of 1:1 were located on LG2 (27.5 cM Colney, cM C210-7) and LG6 ( cM Colney, cM C210-7) for both lines, as well as on LG5 in C210-7 ( cM).

Table 3. The number of markers mapped, map size and marker density of each parent map and the integrated parental maps.
Columns: LG1 | LG2 | LG3 | LG4 | LG5 | LG6 | LG7 | LG8 | Total
Number of mapped markers: Colney; Colney (integrated); C210-7; C210-7 (integrated)
Linkage group length (cM): Colney; Colney (integrated); C210-7; C210-7 (integrated)
Average marker distance (cM): Colney; Colney (integrated); C210-7; C210-7 (integrated)

Integrated maps constructed for each parent, by including markers that were segregating in the population as heterozygous in both parents (<hk x hk> and <ef x eg>), had a total map distance of (Colney) and cM (C210-7), with an average marker distance of 1.5 and 1.3 cM respectively (Table 3). The number of markers mapped in each linkage group ranged from 41 (Colney, LG3) to 95 (C210-7, LG6). In total, 99 skewed markers (13.2%; 99/748) were mapped across the two integrated parental maps. As with the parent maps, segregation distortion (P < 0.01) was found on LG2 ( cM Colney, cM C210-7), LG5 ( cM C210-7) and LG6 ( cM Colney, cM C210-7).
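A test for segregation distortion of the kind reported above can be sketched as a chi-square goodness-of-fit test against the expected Mendelian ratio. The sketch below is an assumption about the general approach rather than a description of JoinMap's implementation. The genotype counts are hypothetical (chosen to sum to the 138 progeny genotyped), and a marker is treated as distorted when its p-value falls below 0.01. Only one or two degrees of freedom are supported, which covers the 1:1 and 1:2:1 classes.

```python
import math


def chi_square_test(observed, expected_ratio):
    """Chi-square goodness-of-fit test against a Mendelian ratio.

    observed:       genotype class counts, e.g. (85, 53)
    expected_ratio: expected ratio, e.g. (1, 1) or (1, 2, 1)
    Returns (chi-square statistic, p-value).
    """
    n = sum(observed)
    total = sum(expected_ratio)
    expected = [n * w / total for w in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    if df == 1:
        # Chi-square survival function for 1 degree of freedom.
        p = math.erfc(math.sqrt(chi2 / 2.0))
    elif df == 2:
        # Chi-square survival function for 2 degrees of freedom.
        p = math.exp(-chi2 / 2.0)
    else:
        raise ValueError("only 2 or 3 genotype classes are supported")
    return chi2, p


# Hypothetical counts for three markers scored in 138 progeny.
markers = {
    "balanced_1to1": ((70, 68), (1, 1)),
    "skewed_1to1": ((85, 53), (1, 1)),
    "balanced_1to2to1": ((34, 70, 34), (1, 2, 1)),
}

for name, (counts, ratio) in markers.items():
    chi2, p = chi_square_test(counts, ratio)
    status = "distorted" if p < 0.01 else "as expected"
    print(f"{name}: chi2 = {chi2:.2f}, p = {p:.4f} ({status})")
```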

3.3 Quantitative Trait Loci (QTL) analysis

The maps created as part of this STSM were used to perform QTL analysis in MapQTL 6.0 (Van Ooijen, 2009), with phenotypic data collected in 2013, for the identification of QTL associated with fruit quality traits. Due to time constraints, only an initial analysis was carried out, resulting in the identification of three QTL associated with flesh colour, fruit weight and flavour (data not presented).

4. Future collaboration with the host institution

Additional work streams have been identified to investigate candidate genes responsible for the segregation distortion observed in LG6. Detailed QTL analysis will be completed, as well as expression analysis, to confirm the correct identification of a candidate gene involved in controlling flesh colour, identified from the initial QTL analysis carried out in this population.

5. Foreseen publications/articles resulting from the STSM

Publications describing the linkage map and candidate genes for the segregation distortion observed on LG6, as well as a paper describing the gene involved in controlling flesh colour, are planned once additional work has been completed.

6. Confirmation by the host institution of the successful execution of the STSM

(see attached document)

7. Acknowledgments

This scientific mission was supported by COST action FA1104. I would like to express my gratitude to Francesco Paolo Marra and José Quero-Garcia for approving my application. I would also like to thank Daniel Sargent for hosting my visit, for providing excellent training and for sharing his expertise.

8. References

Illumina Inc. (2010a) GenomeStudio Genotyping Module v1.0, User Guide. Illumina Inc., Towne Centre Drive, San Diego, CA, USA.

Peace et al. (2012) Development and evaluation of a genome-wide 6K SNP array for diploid sweet cherry and tetraploid sour cherry. PLoS ONE 7(12): e48305. doi: /journal.pone

Van Ooijen, J. W. (2006) JoinMap (R) 4, Software for the calculation of genetic linkage maps in experimental populations. Kyazma B. V., Wageningen, Netherlands.

Van Ooijen, J. W. (2009) MapQTL (R) 6, Software for the mapping of quantitative trait loci in experimental populations of diploid species. Kyazma B. V., Wageningen, Netherlands.

Verde et al. (2013) The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nature Genetics 45:

Voorrips, R. E. (2002) MapChart: Software for the graphical presentation of linkage maps and QTLs. The Journal of Heredity 93(1):