Molecular markers in plant breeding

Jumbo MacDonald et al., Maize Breeders Course, Palace Hotel, Arusha, Tanzania, 4 Sep to 16 Sep 2016

Key areas to look at: molecular markers; QTL mapping; association mapping (GWAS); genomic selection; mapping populations; linkage; linkage disequilibrium.

Genetic Markers
Genetic markers represent genetic differences between organisms. They are neutral sites of variation at the DNA sequence level that act as signs or flags. Markers in close proximity to genes can be referred to as gene tags. Such markers do not themselves affect the phenotype of the trait of interest, because they are only located near, or linked to, the genes controlling the trait. All genetic markers occupy specific genomic positions within chromosomes.

Types of genetic markers
There are three major types of genetic markers:
morphological (also "classical" or "visible") markers, which are themselves phenotypic traits or characters;
biochemical markers, which include allelic variants of enzymes called isozymes; and
DNA (or molecular) markers, which reveal sites of variation in DNA.
Morphological markers are usually visually characterized phenotypic characters such as flower colour, seed shape, growth habit or pigmentation. Isozyme markers are differences in enzymes that are detected by electrophoresis and specific staining. The major disadvantages of morphological and biochemical markers are that they may be limited in number and are influenced by environmental factors or the developmental stage of the plant.

DNA Markers
DNA markers are the most widely used type of marker, predominantly due to their abundance. They arise from different classes of DNA mutations, such as substitution mutations (point mutations), rearrangements (insertions or deletions), or errors in replication of tandemly repeated DNA. Unlike morphological and biochemical markers, DNA markers are practically unlimited in number and are not affected by environmental factors or the developmental stage of the plant.

DNA markers
Apart from their use in the construction of linkage maps, DNA markers have numerous applications in plant breeding, such as assessing the level of genetic diversity within germplasm and confirming cultivar identity. DNA markers may be broadly divided into three classes based on the method of their detection: hybridization-based, polymerase chain reaction (PCR)-based, and sequence-based.

DNA markers
DNA markers may reveal genetic differences that can be visualized by gel electrophoresis and staining with chemicals (ethidium bromide or silver), or by detection with radioactive or colorimetric probes. DNA markers are particularly useful if they reveal differences between individuals of the same or different species. Such markers are called polymorphic markers, whereas markers that do not discriminate between genotypes are called monomorphic markers.

DNA markers
Polymorphic markers may also be described as codominant or dominant. This description is based on whether markers can discriminate between homozygotes and heterozygotes (Figure 1). Codominant markers indicate differences in size, whereas dominant markers are either present or absent. The different forms of a DNA marker (e.g. different-sized bands on gels) are called marker alleles. Codominant markers may have many different alleles, whereas a dominant marker has only two alleles.

Dominant and Co-dominant Markers
Comparison between (a) codominant and (b) dominant markers. Codominant markers can clearly discriminate between homozygotes and heterozygotes, whereas dominant markers cannot. Genotypes at two marker loci (A and B) are indicated below the gel diagrams.
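The difference between the two marker classes can be sketched in a few lines of Python. This is only an illustration: the allele names `A1`/`A2` and the band sizes are hypothetical, and the dominant marker is assumed to amplify only the `A1` allele.

```python
# Hypothetical band sizes (bp) for two alleles at one marker locus.
ALLELE_BANDS = {"A1": 142, "A2": 178}


def codominant_bands(genotype):
    """Band sizes visible on a gel for a codominant marker.

    Heterozygotes show two bands, homozygotes show one.
    """
    return sorted({ALLELE_BANDS[allele] for allele in genotype})


def dominant_band_present(genotype, amplified_allele="A1"):
    """A dominant marker only shows presence or absence of one band."""
    return amplified_allele in genotype


for genotype in [("A1", "A1"), ("A1", "A2"), ("A2", "A2")]:
    print(genotype, codominant_bands(genotype), dominant_band_present(genotype))
```

With the codominant marker the three genotypes give three distinct band patterns; with the dominant marker the `A1/A1` homozygote and the heterozygote both score as "present" and cannot be told apart.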

Types of Markers
Hybridization-based molecular markers: RFLP is the most widely used hybridization-based molecular marker. The procedure involves:
a) Digestion of the DNA with one or more restriction enzyme(s).
b) Separation of the restriction fragments in agarose gel.
c) Transfer of separated fragments from agarose gel to a filter by Southern blotting.
d) Detection of individual fragments by nucleic acid hybridization with a labeled probe(s).

RFLPs
Restriction enzymes (endonucleases) are bacterial enzymes (e.g., MseI, EcoRI, PstI) that recognize specific four, six or eight base pair (bp) sequences in DNA and cleave double-stranded DNA wherever these sequences occur. For example, EcoRI has a six-bp recognition sequence and cuts between G and A wherever the sequence 5'-GAATTC-3' (3'-CTTAAG-5' on the complementary strand) occurs. The choice of enzyme depends on the resolution needed.
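A minimal sketch of how a single point mutation in a recognition site changes the fragment pattern, which is the basis of an RFLP. The two sequences are made up for illustration, the molecule is assumed to be linear, and `digest` is a hypothetical helper rather than part of any library.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset=1 places the cut between the G and the A of GAATTC (EcoRI).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Two hypothetical alleles; the second has lost one EcoRI site (GAATTC -> GAGTTC).
allele_1 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
allele_2 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

print(digest(allele_1))  # three fragments
print(digest(allele_2))  # two fragments, one of them longer
```

The lost site merges two fragments into one longer fragment, so the two alleles give different band sizes after electrophoresis and hybridization.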

RFLPs
The restriction fragments are then separated by electrophoresis through agarose or polyacrylamide gels. The choice between agarose and polyacrylamide depends on the restriction enzymes chosen. Four-cutters produce fragments too small to be resolved by agarose gels, so polyacrylamide gels are required. Conversely, polyacrylamide gels cannot normally be used to resolve the fragments produced by six-cutters, so agarose gels must be used. These considerations have led most workers to use six-cutter enzymes, as agarose gels are much easier to handle.

PCR-based Markers
The various PCR-based techniques are of two types, depending on the primers used for amplification:
1) Arbitrary or semi-arbitrary primed PCR techniques, developed without prior sequence information (e.g., AP-PCR, DAF, RAPD, AFLP, ISSR).
2) Site-targeted PCR techniques, developed from known DNA sequences (e.g., EST, CAPS, SSR, SCAR, STS).

Types of Markers
A number of factors need to be considered in choosing one or more of the various molecular marker types:
marker system availability;
informativeness of the marker (polymorphism information content, PIC);
simplicity of the technique and time available;
anticipated level of polymorphism in the population;
quantity and quality of DNA available;
transferability between laboratories, populations, pedigrees and species;
size and structure of the population to be studied;
availability of adequate skills and equipment;
cost per data point and availability of sufficient funding;
marker inheritance (dominant versus codominant) and the type of genetic information sought in the population.
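The slide names PIC without defining it. As a sketch, the commonly used definition is PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are the allele frequencies at the marker. The function name and the example frequencies below are illustrative assumptions.

```python
def pic(allele_freqs):
    """Polymorphism information content for one marker.

    allele_freqs: allele frequencies at the locus, summing to 1.
    """
    n = len(allele_freqs)
    sum_sq = sum(p * p for p in allele_freqs)
    cross = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - sum_sq - cross


print(pic([1.0]))                     # monomorphic marker: not informative
print(pic([0.5, 0.5]))                # biallelic marker (e.g. a SNP) at its most informative
print(pic([0.25, 0.25, 0.25, 0.25]))  # multi-allelic marker (e.g. an SSR)
```

Under this definition a biallelic marker cannot exceed a PIC of 0.375, which is one reason multi-allelic SSRs are often more informative per locus than individual SNPs.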

Marker type by application
Foreground markers: In foreground selection, the breeder selects plants having the marker allele of the donor parent at the target locus. The objective is to maintain the target locus in a heterozygous state (one donor allele and one recurrent parent allele) until the final backcross is completed. The selected plants are then self-pollinated, and progeny plants that are homozygous for the donor allele are identified.

Marker type by application
Background markers: In background selection, the breeder selects for recurrent parent marker alleles in all genomic regions except the target locus, and the target locus is selected based on phenotype. Background selection is important in order to eliminate potentially deleterious genes introduced from the donor. So-called "linkage drag", the inheritance of unwanted donor alleles in the same genomic region as the target locus, is difficult to overcome with conventional backcrossing but can be addressed efficiently with the use of markers.
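To make the value of background selection concrete, here is a small sketch. Without any marker selection, the expected share of recurrent parent genome after n backcrosses is 1 - (1/2)^(n+1); background markers let the breeder pick plants that exceed this expectation. The `R`/`H` call coding and the plant names are hypothetical.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent parent genome share after n backcrosses, no selection."""
    return 1 - 0.5 ** (n_backcrosses + 1)


def recurrent_genome_share(background_calls):
    """Estimate the recurrent parent share from background marker calls.

    background_calls: string of 'R' (homozygous recurrent) and 'H' (heterozygous)
    calls, one per background marker (hypothetical coding).
    """
    score = sum(1.0 if call == "R" else 0.5 for call in background_calls)
    return score / len(background_calls)


print([expected_recurrent_fraction(n) for n in (1, 2, 3)])

# Hypothetical BC1 plants that all carry the donor allele at the target locus.
bc1_plants = {"plant_1": "RRHHRHRH", "plant_2": "RRRRRHRR", "plant_3": "HHHRHHRH"}
best = max(bc1_plants, key=lambda name: recurrent_genome_share(bc1_plants[name]))
print(best, recurrent_genome_share(bc1_plants[best]))
```

In this toy example the best BC1 plant already carries more recurrent parent genome than the unselected BC2 average, which is how background selection can save backcross generations.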

Polymorphic alleles
Alleles: 142, 170, 178, 184, 193
Genotype  Class
142/178   Heterozygote
142/170   Heterozygote
170/170   Homozygote
178/178   Homozygote
184/184   Homozygote
193/193   Homozygote
Adapted from Dr. Kassa

SSRs
Code       nc130    phi014   phi029   phi031
TL2012-1   139:142  428:431  152:152  221:221
TL2012-2   139:139  428:428  152:152  185:189
TL2012-3   142:142  431:431  148:148  185:189
TL2012-4   139:139  428:428  148:148  185:191
TL2012-5   139:139  428:428  148:148  185:191
TL2012-6   139:139  428:428  148:148  185:189
TL2012-7   139:139  428:428  148:148  185:191
TL2012-8   139:139  428:428  148:152  189:221
TL2012-9   139:139  428:428  148:154  185:221
TL2012-10  139:142  428:428  148:148  187:191
TL2012-11  139:142  428:431  148:148  185:189
TL2012-12  139:139  428:428  148:148  185:191
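SSR calls in this `allele:allele` form are easy to summarize. The sketch below takes the nc130 calls for TL2012-1 to TL2012-12 from the data above and computes the observed heterozygosity and allele frequencies; the function names are illustrative.

```python
from collections import Counter

# nc130 calls for TL2012-1 ... TL2012-12, copied from the SSR data above.
NC130 = [
    "139:142", "139:139", "142:142", "139:139", "139:139", "139:139",
    "139:139", "139:139", "139:139", "139:142", "139:142", "139:139",
]


def observed_heterozygosity(calls):
    """Share of individuals whose two alleles differ."""
    pairs = [call.split(":") for call in calls]
    return sum(a != b for a, b in pairs) / len(pairs)


def allele_frequencies(calls):
    """Frequency of each allele size across all allele copies."""
    counts = Counter(allele for call in calls for allele in call.split(":"))
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


print(observed_heterozygosity(NC130))
print(allele_frequencies(NC130))
```

Because SSRs are codominant, heterozygotes such as 139:142 are scored directly, which is what makes this kind of quality-control summary possible.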

SNP
SubjectID  an1_5  csu1171_2  d8_2  d8_3  lac1_3  PHM10621_29  PHM11985_27  PHM12904_7  PHM12979_9  PHM13020_10
2171       G:G    G:A        A:A   A:A   G:A     G:C          G:A          G:G         A:A         G:A
3158       G:A    A:A        A:A   A:A   A:A     G:C          ?            G:A         A:A         G:A
Fam18-39   G:G    A:A        A:A   A:A   A:A     G:G          A:A          G:G         A:A         G:G
Fam20-27   G:G    G:A        A:A   A:A   G:A     G:G          A:A          G:G         A:A         A:A
48         A:A    G:G        A:A   A:A   G:G     C:C          A:A          G:G         A:A         A:A
H16        G:A    G:A        A:A   A:A   G:A     C:C          G:A          G:G         G:A         A:A
Fam7-11    G:G    A:A        A:A   A:A   A:A     G:G          A:A          G:G         A:A         G:G
Fam16-26   G:G    G:A        G:A   G:A   G:A     G:G          A:A          G:G         A:A         ?
839        G:A    G:A        A:A   A:A   G:A     G:C          G:A          G:A         A:A         A:A
3350       G:G    A:A        A:A   A:A   G:A     G:G          A:A          G:G         A:A         G:A
Fam16-19   G:G    A:A        ?     G:G   A:A     G:G          A:A          G:G         A:A         A:A
Fam11-25   G:G    A:A        A:A   A:A   A:A     G:G          A:A          G:G         A:A         G:G
2441       G:A    G:A        G:A   G:A   A:A     G:C          A:A          G:G         A:A         A:A

GBS SNP calls - Lots of Missing Data
rs#        alleles  chrom  pos    strand  RIL_1  RIL_10  RIL_100  RIL_101  RIL_102  RIL_103  RIL_104  RIL_105
S10_13181  T/G      10     13181  +       T      T       T        T        N        T        N        T
S10_13355  T/C      10     13355  +       T      T       T        N        T        N        T        T
S10_15605  A/G      10     15605  +       N      A       N        A        A        A        A        N
S10_15607  A/G      10     15607  +       N      A       N        A        A        A        A        N
S10_15619  A/G      10     15619  +       N      A       N        A        A        A        A        N
S10_15629  G/A      10     15629  +       G      G       G        G        G        G        G        G
S10_15685  C/G      10     15685  +       C      N       C        C        C        N        C        C
S10_15687  G/A      10     15687  +       G      N       G        G        G        N        G        G
S10_15699  G/C      10     15699  +       G      G       G        G        G        N        G        G
S10_15720  A/G      10     15720  +       N      N       N        N        N        N        N        A
S10_15721  G/C      10     15721  +       N      N       N        N        N        N        N        G
S10_15722  A/T      10     15722  +       N      N       N        N        N        N        N        N
S10_15723  T/C      10     15723  +       N      N       N        N        N        N        N        T
S10_16315  G/T      10     16315  +       N      G       G        G        G        N        N        G
S10_16419  A/G      10     16419  +       A      A       A        A        N        N        A        A
S10_16432  C/G      10     16432  +       C      C       C        C        N        N        C        C
S10_16439  A/G      10     16439  +       N      N       N        N        N        N        N        N
S10_16497  C/A      10     16497  +       C      C       C        N        C        C        N        N
S10_16498  A/G      10     16498  +       A      A       A        N        A        A        N        N
S10_16499  C/A      10     16499  +       C      C       C        N        C        C        N        N
S10_16573  T/A      10     16573  +       T      T       N        N        T        T        T        T
S10_17505  G/A      10     17505  +       G      G       N        G        G        G        G        G
S10_17518  G/T      10     17518  +       G      N       G        G        G        G        G        G
S10_17528  G/C      10     17528  +       G      G       N        G        G        G        G        G
S10_17533  C/G      10     17533  +       C      N       C        C        C        C        C        C
S10_17550  G/C      10     17550  +       G      N       G        G        G        G        G        G
S10_17551  A/C      10     17551  +       A      N       A        A        A        A        A        A
S10_17591  G/C      10     17591  +       G      G       G        G        G        G        N        G
S10_17593  T/C      10     17593  +       T      T       T        T        T        T        N        T
S10_17613  A/G      10     17613  +       A      A       A        A        A        A        N        A
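Missing calls (`N`) are usually handled by filtering markers before imputation or analysis. The sketch below uses a few rows from the GBS calls above; the 50% missing-data threshold and the function names are assumptions chosen for illustration, not values given in the course.

```python
# Calls for RIL_1 ... RIL_105 at a few markers, copied from the GBS data above.
CALLS = {
    "S10_13181": "TTTTNTNT",
    "S10_15629": "GGGGGGGG",
    "S10_15720": "NNNNNNNA",
    "S10_15722": "NNNNNNNN",
    "S10_16315": "NGGGGNNG",
}


def missing_rate(calls):
    """Share of lines with a missing call (N) at one marker."""
    return calls.count("N") / len(calls)


def filter_markers(table, max_missing=0.5):
    """Keep markers whose missing rate does not exceed max_missing."""
    return [name for name, calls in table.items() if missing_rate(calls) <= max_missing]


for name, calls in CALLS.items():
    print(name, missing_rate(calls))
print(filter_markers(CALLS))
```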

What do we use the genotypic data for?
Diversity studies; quality control; QTL mapping; association mapping (GWAS); marker deployment; marker-assisted backcrossing; forward breeding; marker-assisted recurrent selection; genomic selection.

Mapping

Mapping populations
Segregating populations: F2s, F3s and BCs (temporary);
Recombinant inbred lines (RILs) (permanent);
Doubled haploid lines (permanent);
Nested association mapping (NAM) panels;
Multi-parent advanced generation intercross (MAGIC) populations.

Linkage Analysis
Single-marker analysis (also "single-point analysis") is the simplest method for detecting QTLs associated with single markers.
The simple interval mapping (SIM) method makes use of linkage maps and analyses intervals between adjacent pairs of linked markers along chromosomes simultaneously, instead of analyzing single markers.
Composite interval mapping (CIM) has become popular for mapping QTLs. This method combines interval mapping with linear regression and includes additional genetic markers in the statistical model, in addition to the adjacent pair of linked markers used for interval mapping.
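Single-marker analysis can be sketched as a simple regression of phenotype on marker genotype. The example below is an assumption-laden illustration: genotypes are coded as 0, 1 or 2 copies of one allele, the phenotypes are invented, and the LOD score is computed with the usual regression expression LOD = (n/2) * log10(RSS0 / RSS1), where RSS0 and RSS1 are the residual sums of squares without and with the marker.

```python
import math


def single_marker_scan(genotypes, phenotypes):
    """Regress phenotype on marker genotype; return (slope, LOD)."""
    n = len(phenotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotypes))
    slope = sxy / sxx                                  # additive effect per allele copy
    rss0 = sum((y - mean_y) ** 2 for y in phenotypes)  # model without the marker
    rss1 = rss0 - slope * sxy                          # model with the marker
    lod = (n / 2) * math.log10(rss0 / rss1)
    return slope, lod


# Hypothetical data: nine lines scored at one marker and for one trait.
marker = [0, 0, 0, 1, 1, 1, 2, 2, 2]
trait = [4.1, 3.9, 4.3, 5.0, 5.2, 4.8, 6.1, 5.9, 6.2]
slope, lod = single_marker_scan(marker, trait)
print(round(slope, 3), round(lod, 2))
```

Interval mapping and CIM extend the same idea by testing positions between markers and, for CIM, by adding cofactor markers to the model to absorb the effects of other QTLs.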

[Figure: Composite interval mapping. LOD score profile (threshold LOD 3.0) along linkage group 1, with markers M1 to M9, a 2 cM scanning interval, and cofactor markers indicated.]

QTL analysis

GENOMIC SELECTION (GS)

Marker Assisted Selection (MAS)
Benefits of MAS:
Higher genetic gain per unit time;
Increased reliability, since markers are not affected by environmental factors;
Increased efficiency, since traits expressed late in development can be scored earlier;
Reduced costs (?), for example in the case of multi-environment trials.

Marker Assisted Breeding (MAB)
Benefits of MAB:
Reduced linkage drag (marker-assisted backcrossing);
Gene pyramiding (e.g. resistance genes);
Marker-assisted breeding of polygenic traits (keeping track of all genes involved in complex traits);
Introduction of novel characters (backcrossing);
Effective exploitation of exotic germplasm.

How does QTL-based marker-assisted selection work?
QTLs are localized to marker intervals and their effect sizes are estimated. QTLs are then ranked by effect size, and those with the largest effects are declared significant.

QTL-based marker-assisted selection from the breeder's perspective: has it delivered the desired results?
Precision problems in estimating QTL position and genetic effects, with false positives and false negatives;
Only a limited proportion of the total genetic variance is captured by the markers;
Bias of estimated effects (overestimation of selected effects, the "Beavis effect");
Effects too small for detection, so some variation is ignored;
These problems often lead to a poor response to selection.

Genomic Selection (GS): Concept
GS is based on the use of high-density markers. GS differs from QTL-based breeding approaches in that it uses all markers to predict performance, expressed as the genomic estimated breeding value (GEBV).

Utilization of GS and its benefits
GS can increase genetic gain by reducing cycle time;
It reduces phenotyping cost by predicting GEBVs of untested lines;
It allows the bulk of lines to be filtered in stage 1 trials before advancing them to the next level;
It captures variation more accurately by including alleles with minor effects in addition to alleles with major effects.

Genomic Selection (GS)
Two steps:
1. Estimation of the effects of chromosome segments in a reference population.
2. Prediction of genomic estimated breeding values (GEBVs) for individuals not in the reference population (selection candidates).
The underlying assumption is that QTL are in linkage disequilibrium (LD) with a marker or a haplotype of markers.

LD: non-random association of alleles at different loci, commonly measured as r²
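For two biallelic loci, r² can be computed from haplotype frequencies as r² = D² / (pA(1 - pA) pB(1 - pB)), with D = pAB - pA * pB. The sketch below assumes phased haplotypes written as two-character strings; the example haplotypes are hypothetical.

```python
def ld_r2(haplotypes, allele_a="A", allele_b="B"):
    """r^2 between two biallelic loci from phased two-locus haplotypes."""
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b  # deviation from independent association
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


complete_ld = ["AB"] * 5 + ["ab"] * 5      # alleles always travel together
no_ld = ["AB", "Ab", "aB", "ab"]           # all combinations equally frequent
partial_ld = ["AB"] * 4 + ["ab"] * 4 + ["Ab", "aB"]

print(ld_r2(complete_ld), ld_r2(no_ld), ld_r2(partial_ld))
```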

Genomic Selection (GS)
1. In a training population (both genotypic and phenotypic data available), fit a large number of markers as random effects in a linear model to estimate all genetic effects simultaneously for a quantitative trait. The aim is to capture all of the additive genetic variance due to alleles with both large and small effects on the trait.
2. In a breeding population (only genotypic data available), use the estimates of marker effects to predict breeding values and select individuals with the best GEBVs.

GS: Predicting Using Many Markers
[Figure: workflow from breeding material, to genotyping, to calculating GEBV, to making selections. Meuwissen et al. 2001. Genetics 157:1819-1829]

Summary of GS Scheme
[Figure: two linked cycles. Model training cycle: phenotype lines (which have already been genotyped), train the prediction model, obtain an updated model. Genomic selection line development cycle: make crosses and advance generations, genotype new germplasm, advance lines with the highest GEBV, test varieties and release. Lines informative for model improvement are advanced into the training cycle. Heffner, E.L. et al. 2009. Genomic Selection for Crop Improvement. Crop Science 49:1-12]

Genotyping by Sequencing (GBS)
1. DNA extraction
2. Sequencing (GBS)
3. Allele calls
4. SNPs
5. Imputation (depending on statistical model)
6. Statistical models
7. Analysis

Statistical Models for GS
Linear mixed models and Bayesian estimation: many QTL effects, set as random effects, can be estimated simultaneously.
Simple basic model: Y = 1μ + Zg + e
Y = data vector
1 = vector of ones (n = number of records)
Z = design matrix
g = genetic effects to be estimated
e = vector of residuals

Genomic Selection (GS): Linear Models
Ridge regression BLUP (RR-BLUP) assumes equal variance for all marker effects. It overcomes the problem of over-estimation of segment effects by shrinking the estimates towards the mean. Its drawback is that it treats effects equally across all loci, whereas in fact many markers have negligible effects. However, ridge regression may still perform reasonably well for estimating genomic breeding values, because the effects are accumulated across many segments.
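A minimal pure-Python sketch of ridge regression for the model Y = 1μ + Zg + e follows. It is an illustration under stated assumptions, not the software used in the course: the marker matrix and phenotypes are invented, markers are coded 0/1/2, the shrinkage parameter `lam` is supplied by hand (in RR-BLUP it corresponds to the ratio of residual variance to marker-effect variance), and `solve`, `rr_blup` and `gebv` are hypothetical helper names.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rr_blup(Z, y, lam):
    """Estimate mu and marker effects g from (Z'Z + lam*I) g = Z'y on centered data."""
    n, p = len(Z), len(Z[0])
    mu = sum(y) / n
    means = [sum(row[j] for row in Z) / n for j in range(p)]
    zc = [[row[j] - means[j] for j in range(p)] for row in Z]
    yc = [value - mu for value in y]
    lhs = [[sum(zc[i][a] * zc[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    rhs = [sum(zc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    return mu, means, solve(lhs, rhs)


def gebv(z_new, means, g):
    """Genomic estimated breeding value of one genotyped candidate."""
    return sum((z - m) * effect for z, m, effect in zip(z_new, means, g))


# Hypothetical training population: six lines, three markers, one trait.
Z = [[2, 0, 1], [0, 2, 1], [1, 1, 0], [2, 2, 2], [0, 0, 0], [1, 0, 2]]
y = [5.1, 3.2, 4.0, 5.0, 3.1, 4.4]
mu, means, g = rr_blup(Z, y, lam=1.0)
print(g)
print(gebv([2, 0, 2], means, g))  # candidate with genotype data only
```

Increasing `lam` shrinks all marker effects more strongly towards zero, which is the "equal variance" behaviour described above.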

Genomic Selection (GS): Linear Models
Bayesian methods allow a different variance for each marker. When estimating the effects of haplotypes (or single markers) within chromosome segments, they capture the prior knowledge that some chromosome segments contain QTL of large effect, some contain QTL of moderate to small effect, and some contain no QTL at all.

Genomic Selection (GS): Linear Models
Bayesian shrinkage regression: Bayes A (Meuwissen et al.). Assumption: marker variances follow an inverse chi-square distribution.
Bayesian variable selection: Bayes B, Bayes Cπ. Assumptions: marker variances follow an inverse chi-square distribution, and some marker effects are zero.
Mark E. Sorrells, Jessica Rutkoski, Elliot Heffner and Long-Xi Yu

Genomic Selection (GS): Statistical Models
Kernel regression and reproducing kernel Hilbert spaces (RKHS) regression: the parameters control the complexity of the distribution of the QTL effects (Gianola et al.). Model performance is based on the correlation between GEBV and true breeding value (TBV).

Genomic Selection (GS): Statistical Models
The G-BLUP method is equivalent to RKHS regression with a linear kernel.
Equal variance for marker effects;
Model performance is based on the correlation between GEBV and true breeding value (TBV);
Uses genotypic data to build the G matrix (genomic relationship matrix) used for prediction;
No need for imputation of genotypic data.
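The slides do not say how the G matrix is built. One widely used construction, assumed here for illustration, is VanRaden's first method: G = WW' / (2 * sum(p_j(1 - p_j))), where W is the 0/1/2 marker matrix with twice the allele frequency subtracted from each column. The sketch below assumes complete genotype data and uses invented genotypes.

```python
def genomic_relationship(M):
    """Genomic relationship matrix from a lines x markers matrix coded 0/1/2.

    Assumes no missing calls; allele frequencies are estimated from M itself.
    """
    n, m = len(M), len(M[0])
    p = [sum(row[j] for row in M) / (2 * n) for j in range(m)]
    W = [[row[j] - 2 * p[j] for j in range(m)] for row in M]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    return [[sum(W[i][k] * W[j][k] for k in range(m)) / scale for j in range(n)]
            for i in range(n)]


# Hypothetical genotypes: lines 1 and 2 are identical, line 3 carries opposite alleles.
M = [[0, 2, 1], [0, 2, 1], [2, 0, 1], [1, 1, 1]]
G = genomic_relationship(M)
for row in G:
    print([round(value, 3) for value in row])
```

Lines with identical marker profiles get the same relationship with each other as with themselves, while lines carrying opposite alleles get a negative relationship. G-BLUP then uses G in place of a pedigree relationship matrix.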

Proof of Concept Experiments in Maize: GBS
2,300 S4 lines were genotyped and their testcrosses phenotyped;
The testcross trials were from 2007, 2008, 2009, 2010, 2011 and 2012;
Phenotypic testcross data from 154 trials was assembled;
700 SYNF2 lines (Group A & B) have been genotyped, and their testcrosses are being phenotyped;
19 bi-parental populations.

Proof of Concept Experiments in Maize: GBS
For stage 1 and 2 testcrosses, we are trying to analyze:
within tester;
within management (optimal, managed drought, managed low nitrogen and random drought);
stage 1 predicting stage 2;
cross validations within trials.
For the 19 bi-parental populations, we are trying Bayesian models.

G-BLUP CROSS VALIDATIONS IN MANAGED DROUGHT TRIALS
Columns: trial name; number of lines in the trial (validation set); number of lines in the training set; correlation in the training set; correlation in the validation set; H²; validation correlation divided by sqrt(H²).
Trial name             N_val  N_train  r_train  r_val   H²       r_val/sqrt(H²)
ILS2-TC-2-1            74     1435     0.8658   0.3516  0.39373  0.560337
ILS2-TC-3-1            64     1445     0.8661   0.037   0.11492  0.109145
ILS2-TC-4-1            85     1424     0.8694   0.1734  0.46992  0.252951
ILS2-TC-5-1            84     1425     0.863    0.0935  0.17252  0.225108
3WHYB-2010-15-1        183    1326     0.8528   0.131   0.70445  0.15608
TK-LXT-1-7             87     1422     0.8655   0.1066  0.37751  0.173497
DTMA-MARS-EVALTC-01-1  182    1327     0.8607   0.0581  0.05704  0.243269
DTMA-MARS-EVALTC-02-1  198    1311     0.8789   0.1982  0.28514  0.371171
DTMA-MARS-EVALTC-03-1  225    1284     0.8764   0.1425  0.29932  0.260464
EIHYB-2011-1-3         73     1436     0.8654   0.0998  0.33892  0.171428
EIHYB-2011-2-3         69     1440     0.8669   0.293   0.60457  0.376829
EIHYB-2011-3-3         72     1437     0.8678   0.1872  0.35835  0.312717
EIHYB-2011-4-3         63     1446     0.8659   0.438   0.49604  0.621893
3WHYB-2011-19-1        50     1459     0.8651   0.2198  0.44242  0.330453

RKHS-BLUP CROSS VALIDATIONS IN MANAGED DROUGHT TRIALS
Columns: trial name; number of lines in the trial (validation set); number of lines in the training set; correlation in the training set; correlation in the validation set; H²; validation correlation divided by sqrt(H²).
Trial name             N_val  N_train  r_train  r_val   H²       r_val/sqrt(H²)
ILS2-TC-2-1            74     1435     0.9793   0.3053  0.39373  0.48655
ILS2-TC-3-1            64     1445     0.9795   0.0296  0.11492  0.087316
ILS2-TC-4-1            85     1424     0.9861   0.181   0.46992  0.264038
ILS2-TC-5-1            84     1425     0.98     0.1812  0.17252  0.436253
3WHYB-2010-15-1        183    1326     0.9661   0.14    0.70445  0.166803
TK-LXT-1-7             87     1422     0.9811   0.0478  0.37751  0.077797
DTMA-MARS-EVALTC-01-1  182    1327     0.9735   0.1183  0.05704  0.49533
DTMA-MARS-EVALTC-02-1  198    1311     0.9848   0.1785  0.28514  0.334279
DTMA-MARS-EVALTC-03-1  225    1284     0.9791   0.135   0.29932  0.246755
EIHYB-2011-1-3         73     1436     0.9796   0.2164  0.33892  0.371714
EIHYB-2011-2-3         69     1440     0.9787   0.3325  0.60457  0.42763
EIHYB-2011-3-3         72     1437     0.9788   0.2832  0.35835  0.473085
EIHYB-2011-4-3         63     1446     0.9795   0.5104  0.49604  0.72469
3WHYB-2011-19-1        50     1459     0.9789   0.325   0.44242  0.488614
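The last column of both cross-validation tables divides the validation correlation by the square root of H², which rescales the correlation with phenotypes into an estimate of accuracy relative to the genetic signal. A short sketch using the ILS2-TC-2-1 row from each table (the function name is illustrative):

```python
import math


def adjusted_accuracy(validation_correlation, h2):
    """Validation correlation divided by sqrt(H^2)."""
    return validation_correlation / math.sqrt(h2)


# ILS2-TC-2-1: H^2 = 0.39373 in both tables.
gblup = adjusted_accuracy(0.3516, 0.39373)  # G-BLUP validation correlation
rkhs = adjusted_accuracy(0.3053, 0.39373)   # RKHS validation correlation
print(round(gblup, 6), round(rkhs, 6))
```

Trials with low H² (for example DTMA-MARS-EVALTC-01-1, H² = 0.05704) get a large upward adjustment, so their adjusted values should be read with caution.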

Factors Affecting the Accuracy of GEBVs
Level and distribution of LD between markers and QTL: r² > 0.2 is desirable, but more markers increase accuracy.
Meuwissen 2009: the minimum number of markers for across-family prediction is Ne × L, where Ne is the effective population size and L is the genome size in Morgans.

Factors Affecting the Accuracy of GEBVs
Distribution of QTL effects: many small-effect QTL or low LD favor BLUP for capturing small-effect QTL that may not be in LD with a marker.
Prediction based on relationship decays faster than prediction based on LD (Habier et al. 2007; Zhong et al. 2009).
Inbreeding and the Mendelian sampling term: selection for favorable, low-frequency alleles and against inbreeding.

Thank you for your interest!