Pathway approach for candidate gene identification and introduction to metabolic pathway databases.


Marker Assisted Selection in Tomato
  Pathway approach for candidate gene identification and introduction to metabolic pathway databases.
  Identification of polymorphisms in data-based sequences.
  MAS: forward selection, background selection, combining traits, relative efficiency of selection.
  Why (population) size matters.

Example: QTL for color uniformity in elite crosses (Audrey Darrigues, Eileen Kabelka)

[Figure: linkage maps of chromosomes 1 to 5 showing marker names (CT, TG, and LEOH series), inter-marker distances in cM, and introgression line (IL) segments. The map layout is not recoverable from the transcript.]

  QTL   Trait     Origin
  2     L, YSD    S. lyc.
  4     YSD       S. lyc.
  6     L, Hue    og^c
  7     L, Hue    S. hab.
  11    L, Hue    S. lyc.

Carotenoid Biosynthesis: Candidate pathway for genes that affect color and color uniformity. Disclaimer: this is not the only candidate pathway

Databases that link pathways to genes http://www.arabidopsis.org/help/tutorials/aracyc_intro.jsp

Databases that link pathways to genes
  http://metacyc.org/
  http://www.plantcyc.org/
  http://sgn.cornell.edu/tools/solcyc/
  http://www.arabidopsis.org/biocyc/index.jsp
  http://www.arabidopsis.org/help/tutorials/aracyc_intro.jsp

External Plant Metabolic databases
  CapCyc (Pepper) (C. annuum)
  CoffeaCyc (Coffee) (C. canephora)
  SolCyc (Tomato) (S. lycopersicum)
  NicotianaCyc (Tobacco) (N. tabacum)
  PetuniaCyc (Petunia) (P. hybrida)
  PotatoCyc (Potato) (S. tuberosum)
  SolaCyc (Eggplant) (S. melongena)

http://www.plantcyc.org:1555/

Note: missing step (lycopene isomerase, tangerine)

Check boxes (Note: MetaCyc has many more choices, but no plants)

Scroll down the page. Capsicum annuum sequence retrieved.

http://www.ncbi.nlm.nih.gov/

Select database

Query  CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT
Sbjct  CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT

Zeaxanthin epoxidase. Probable location on chromosome 2.
Alignment of Z83835 and EF581828 reveals 5 SNPs over ~2000 bp.
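The mismatch in the alignment excerpt above can be located programmatically. A minimal sketch, assuming the two strings are already aligned and gap-free, as in the BLAST excerpt; the function name `find_mismatches` is illustrative and not part of any BLAST tool.

```python
def find_mismatches(query: str, subject: str):
    """Return (1-based position, query base, subject base) for each mismatch.

    Assumes the two sequences are already aligned and contain no gaps.
    Positions are relative to the excerpt, not to the full GenBank records.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    return [
        (pos, q, s)
        for pos, (q, s) in enumerate(zip(query, subject), start=1)
        if q != s
    ]


# Sequences copied from the alignment excerpt on the slide.
QUERY = "CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT"
SBJCT = "CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT"

if __name__ == "__main__":
    for pos, q, s in find_mismatches(QUERY, SBJCT):
        print(f"excerpt position {pos}: {q} > {s}")
```

On this excerpt the sketch reports a single C > T difference. A difference found this way is only a candidate (an eSNP) until it has been validated as a marker.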

51 annotated loci

Information missing from other databases is here.
Candidates identified in other databases are here.

Comment on the databases:
  Information is not always complete or up to date.
  Display is not always optimal, and several steps may be needed to go from pathway > gene > potential marker.
  Sequence data has error associated with it. eSNPs are not the same as validated markers.
  There is a wealth of information organized and available.
  We will be asking for feedback on how best to improve the SGN database and access via the Breeders Portal.

The previous example detailed how we might identify sequence-based markers for trait selection.

Query  CCACCACCATCCTCACTTTAACCCACAAATCCCACTTTCTTTGGCCTAATTAACAATTTT
Sbjct  CCACCACCATCCTCACTTTAACCCACAAATCCCATTTTCTTTGGCCTAATTAACAATTTT

Improving efficiency of selection in terms of 1) relative efficiency of selection, 2) time, 3) gain under selection, and 4) cost will benefit from markers for both forward and background selection.

The remainder of the presentation will focus on:
  Where to apply markers in a program
  Forward and background selection
  Marker resources
  Alternative population structures and size

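Comparison of direct selection with indirect selection (MAS).
Relative efficiency of selection: $r_{\mathrm{gen}} \times (H_i / H_d)$, where $r_{\mathrm{gen}}$ is the genetic correlation between the indirectly selected trait (here, the marker) and the target trait, and $H_i$ and $H_d$ describe the heritability under indirect and direct selection (in the standard formulation these enter as square roots of heritability).
Ranking: line performance over locations > MAS > single plant.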
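The relative efficiency expression can be evaluated numerically. A minimal sketch, assuming equal selection intensity for both methods and that H_i and H_d are the square roots of the heritabilities of the indirect and direct traits; the function name and the example values are hypothetical and not taken from the slides.

```python
import math


def relative_efficiency(r_gen: float, h2_indirect: float, h2_direct: float) -> float:
    """Relative efficiency of indirect vs. direct selection.

    Computes r_gen * (H_i / H_d) with H = sqrt(heritability).
    Assumes equal selection intensity for the two methods.
    """
    if not 0.0 < h2_direct <= 1.0:
        raise ValueError("h2_direct must be in (0, 1]")
    if not 0.0 <= h2_indirect <= 1.0:
        raise ValueError("h2_indirect must be in [0, 1]")
    return r_gen * math.sqrt(h2_indirect) / math.sqrt(h2_direct)


if __name__ == "__main__":
    # Hypothetical values: a marker scored without error (heritability 1.0),
    # genetic correlation 0.8 with a trait whose heritability is 0.25.
    print(relative_efficiency(0.8, 1.0, 0.25))
```

Under these assumptions a value above 1 means that selecting on the marker is expected to give more gain than selecting on the phenotype directly, which is most likely when the trait heritability is low and the marker-trait correlation is high.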

Accelerating Backcross Selection
Expected proportion of Recurrent Parent (RP) genome in BC progeny (RP:donor)

  Generation   RP:donor
  F1           50:50
  BC1          75:25
  BC2          87.5:12.5
  BC3          93.75:6.25
  BC4          96.875:3.125
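The expected proportions follow from halving the donor contribution at every backcross. A minimal sketch of that arithmetic, assuming no selection and no linkage drag; the function name is illustrative.

```python
def expected_rp_fraction(n_backcrosses: int) -> float:
    """Expected recurrent-parent genome fraction after n backcrosses.

    n_backcrosses = 0 is the F1 (50% RP); each backcross halves the
    remaining donor fraction, so the RP fraction is 1 - (1/2)^(n + 1).
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


if __name__ == "__main__":
    for n in range(5):
        label = "F1" if n == 0 else f"BC{n}"
        rp = 100 * expected_rp_fraction(n)
        print(f"{label}: {rp:g}:{100 - rp:g}")
```

These are population averages. Individual progeny vary around them, which is what background selection with markers exploits.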

Two-stage selection
  1. Select for target allele
  2. Select for RP genome at unlinked markers

Three-stage selection
  1. Select for target allele
  2. Select for RP recombinants at flanking markers
  3. Select for RP genome at unlinked markers

Four-stage selection
  1. Select for target allele
  2. Select for RP recombinants at flanking markers
  3. Select for RP genome on carrier chromosome
  4. Select for RP genome at unlinked markers

Reference: Frisch, M., M. Bohn, and A.E. Melchinger. 1999. Comparison of Selection Strategies for Marker-Assisted Backcrossing of a Gene. Crop Science 39: 1295-1301.
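The two-stage scheme can be sketched as a filter followed by a ranking. This is an illustrative sketch only, not the simulation procedure of Frisch et al.; the `Progeny` class, the genotype codes ("RP" for homozygous recurrent parent, "H" for heterozygous), and the example plants are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Progeny:
    name: str
    carries_target: bool   # stage 1: is the target (donor) allele present?
    background: List[str]  # calls at unlinked markers: "RP" or "H"


def rp_fraction(plant: Progeny) -> float:
    """RP genome fraction at background markers.

    In a backcross, each marker is either homozygous RP (counts 1.0)
    or heterozygous (counts 0.5).
    """
    scores = [1.0 if call == "RP" else 0.5 for call in plant.background]
    return sum(scores) / len(scores)


def two_stage_select(progeny: List[Progeny], n_keep: int = 1) -> List[Progeny]:
    """Stage 1: keep carriers of the target allele.
    Stage 2: rank carriers by RP fraction at unlinked markers."""
    carriers = [p for p in progeny if p.carries_target]
    return sorted(carriers, key=rp_fraction, reverse=True)[:n_keep]


if __name__ == "__main__":
    plants = [
        Progeny("A", True, ["RP", "H", "RP", "RP"]),
        Progeny("B", False, ["RP", "RP", "RP", "RP"]),  # best background, no target
        Progeny("C", True, ["H", "H", "RP", "H"]),
    ]
    for best in two_stage_select(plants):
        print(best.name, rp_fraction(best))
```

The three- and four-stage schemes would add further filters between the two steps (flanking-marker recombinants, then the carrier chromosome) before the final ranking.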

Progeny needed for Background Selection During MAS
Q10 of RP genome in percent, by population size

                Population size
                20     40     60     80     100    125    150    200
  Two-Stage
    BC1         76.7   78.7   79.7   80.3   80.7   81.3   81.7   82.2
    BC2         90.3   91.9   92.8   93.3   93.6   93.9   94.0   94.6
    BC3         95.8   96.2   97.1   97.3   97.4   97.5   97.6   97.8
  Three-Stage
    BC1         71.2   72.7   73.4   73.6   73.3   73.2   72.8   72.2
    BC2         86.1   87.2   88.5   89.3   90.2   90.7   91.3   91.8
    BC3         94.4   95.7   96.5   96.9   97.2   97.3   97.5   97.6

Q10 indicates a 90% probability of success. From Frisch et al., 1999.

Marker Data Points required
(Modified from Frisch et al., 1999; based on assumption of 12 chromosomes; initial selection with 4 markers/chromosome)

                              Population size
                              60        80        100       125
  Two-Stage Selection
    BC1                       2880      3840      4800      6000
    BC2                       900       1164      1416      1716
    BC3                       228       264       300       348
    Total marker points       4008      5268      6516      8064
    Cost at 0.15 per point    601.2     790.2     977.4     1209.6
    Cost at 0.20 per point    801.6     1053.6    1303.2    1612.8
    Cost at 0.25 per point    1002.0    1317.0    1629.0    2016.0
  Three-Stage Selection
    BC1                       2880      3840      4800      6000
    BC2                       492       708       960       1308
    BC3                       250       444       504       576
    Total marker points       3622      4992      6264      7884
    Cost at 0.15 per point    543.3     748.8     939.6     1182.6
    Cost at 0.20 per point    724.4     998.4     1252.8    1576.8
    Cost at 0.25 per point    905.5     1248.0    1566.0    1971.0
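The totals and costs in the table are simple sums and products, which makes it easy to rerun them for a different price per data point. A minimal sketch, assuming the cost figures are a price per marker data point (the slides do not state the currency); function names are illustrative, and the BC2 and BC3 counts are taken from the table rather than derived.

```python
CHROMOSOMES = 12           # assumption stated on the slide
MARKERS_PER_CHROMOSOME = 4  # initial selection, as stated on the slide


def bc1_data_points(population_size: int) -> int:
    """Every BC1 plant is genotyped at all initial markers."""
    return population_size * CHROMOSOMES * MARKERS_PER_CHROMOSOME


def total_and_cost(points_per_generation, cost_per_point: float):
    """Sum data points over generations and price them."""
    total = sum(points_per_generation)
    return total, round(total * cost_per_point, 2)


if __name__ == "__main__":
    # Two-stage selection, population size 60 (BC2 and BC3 copied from the table).
    points = [bc1_data_points(60), 900, 228]
    for price in (0.15, 0.20, 0.25):
        print(price, total_and_cost(points, price))
```

BC1 dominates the total because every plant is scored at every marker; later generations only need the markers that are still segregating.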

For effective background selection we need:
  Markers for our target locus (C > T SNP for Zep)
  Markers on the target chromosome (Chrom. 2)
  Markers unlinked to the target chromosome

http://www.tomatomap.net http://sgn.cornell.edu/

Ovate

HBa0104A12

55 polymorphic markers 44 polymorphic markers

Missing data in SGN: limited ability to generate tables; PCR conditions sometimes incomplete; enzyme sometimes missing; SNP not described.

Missing data in Tomatomap.net: SNP and sequence context (requires the BMC Genomics supplemental table); ASPE primers; GoldenGate primers.

Van Deynze et al., 2007. BMC Genomics 8:465 www.biomedcentral.com/content/pdf/1471-2164-8-465.pdf

Where can we expect to be?
TA496 ESTs with SNPs vs. H1706 BAC sequences

                                          n = 1        n = 2   n = 3   n = 4   n = 5-10   n > 10   Total
  ESTs with SNPs                          596          106     34      22      38         10       806
  Where EST coverage = allele coverage    not tested   64      22      11      23         7        127
  Proportion                                           0.60    0.65    0.50    0.61       0.70     0.16

Analysis by Buell et al., unpublished.
Data are based on an estimated ~42% of the sequence; therefore expect as many as 300 markers for a cross like E6203 x H1706.
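The proportions and the "as many as 300 markers" estimate can be reproduced from the counts. A minimal sketch, assuming the estimate is obtained by scaling the 127 confirmed ESTs by the ~42% sequence coverage; that scaling step is an inference from the slide and is not stated there. All names are illustrative.

```python
# (ESTs with SNPs, ESTs where EST coverage = allele coverage), from the table.
# The n = 1 class was not tested and is therefore omitted.
COUNTS = {
    "n = 2": (106, 64),
    "n = 3": (34, 22),
    "n = 4": (22, 11),
    "n = 5-10": (38, 23),
    "n > 10": (10, 7),
}
SEQUENCE_FRACTION = 0.42  # estimated share of sequence covered, per the slide


def proportions(counts):
    """Confirmed / total for each coverage class, rounded as in the table."""
    return {k: round(confirmed / total, 2) for k, (total, confirmed) in counts.items()}


def extrapolated_markers(confirmed: int, fraction: float) -> float:
    """Scale the confirmed count up to full sequence coverage (assumed linear)."""
    return confirmed / fraction


if __name__ == "__main__":
    confirmed_total = sum(confirmed for _, confirmed in COUNTS.values())
    print(proportions(COUNTS))
    print(confirmed_total, round(extrapolated_markers(confirmed_total, SEQUENCE_FRACTION)))
```

The linear scaling assumes that polymorphism density in the unsequenced portion resembles the sequenced portion, which is why the slide words the figure as an upper expectation.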

QTLs mapped in a bi-parental cross may not be appropriate for MAS in all populations:
  Marker allele and trait may not be linked in all populations.
  Genetic background effects may be population specific.
  Original association may be spurious.
  QTL detection is dependent on the magnitude of the difference between alleles and the variance within marker classes.

What about mapping and MAS in unstructured populations? A brief introduction to Association Mapping follows.

Association Mapping: a statistical model designed to account for population structure (Q), correct for genetic background effects (Z), and identify marker-trait linkage (Marker).

$$Y = \mu + \mathrm{REP}\,y + Q\,w + \mathrm{Marker}\,\alpha + Z\,v + \mathrm{Error}$$

[Figure: LD decay in two market classes. Axes: LD measure (R^2, 0.0 to 1.0) vs. distance between loci (cM, 0 to 140); points coded by chromosome 1 to 12.]
  Fresh market: y = -0.037 ln(x) + 0.1713
  Processing: y = -0.054 ln(x) + 0.2583
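The two fitted curves can be compared by asking at what distance predicted LD falls to a chosen level. A minimal sketch using the coefficients from the figure; the threshold of R^2 = 0.1 is an arbitrary assumption chosen for illustration and does not come from the slides.

```python
import math

# (slope, intercept) of y = slope * ln(x) + intercept, copied from the figure.
FITS = {
    "fresh_market": (-0.037, 0.1713),
    "processing": (-0.054, 0.2583),
}


def predicted_r2(distance_cm: float, slope: float, intercept: float) -> float:
    """Predicted LD (R^2) at a given distance between loci, in cM."""
    return slope * math.log(distance_cm) + intercept


def distance_at_r2(threshold: float, slope: float, intercept: float) -> float:
    """Distance (cM) at which the fitted curve equals the threshold."""
    return math.exp((threshold - intercept) / slope)


if __name__ == "__main__":
    threshold = 0.1  # hypothetical cut-off, not from the source
    for name, (slope, intercept) in FITS.items():
        d = distance_at_r2(threshold, slope, intercept)
        print(f"{name}: predicted R^2 reaches {threshold} at about {d:.1f} cM")
```

With this threshold the fitted curves place the crossing at roughly 7 cM for fresh market and roughly 19 cM for processing germplasm. The fits are logarithmic trend lines through scattered points, so these distances are rough guides to marker spacing rather than precise estimates.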

Tomato populations will have sub-structure
  K=4: 1) Fresh Market (FM); 2) Landrace; 3) Heirloom; 4) Processing
  K=8: 1, 6, 7) Processing; 2) Landrace; 3, 5) FM; 4) FM & Processing; 8) Heirloom
Output from Pritchard's STRUCTURE

Association mapping
  Incorporates population structure and coefficient of relatedness.
  The number of markers needed depends on the rate of LD decay (reflects recombination history).
  Highly specific to the inference population (wild species vs. breeding program).
  Sensitive to marker coverage, LD decay, and number of alleles (Nor, gf, and others all have multiple alleles within populations used by breeders).
  Will not be able to map traits where trait variation overlaps with population structure.

Even without sequence or marker data, there are lessons for practical breeding: Use pedigree data, knowledge of population structure, and objective data to increase precision of estimates of breeding value.

Take home messages:
  Marker resources exist for forward and background selection in elite x elite crosses in tomato.
  Marker resources are currently not sufficient for QTL discovery in bi-parental or AM populations; they will soon be.
  The best time to use genetic markers: early generation selection.
  Restructuring of a breeding program to integrate markers may include:
    1) Increasing genotypic replication (population size) at the expense of replication (consider augmented designs).
    2) Collecting objective data.

Further discussion of the AM approach in session VI: Unstructured mapping of bacterial spot resistance.

References:
  Kaeppler, 1997. TAG 95:618-621.
  Frisch et al., 1999. Crop Science 39:1295-1301.
  Knapp and Bridges, 1990. Genetics 126:769-777.
  Yu et al., 2006. Nature Genetics 38:203-208.
  Van Deynze et al., 2007. BMC Genomics 8:465 www.biomedcentral.com/content/pdf/1471-2164-8-465.pdf