Marker types. Potato Association of America, Fredericton, August 9, 2009. Allen Van Deynze
Use of DNA Markers in Breeding
Germplasm Analysis
  Fingerprinting of germplasm
  Arrangement of diversity (clustering, PCA, etc.)
Breeding
  Alternative or support to selection for traits
  Increase rate of genetic gain:
    Selection during off-season cycles
    Selection of hybrid traits on inbred individuals
    Early selection (e.g. pre-flowering)
Parental Selection
  Marker-based parent similarity
  Marker-based estimated variance within a population
  Genetic distance between parents
Trait Analysis
  Association of traits with genomic regions
  Understanding trait relationships (linkage vs. pleiotropy)
  Understanding causes of variation (aid in gene cloning)
Marker Assisted Breeding
  Marker Assisted Backcrossing
Quality Assurance
  Parent-offspring tests, genetic purity tests, event tests
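As a rough illustration of "genetic distance between parents", the sketch below computes a simple mismatch distance (the fraction of markers at which two lines differ) from presence/absence scores. The line names and marker scores are hypothetical, and the slides do not say which distance measure is used in practice; this is only one common, simple choice.

```python
from itertools import combinations

# Hypothetical marker scores (1 = band/allele present, 0 = absent) for three lines.
profiles = {
    "line_A": [1, 0, 1, 1, 0, 1, 0, 0],
    "line_B": [1, 0, 1, 0, 0, 1, 1, 0],
    "line_C": [0, 1, 0, 0, 1, 0, 1, 1],
}


def mismatch_distance(a, b):
    """Fraction of markers at which two profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of markers")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Pairwise distances between all candidate parents.
distances = {
    (p, q): mismatch_distance(profiles[p], profiles[q])
    for p, q in combinations(sorted(profiles), 2)
}

for pair, d in distances.items():
    print(pair, d)
```

A matrix like this is the usual starting point for the clustering or PCA mentioned under germplasm analysis.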
Marker assisted selection [Figure labels: DNA marker, fruit ripening]
The # of Markers Needed Depends on Goals
  Protect varieties: 100s of markers
  Classify germplasm: 100s mapped
  ID tightly linked QTLs in linkage studies: 100s mapped
  ID candidate genes and association studies: saturated map
  Depends on number of chromosomes
  Depends on size of genetic map (cM)
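To make the dependence on chromosome number and map size concrete, here is a minimal sketch that estimates how many evenly spaced markers are needed for a target spacing. The chromosome lengths and spacings are hypothetical values chosen for illustration, and real markers are never perfectly evenly spaced, so this is a lower bound rather than a planning figure.

```python
import math


def markers_needed(chrom_lengths_cM, spacing_cM):
    """Evenly spaced markers per chromosome: one per interval plus one at the end."""
    if spacing_cM <= 0:
        raise ValueError("spacing must be positive")
    return sum(math.ceil(length / spacing_cM) + 1 for length in chrom_lengths_cM)


# Hypothetical genetic map: 12 chromosomes of 80 cM each (960 cM in total).
chrom_lengths = [80.0] * 12

coarse = markers_needed(chrom_lengths, 10.0)  # one marker every 10 cM
dense = markers_needed(chrom_lengths, 1.0)    # one marker every 1 cM

print(coarse, dense)
```

Under these assumptions a 10 cM framework needs on the order of 100 markers, while a 1 cM map needs close to 1,000, which matches the "100s mapped" versus "saturated map" distinction on the slide.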
DNA -> RNA -> Protein -> Trait. The central dogma of molecular biology is that the information in the DNA sequence is transcribed into mRNA, which is then translated into proteins. Proteins are large molecules that are the enzymes and structural components of living cells, and these give rise to the trait. Image courtesy of the National Human Genome Research Institute.
Marker types: RFLPs, RAPDs, AFLPs, SSRs, SNPs, SFPs, others
Restriction Fragment Length Polymorphisms (RFLPs): cDNA clones, genomic clones
RFLPs: co-dominant; detect all alleles simultaneously; good across related species; basis (anchors) of many species maps; too costly and labor intensive for breeding
Random Amplified Polymorphic DNA (RAPDs) University of Saskatchewan
RAPDs: no sequence information needed; universal primer set; reproducibility problems
Amplified Fragment Length Polymorphism: genomic DNA -> restriction enzyme digestion -> adaptor ligation -> selective PCR amplification -> AFLP fingerprint
AFLP characteristics: multiplex PCR; competition PCR (quantitative detection); no sequence information required; size-based fragment discrimination; transcript and marker discovery; transcript and marker detection; universal technology (proprietary)
Marker types: Inter MITE Polymorphism (IMP), inter-SSR, inter-RGA
Amplifies DNA between MITEs (miniature inverted-repeat transposable elements)
[Diagram labels: MITEs, PCR amplification, template DNA, terminal inverted repeats]
MITEs are well distributed throughout most genomes.
Each end of the MITE is characterized by an inverted repeat sequence.
Inter-MITE DNA is amplified by PCR.
The numerous polymorphic bands create a distinct fingerprint for each line.
Inter markers: high multiplexing value (15 to 75 loci per reaction); high throughput; cost-effective; distributed throughout the genome; good level of intra-species variation; high level of cross applicability; dominant markers; may not be in coding regions [Figure label: Tomato]
Simple Sequence Repeat (microsatellites)
tcactttgcagtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtcccgttcag
tcactttgcagtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtcccgttcag
Detection: PCR
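The two example alleles differ only in the number of dinucleotide repeats between identical flanking sequences. The sketch below counts the copies in the longest tandem run of a motif; treating the motif as "gt" is an assumption made here for illustration (the same run could equally be read as "tg" repeats), and the function name is hypothetical.

```python
import re

# The two alleles shown on the slide.
allele_1 = "tcactttgcagtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtcccgttcag"
allele_2 = "tcactttgcagtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtcccgttcag"


def longest_repeat(seq, motif="gt"):
    """Number of tandem copies in the longest uninterrupted run of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    return max((len(run) // len(motif) for run in runs), default=0)


n1 = longest_repeat(allele_1)
n2 = longest_repeat(allele_2)

# The PCR product size difference equals the repeat difference times the motif length.
size_difference = len(allele_2) - len(allele_1)
print(n1, n2, size_difference)
```

Because the flanks are the same, primers placed in them amplify products whose lengths differ by exactly the extra repeat units, which is what is resolved on a gel or capillary.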
Simple Sequence Repeats: medium abundance; medium throughput; available in many crops; need sequence information; may or may not be associated with genes
Single-nucleotide polymorphisms (SNPs)
cgtgtactgacctgcatgctatgaatcagtacatcgactagctt
cgtgtactgacctgcatgctaggaatcagtacatcgactagctt
Highly abundant (roughly 1 per 100-2000 base pairs); distributed throughout genome, including genes; genetically stable; typically biallelic; can be scored as a +/- marker; mutation may be diagnostic
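A SNP is simply a position where two otherwise identical, aligned sequences carry different bases. As a minimal sketch, assuming the two example sequences are already aligned and of equal length, the hypothetical helper below lists every mismatching position:

```python
# The two aligned sequences shown on the slide.
seq_1 = "cgtgtactgacctgcatgctatgaatcagtacatcgactagctt"
seq_2 = "cgtgtactgacctgcatgctaggaatcagtacatcgactagctt"


def find_snps(a, b):
    """Return (0-based position, base in a, base in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


snps = find_snps(seq_1, seq_2)
print(snps)
```

Real SNP discovery works on alignments with gaps and sequencing errors, so a position-by-position comparison like this only illustrates the idea.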
SNPs: limited information per locus; need sequence information
Single Feature Polymorphisms [Figure: probe intensity for probes A-N in Genotype 1 and Genotype 2, with the SFP indicated]
SFPs: based on SNPs and insertions/deletions; abundant; distributed throughout genome, including genes; genetically stable; highly multiplexable; dominant; need sequence information
Diversity array technology
DArTs: medium throughput; multiplex; dominant markers; semi-fixed assays; use SNPs
Why move to SNP maps? Microsatellite markers create maps with large gaps, appropriate for within-family studies. SNPs create dense maps to pinpoint regions across the population.
Marker Detection: hybridization, amplification, electrophoresis, fluorescence
Polymerase Chain Reaction (image taken from the National Health Museum gallery)
SNP technologies: hybridization, single base pair extension, allele-specific PCR
Agarose Gel Electrophoresis: easy; universal; expensive; low throughput. Use: RAPDs, SSRs, SNPs, RFLPs, AFLPs
Automated Gel Electrophoresis: easy; high resolution; automated; high throughput; expensive equipment. Use: SSRs, AFLPs, SNPs, IMPs
Real Time PCR
Real-Time PCR cont'd: easy; automated; high throughput; expensive equipment. Use: SNPs
Fluidigm: 96 samples x 96 assays
Pyrosequencing: automated; medium throughput; expensive equipment. Use: SNPs
Invader Assay for SNP Detection, Biplex FRET Format [Figure: Invader oligo with WT probe and Invader oligo with Mut probe bound to the target; cleavage sites; released 5' flaps; FRET cassettes 1 and 2 with fluorophores F1, F2 and quencher Q]
Invader: automated; high throughput; highly sensitive; flexible; quantitative; minimum amount of reagents required. Use: SNPs
Mass Spec
Mass Spec: medium throughput; multiplex; inexpensive reagents; automated; need amplification; expensive equipment
Melting Curve Analysis [Figure labels: homozygotes, heterozygote]
Liquid Arrays: automated; high throughput; highly sensitive; multiplex; flexible; expensive equipment. Use: SNPs
Illumina: 2-60,000 SNPs x 96 samples; <$0.01-0.15 per data point
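The per-data-point figures translate into project costs by simple multiplication: data points = SNPs x samples. The sketch below works this through for a hypothetical 384-SNP panel on 96 samples. The panel size is an assumption picked from within the quoted range, and $0.01 is used as a stand-in for the "less than $0.01" lower bound.

```python
def genotyping_cost(n_snps, n_samples, cost_per_datapoint):
    """Return (number of data points, total cost in dollars) for a full grid."""
    datapoints = n_snps * n_samples
    return datapoints, round(datapoints * cost_per_datapoint, 2)


# Hypothetical project: 384 SNPs scored on 96 samples.
n_snps, n_samples = 384, 96

low = genotyping_cost(n_snps, n_samples, 0.01)   # stand-in for "<$0.01"
high = genotyping_cost(n_snps, n_samples, 0.15)  # upper end of the quoted range

print(low, high)
```

These totals cover only the quoted cost per data point; equipment, DNA extraction and labor are not included.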
Experimental Procedure
SNP technologies
Technology              Samples      SNPs         Cost/SNP
Agarose Gels            10-384       10-384       high
Polyacrylamide Gels     10-384       10-384       high
Real Time PCR           96-1,500     1-100        low
Fluidigm                12-15,000    12-96        low-very low
Invader                 96-1000s     96+          very low
Pyrosequencing          96-384       100s         med
Mass Spec               96-384       100s         med
Melting curve           10-384       100s         med
Illumina Bead Express   480          1-384        med
Illumina Golden Gate    480          384-1536     low
Illumina Infinium       1152         7,600-100k   very low
Marker Attributes
Marker                 RFLPs          RAPDs   SSRs      AFLPs/IMPs   SFPs            SNPs
Development costs      high           low     high      low          high            med-high
Technical complexity   high           low     low       med          med             low
Automated              no             no      med       med          semi            yes
Reproducibility        high           low     high      med          med             high
Cross species          yes            no      yes       no           yes             no
Segregation            co-dom         dom     co-dom    dom          dom             co-dom
Information content    genomic/gene   none    genomic   none         genomic/genes   genomic/genes
For Breeding           no             no      yes       yes          no              yes
Cost/datapoint (as listed): high, low, med, low, low; $0.5-1.00; $<0.01-0.20
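The "Segregation" row matters because a co-dominant marker distinguishes all three genotype classes, while a dominant marker cannot tell a heterozygote from the band-carrying homozygote. As a minimal sketch, assuming a single locus with alleles A and a in the F2 of a selfed heterozygote (the scoring functions are hypothetical), the expected classes can be enumerated directly:

```python
from collections import Counter
from itertools import product

# The four equally likely gamete combinations from selfing a heterozygote (Aa x Aa).
f2_genotypes = ["".join(sorted(pair)) for pair in product("Aa", repeat=2)]


def score_codominant(genotype):
    """A co-dominant marker gives a distinct pattern for each genotype."""
    return genotype


def score_dominant(genotype):
    """A dominant marker only shows whether the band-producing allele A is present."""
    return "band" if "A" in genotype else "no band"


codominant = Counter(score_codominant(g) for g in f2_genotypes)
dominant = Counter(score_dominant(g) for g in f2_genotypes)

print(dict(codominant))
print(dict(dominant))
```

The co-dominant marker recovers the 1:2:1 genotype ratio, whereas the dominant marker collapses it to 3:1, so less information is obtained per scored individual.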