Data Management for Efficient Phenotypic and Genomic Selection in Applied Breeding Programs

Similar documents
The NDSU Barley Breeding Program Past, Present, and Future. Rich Horsley, Paul Schwarz, Ana Maria Heilman-Morales, and Tom Walk

Background. The Integrated Breeding Platform An Introduction

Development of Early Maturing GEM lines with Value Added Traits: Moving U.S. Corn Belt GEM Germplasm Northward

OBJECTIVES-ACTIVITIES 2-4

Using molecular marker technology in studies on plant genetic diversity Final considerations

Genomic selection in American chestnut backcross populations

INVESTIGATORS: Chris Mundt, Botany and Plant Pathology, OSU C. James Peterson, Crop and Soil Science, OSU

Utilization of Genomic Information to Accelerate Soybean Breeding and Product Development through Marker Assisted Selection

USWBSI Barley CP Milestone Matrix Updated

High-density SNP Genotyping Analysis of Broiler Breeding Lines

How to Keep the Aquaculture Business sustainable? BluGen Korea Woo-Jai Lee

Zahirul Talukder 1, Yunming Long 1, Thomas Gulya 2, Charles Block 3, Gerald Seiler 2, Lili Qi 2. Department of Plant Sciences, NDSU

The Rice Experiment Station Rice Breeding Program. K.S. McKenzie V.C. Andaya F. Jodari S.O. PB Samonte J.J Oster C.B. Andaya

Brassica carinata crop improvement & molecular tools for improving crop performance

Lecture 1 Introduction to Modern Plant Breeding. Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013

Use of molecular markers to enhance genetic gains in the maritime pine breeding program

Dissecting the genetic basis of grain size in sorghum. Yongfu Tao DO NOT COPY. Postdoctoral Research Fellow

(toward) An Empirical Evaluation Of Genomic Selection In Perennial Ryegrass (Lolium perenne L.)

Genetic evaluation programs and future opportunities

Final Report Developing the Next Generation of Faster Drier Corn Products for Northern MN Project SP Project Title

CHALLENGES FOR DISEASE AND INSECT PEST RESISTANCE BREEDING IN WINTER WHEAT

AD HOC CROP SUBGROUP ON MOLECULAR TECHNIQUES FOR MAIZE. Second Session Chicago, United States of America, December 3, 2007

A Few Thoughts on the Future of Plant Breeding. Ted Crosbie VP Global Plant Breeding Monsanto Distinguished Science Fellow

How to optimize and compare breeding schemes?

Characterization of Sclerotinia phenotypic variation relevant to diseases of sunflower

Genomic selection in the Australian sheep industry

GENETICS - CLUTCH CH.20 QUANTITATIVE GENETICS.

Towardmoreefficient treebreeding. Matti Haapanen

High Cross-Platform Genotyping Concordance of Axiom High-Density Microarrays and Eureka Low-Density Targeted NGS Assays

Module 1 Principles of plant breeding

Jon Degner PhD candidate Centre for Forest Conservation Genetics University of British Columbia

The what, why and how of Genomics for the beef cattle breeder

Discussion paper Scenario analysis for the RPBC genomics programme. David Evison, School of Forestry, University of Canterbury

Genome wide association and genomic selection to speed up genetic improvement for meat quality in Hanwoo

Genomic Selection in Cereals. Just Jensen Center for Quantitative Genetics and Genomics

Application of Genotyping-By-Sequencing and Genome-Wide Association Analysis in Tetraploid Potato

I.1 The Principle: Identification and Application of Molecular Markers

MAS refers to the use of DNA markers that are tightly-linked to target loci as a substitute for or to assist phenotypic screening.

Traditional Genetic Improvement. Genetic variation is due to differences in DNA sequence. Adding DNA sequence data to traditional breeding.

Undergraduate Research in the Brzustowicz Laboratory Human Genetics Institute Department of Genetics Rutgers University

Genomic selection in cattle industry: achievements and impact

Genomic Selection in Dairy Cattle

Development of improved medium grain rice variety for Southern US challenges and opportunities

FOREST GENETICS. The role of genetics in domestication, management and conservation of forest tree populations

Phenotyping the forest-concepts and progress. Heidi Dungey, Dave Pont, Jonathan Dash, Mike Watt, Toby Stovold, Emily Telfer

Whole Genome-Assisted Selection. Alison Van Eenennaam, Ph.D. Cooperative Extension Specialist

Agrigenomics Genotyping Solutions. Easy, flexible, automated solutions to accelerate your genomic selection and breeding programs

Single- step GBLUP using APY inverse for protein yield in US Holstein with a large number of genotyped animals

BREEDING AND GENETICS

The 150+ Tomato Genome (re-)sequence Project; Lessons Learned and Potential

Genetics Effective Use of New and Existing Methods

CSU Wheat Breeding and Genetics Program Update

An Introduction to SNP Testing. Michelle Miller, MSc, MBA Director of Operations Canadian Livestock Records Corp. AGM April 16,

WORKING GROUP ON BIOCHEMICAL AND MOLECULAR TECHNIQUES AND DNA PROFILING IN PARTICULAR. Twelfth Session Ottawa, Canada, May 11 to 13, 2010

Pharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001

A Breeders Perspective on using the Breeding Information Management System for Cotton Breeding

Understanding genomic selection in poultry breeding

INCREASING THE RATE OF GENETIC GAIN FOR YIELD IN SOYBEAN BREEDING PROGRAMS

A brief introduction to Marker-Assisted Breeding. a BASF Plant Science Company

Streamlining data management at BioBank AS

DRAFT (REVISION) the Technical Working Party for Vegetables at its fifty-second session to be held in Beijing, China, from September 17 to 21, 2018

Strategy for Applying Genome-Wide Selection in Dairy Cattle

Genetic Evaluations. Stephen Scott Canadian Hereford Association

COURSE CATALOGUE UH SEMESTER 3

latestdevelopments relevant for the Ag sector André Eggen Agriculture Segment Manager, Europe

BLUP and Genomic Selection

Plant breeding programs producing inbred lines have two

A Primer of Ecological Genetics

WORKING GROUP ON BIOCHEMICAL AND MOLECULAR TECHNIQUES AND DNA PROFILING IN PARTICULAR. Eleventh Session Madrid, September 16 to 18, 2008

The promise of genomics for animal improvement. Daniela Lourenco

MULTI-HYBRID PLANTING MAKING THE MOST OUT OF EVERY ACRE

UC Davis Grain Legume Improvement and Statewide Support Activities Progress Report to the California Dry Bean Board. Lima Bean Breeding

Evolution of The Avena Toolbox The Triticeae Toolbox for Oat

Genomic selection applies to synthetic breeds

Quantitative Genetics

Modern Technologies to Access Common Bean Responses to Environmental Stress

OPTIMIZATION OF BREEDING SCHEMES USING GENOMIC PREDICTIONS AND SIMULATIONS

Genomic Selection: A Step Change in Plant Breeding. Mark E. Sorrells

technical bulletin Enhancements to Angus BREEDPLAN December 2017 Calculation of EBVs for North American Animals

Application of MAS in French dairy cattle. Guillaume F., Fritz S., Boichard D., Druet T.

Improving barley and wheat germplasm for changing environments

LARGE DATA AND BIOMEDICAL COMPUTATIONAL PIPELINES FOR COMPLEX DISEASES

HCS806 Summer 2010 Methods in Plant Biology: Breeding with Molecular Markers

Agrigenomics Genotyping Solutions. Easy, flexible, automated solutions to accelerate your genomic selection and breeding programs

Maize breeders decide which combination of traits and environments is needed to breed for both inbreds and hybrids. A trait controlled by genes that

Mapping and Mapping Populations

Midterm 1 Results. Midterm 1 Akey/ Fields Median Number of Students. Exam Score

Maja Boczkowska. Plant Breeding and Acclimatization Institute (IHAR) - NRI

Sequencing a North American yak genome

Big Data Platform Implementation

The Irish Beef Genomics Scheme; Applying the latest DNA technology to address global challenges around GHG emissions and food security.

Private-public partnership on eggplant (pre-)breeding using a participatory approach

INTERNATIONAL UNION FOR THE PROTECTION OF NEW VARIETIES OF PLANTS

Lecture 12. Genomics. Mapping. Definition Species sequencing ESTs. Why? Types of mapping Markers p & Types

Do markers add value?

Adaptive genetic variation within and among populations. Kathleen O Malley Associate Professor, OSU 2017 Annual Meeting Genetics Workshop

CALIFORNIA CROP IMPROVEMENT ASSOCIATION COMPREHENSIVE ANNUAL RESEARCH REPORT July 1, 2017 to June 30, Developing Malting Barley for California

TSB Collaborative Research: Utilising i sequence data and genomics to improve novel carcass traits in beef cattle

3. A form of a gene that is only expressed in the absence of a dominant alternative is:

Transcription:

Data Management for Efficient Phenotypic and Genomic Selection in Applied Breeding Programs Rich Horsley, Ana Heilman, and Tom Walk https://www.agronomix.com/_assets/images/clouddevices_400.jpg http://www.extremetech.com/wp-content/uploads/2013/03/minion_117.jpg https://www.bernardmarr.com/img/blog/how-big-data-is-transforming-every-business-in-every-industry.png

Year Generatio n NDSU Breeding Scheme Location Process Number of plants or lines 1 Crossing Greenhouse (fall) Crossing block 200 crosses 1 1 F 1 Greenhouse (winter) F 1 increase 5,000 plants 1 1 F 2 North Dakota (summer) F 2 selection 400,000 plants 1 Number of locations 2 F 3 New Zealand or Puerto Rico F 3 advancement 85,000 plants 2 2 F 4 North Dakota Progeny rows 25,000 lines 1 3 F 5 New Zealand or Arizona Seed increase 5,000 lines 2 3 F 6 North Dakota Preliminary yield trials 1,200 lines 2 4 F 7 North Dakota Intermediate yield trials 160 lines 7 5 F 8 North Dakota Advanced yield trial 40 lines 7 6 F 9 North Dakota Variety yield trial (VYT) 10 lines 7 7 F 10 North Dakota & region VYT and AMBA Pilot Scale 8 lines 14 8 F 11 North Dakota & region VYT and AMBA Pilot Scale 4-8 lines 14 9 F 12 North Dakota & region VYT and AMBA Plant Scale 2 lines 20 10 F 13 North Dakota & region VYT and AMBA Plant Scale 2 lines 20

How Long Does it Take to Make These Tables? Balanced data vs. unbalanced data. How do you make similar tables for all lines in the program that have been tested two or more years? 200 lines * 30 minutes/table = 12.5 working days 200 lines * 4 hours/table = 20 weeks

How NDSU Is Addressing These Data Management Needs? Department identified bioinformaticists as a priority for the 2013 SBARE & Legislative process. Hiring bioinformaticists identified as the top priority for the AES in the 2015 SBARE & Legislative process. Two Breeding Pipeline Data Managers hired in summer 2016. https://www.ag.ndsu.edu/plantsciences/news/large-data-managers-join-plant-sciences

Pipeline Database Management Team Data Compliance Standardization Data Analysis Efficiency QAQC Technology transference Data Visualizations Breeding Data Management Bioinformatic s Statistics Scripting language IT Security Automation Data migration Statistics

Requirements for Efficient Data Management Single platform for managing phenotype data genotype data Single server for housing platforms and data. Automating routine processes (queries and calculation of means or BLUPs)

Why use a relational database? Flat File Database Relational Database Advantages Easy to use, quick start No extra installation required Analytic & graphing tools built-in High level of security (authentication process) Powerful data organization and storage Data centralized, standardized & easily shared Changes universally applied to all tables Data easily queried, compiled, and analyzed Intellectual property protection & compliance Disadvantages Multiple files difficult to compile & compare Only understood by primary user in most cases Prone to corruption More difficult to secure Complex set-up Changes may be difficult Database needs to be well planned Requires experienced management Data must be applied individually to each table

Managing Phenotype Data Photo courtesy: Dr. Andrew J Green

Breeding system management Randomizations Data collection on tablets Analyses of basic experimental designs Stores data and means in a relational database

Data Management Work Plan at NDSU Educate and support breeders on use of technology platforms (e.g. Agrobase, T.3, JMP, and SAS) Integrate legacy data from breeding programs into DB. Automate data queries and calculations of: Unbalanced means or BLUPS across years based on Entry Pedigree Parents Balanced means across years Breeder requests Expand visualization capabilities for making decisions

Managing Genotype Data File-DNA-Sequencers_from_Flickr_57080968.jpg https://thumbs.dreamstime.com/z/genetic-fingerprinting-as-fingerprint-dnaemerging-out-as-medical-identification-symbol-paternity-test-64869820.jpg https://www.extremetech.com/wp-content/uploads/2013/03/minion-usb-dna-sequencer.jpg

How do we Manage Genotype Data? 525 parents screened with 50,000 SNPs 28.25 million data points 3,500 lines screened with 450 SNPs 1.575 million data points How do you pair genotype data with phenotype data for GWAS or genomic selection?

Year Generatio n NDSU Breeding Scheme Location Process Number of plants or lines 1 Crossing Greenhouse (fall) Crossing block 200 crosses 1 1 F 1 Greenhouse (winter) F 1 increase 5,000 plants 1 1 F 2 North Dakota (summer) F 2 selection 400,000 plants 1 Number of locations 2 F 3 New Zealand or Puerto Rico F 3 advancement 85,000 plants 2 2 F 4 North Dakota Progeny rows 25,000 lines 1 3 F 5 New Zealand or Arizona Seed increase 5,000 lines 2 3 F 6 North Dakota Preliminary yield trials 1,200 lines 2 4 F 7 North Dakota Intermediate yield trials 160 lines 7 5 F 8 North Dakota Advanced yield trial 40 lines 7 6 F 9 North Dakota Variety yield trial (VYT) 10 lines 7 7 F 10 North Dakota & region VYT and AMBA Pilot Scale 8 lines 14 8 F 11 North Dakota & region VYT and AMBA Pilot Scale 4-8 lines 14 9 F 12 North Dakota & region VYT and AMBA Plant Scale 2 lines 20 10 F 13 North Dakota & region VYT and AMBA Plant Scale 2 lines 20

Favorable Phenotypes in the Arizona Winter Nursery

Weak Strawed 2-rowed Phenotypes in the Arizona Winter Nursery

https://triticeaetoolbox.org/

https://triticeaetoolbox.org/barley/

Having Data in Databases Allows for Optimization of: Markers to include for genomic selection. Lines to use in training population. Statistical models to use for genomic selection of specific traits. Predictions of progeny performance from planned crosses.

Next Steps for NDSU Data Management Team Determine needs for hosting T3 locally and updating system to handle more crops. Develop Addins in JMP for automating queries and analyses. Make applications available on tablet and mobile devices for NDSU breeding programs.

Any Questions? Yield Trials at Osnabrock, ND