Evolution of The Avena Toolbox The Triticeae Toolbox for Oat

1 Evolution of The Avena Toolbox
The Triticeae Toolbox for Oat
David Matthews, Gerard Lazo*, Clay Birkett, Jean-Luc Jannink
USDA-ARS, Dept. Plant Breeding and Genetics, Cornell University, Ithaca NY
*USDA-ARS, Western Regional Research Center, Albany CA
28 Feb 2014

2 Phylogeny of T3 Databases
[Diagram. Year markers: 2006, 2010, 2014. Labels: Iowa State University; GrainGenes, USDA; Barley CAP; The Hordeum Toolbox; Triticeae CAP; The Triticeae Toolbox; CORE; The Avena Toolbox; USWBSI; The Breeders' Datafarm; Instant Oatmeal; My Crop Datafarm]

3 T3 Tools and Materials
Materials
- Phenotypes: Lines, Traits, Trials
- Genotypes: Lines, Markers
Functions
- Select subsets, combine them
- Download, e.g. TASSEL
- Analyze online, explore
- Genetic similarity: clustering, PCA
- Genome-wide association studies, GWAS
- Genomic selection

4 Data Content (T3 Wheat)
                     Phenotype   Genotype
Traits or Markers                K
Total data points    300K        80M
Phenotypes are 3-dimensional, genotypes 2-dimensional
Sparse data matrices: a full 8K x 100K matrix would hold 800M points, not 80M
Total data size: 2.3 Gbytes as mysqldump
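The sparse-matrix arithmetic on this slide can be checked in a few lines. This is a minimal sketch that uses only the figures quoted above; the variable names are illustrative, and reading "8K x 100K" as rows by columns of the genotype matrix is an assumption.

```python
# Figures quoted on the slide; names are illustrative, not T3 identifiers.
rows = 8_000                 # "8K"
cols = 100_000               # "100K"
stored_points = 80_000_000   # "80M" genotype data points

full_size = rows * cols                     # cells in a fully dense matrix
fill_fraction = stored_points / full_size   # share of cells actually filled

print(f"dense cells: {full_size:,}")        # dense cells: 800,000,000
print(f"filled: {fill_fraction:.0%}")       # filled: 10%
```

With roughly one cell in ten filled, storing one record per observed data point takes far less space than a dense grid would.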

5 Analysis and Download Tools

6 Selection Tools

7 What's a Phenotype?
Phenotype = f(Trait, Line, Trial)
- Yield of MORTGAGE LIFTER in a trial in Butte MT in 2003 was 5000 kg/ha.
Property = f(Trait, Line)
- Eye color of Dave is blue.
- Market class of CAYUGA is soft white winter.
- Stem rust genes of PAVON are Sr17, Sr30.
aka "trait"
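The distinction between a phenotype and a property comes down to what the lookup key contains. The sketch below stores the slide's own examples in two dictionaries. The dictionary layout, the trait labels and the helper names are assumptions made for illustration, not the T3 schema.

```python
from typing import Dict, Tuple

# Phenotype = f(Trait, Line, Trial): the key needs all three parts.
phenotypes: Dict[Tuple[str, str, str], float] = {
    ("yield (kg/ha)", "MORTGAGE LIFTER", "Butte MT 2003"): 5000.0,
}

# Property = f(Trait, Line): no trial in the key, because the value
# does not depend on where or when the line was grown.
properties: Dict[Tuple[str, str], str] = {
    ("market class", "CAYUGA"): "soft white winter",
    ("stem rust genes", "PAVON"): "Sr17, Sr30",
}

def get_phenotype(trait: str, line: str, trial: str) -> float:
    """Look up a trial-specific measurement (hypothetical helper)."""
    return phenotypes[(trait, line, trial)]

def get_property(trait: str, line: str) -> str:
    """Look up a trial-independent attribute (hypothetical helper)."""
    return properties[(trait, line)]

print(get_phenotype("yield (kg/ha)", "MORTGAGE LIFTER", "Butte MT 2003"))
print(get_property("market class", "CAYUGA"))
```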

8 Query by Phenotype

9 What's a Property?
- name
- origin (breeder, year)
- environment-independent trait
- allele of a gene: Rht2, Lr34...

10 Query by Properties
- Passport data (name, origin)
- Genetic Characters (environment-independent traits, alleles of genes)

11 Genotype Visualization
1427 winter wheat lines from the National Small Grains Collection, clustered and plotted by genotype similarity
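Clustering by genotype similarity starts from a pairwise similarity score between lines. The slide does not say which measure T3 uses, so the sketch below substitutes a simple matching coefficient: the fraction of markers, among those scored in both lines, that carry the same allele. The allele coding ("A", "B", None for missing) and the line names are hypothetical.

```python
from typing import Optional, Sequence

Allele = Optional[str]  # None marks a missing marker call

def matching_similarity(a: Sequence[Allele], b: Sequence[Allele]) -> float:
    """Fraction of shared, non-missing markers with identical alleles."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no markers scored in both lines")
    matches = sum(1 for x, y in shared if x == y)
    return matches / len(shared)

# Hypothetical marker calls for three lines at five markers.
genotypes = {
    "line_1": ["A", "A", "B", None, "B"],
    "line_2": ["A", "B", "B", "A", "B"],
    "line_3": ["B", "B", "A", "A", None],
}

sim_12 = matching_similarity(genotypes["line_1"], genotypes["line_2"])
sim_13 = matching_similarity(genotypes["line_1"], genotypes["line_3"])
sim_23 = matching_similarity(genotypes["line_2"], genotypes["line_3"])
print(sim_12, sim_13, sim_23)  # 0.75 0.0 0.5
```

A matrix of such scores (or of 1 minus the score, as a distance) is the usual input to hierarchical clustering or to an ordination method such as PCA.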

12 Calculation of "Selection Index"
- Arithmetic, weighted sum of trait values
- Scaling to deal with unequal trait scales
- For some traits less is better
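The three points above can be combined into one small function. This is a sketch under stated assumptions, not T3's implementation: scaling is done here by z-scores across the selected lines, "less is better" traits are handled by flipping the sign, and the line names, trait names and weights are made up for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, FrozenSet

def selection_index(
    values: Dict[str, Dict[str, float]],   # {line: {trait: value}}
    weights: Dict[str, float],             # {trait: weight}
    lower_is_better: FrozenSet[str] = frozenset(),
) -> Dict[str, float]:
    """Weighted sum of z-scored trait values, one index per line."""
    index = {line: 0.0 for line in values}
    for trait, weight in weights.items():
        column = [values[line][trait] for line in values]
        mu, sd = mean(column), pstdev(column)
        # Flip the sign so that a low raw value raises the index.
        sign = -1.0 if trait in lower_is_better else 1.0
        for line in values:
            z = 0.0 if sd == 0 else (values[line][trait] - mu) / sd
            index[line] += weight * sign * z
    return index

# Hypothetical trial data: yield in kg/ha, lodging on a 1-to-9 score.
data = {
    "line_A": {"yield": 5000.0, "lodging": 2.0},
    "line_B": {"yield": 4000.0, "lodging": 1.0},
    "line_C": {"yield": 4500.0, "lodging": 3.0},
}
idx = selection_index(
    data,
    weights={"yield": 1.0, "lodging": 0.5},
    lower_is_better=frozenset({"lodging"}),
)
best = max(idx, key=idx.get)
print(best, idx)
```

Without the scaling step, yield (thousands of kg/ha) would swamp lodging (single digits) regardless of the weights chosen.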

13 Loading Data
- Spreadsheet format
- 2-dimensional for phenotypes and genotypes
- Template for each data type
- Tutorials
Examples:
- American Oat Workers Conference trial
- Uniform Early/Midseason Oat Performance Nurseries, Quaker Uniform Oat Nurseries, , 6000 trials

14 Curation Tools

15 Format for Phenotype Results

16 Sandbox Databases for Testing

17 To Do
- Load results from multiple trials in one file, e.g. trials from UOPN / QUON
- Integrate POOL's pedigree display instead of T3's...

18 Phylogeny of T3 Databases
[Same diagram as slide 2]

19 T3 Pedigree Display

20 Pool Pedigree Display

21 T3 Pedigree + Haplotype

22 T3 Internals
Powering the tools
- MySQL, R, PHP, Apache, Postfix, git
- GPL licensed, on GitHub
- > 10 running instances, 3.5 in production
- Developed over six years
- Current staff: one programmer, one curator, 1/3 system administrator
- Server: 64 cpus, 500 Gbytes memory, 1.5 Tbytes disk, some of it SSD

23 Vision
A turnkey web database saved as an image that users can manage with no command-line skills
Customers
- Public breeders with industry collaborations
- Industry breeders
- Any breeder for internal projects only
- Breeders of crops other than wheat, barley, oats
- In-house copy for when the US government shuts down

24 My Crop Datafarm
A packaged web phenotype/genotype database you can own
David E. Matthews, Clay Birkett, Jean-Luc Jannink
USDA-ARS, Dept. Plant Breeding and Genetics, Cornell University, Ithaca NY
iplant, 12 Feb 2014

25 iplant Image
- MySQL, R, PHP, Apache, Postfix, git
- Free
- Server: 16 cpus, 128 Gbytes memory, 1.2 Tbytes disk
- Or several smaller servers
- No knowledge of Unix, MySQL, PHP, R needed for setup, operation, data input or use
- Command-line via SSH with full root privileges
- Some drawbacks

26 iplant Image

27 Thanks!
- Jean-Luc Jannink, Director
- Victoria C. Blake, Curator
- Clay Birkett, Programmer
- Dave Hane, System Administrator
- Peter Bradbury, Advisor
- TCAP students!

28 Triticeae Toolbox, Instant Oatmeal, and

29 Fussy stuff
- All good: Lines, Trials, Markers
- Bothersome plethoration of Traits