NRSP10 Crop Databases. Data Resources Sustainability and Funding Workshop, Plant and Animal Genome Conference 2017, San Diego. Dorrie Main*, Sook Jung, Jim McFerson, Mike Kahn, Cameron Peace. Washington State University, dorrie@wsu.edu
What is NRSP10? National Research Support Project: research that is deemed national in scope and that fulfills an unmet national research need. Requires participation by scientists from a large majority of US Land Grant Universities Agricultural Experiment Stations. NRSP10 is one of only 10 projects approved in the last 100 years.
NRSP 10 Vision: Enable basic, translational and applied crop research by expanding existing online community databases for underserved crops (mostly specialty crops). Provide a comprehensive open-source, flexible, resource-efficient database solution (Tripal). Develop a model for long-term sustainability of community databases. Stakeholder driven and supported.
Integrated Data Facilitates Discovery! Basic Science: structure and evolution of genomes; gene function; genetic variability; mechanism underlying traits. Integrated Data & Tools: Diversity, Genomics, Breeding, Genetics, Germplasm. Translational Science: trait discovery; marker development; genetic mapping; breeding values. Applied Science: utilization of DNA information in breeding decisions.
Community Databases Increasingly Important. Recent advances in sequencing, genotyping, and phenotyping technologies have led to a paradigm shift in crop science research: Big Data driven. Individual scientists now routinely: sequence and genotype genomes from populations, families, and individuals of interest; pursue large-scale gene expression studies; create highly saturated genetic maps; identify genome-wide loci influencing traits of interest; conduct large-scale standardized phenotyping.
The Challenge: To continue to provide high-quality online community database resources for Rosaceae, Citrus, Cotton, Cool Season Food Legumes and Vaccinium crops: databases that are easy to use; easy to manage and update; scalable and fast; resource efficient; utilize a common, interoperable platform; self-sustaining in the medium term; easy to adopt for other underserved crops.
Database Solution: Content Management System; Drupal modules as web front-end for Chado; Chado, a generic database schema
NRSP 10 Specific Objectives
1. Expand online community databases currently housing high-quality genomic, genetic and breeding data for Rosaceae, citrus, cotton, cool season food legumes and Vaccinium crops
2. Develop/implement a tablet application to collect phenotypic data from field and laboratory studies: improve efficiency of data collection and ease of upload to breeding programs
3. Develop a Tripal Application Programming Interface for building breeding databases: connect private breeding databases with public genomics, genetics and breeding data
4. Convert GenSAS, a comprehensive online community genome annotation and curation tool, to Tripal
5. Develop Web Services to promote database interoperability
NRSP10 Crops are Economically Important: $3.4 B; $12.3 B; $6.0 B; $1.2 B; $0.4 B. NRSP10 Databases 2016: > 50,000 users from 160 countries, 500K+ pages accessed
Support
Highly engaged and supportive user communities
Continuous funding from 2003 to 2019 (>$14M)
o USDA SCRI ($4.6M) 2009-2019, plus renewal potential
o USDA NRSP10 ($2M) 2015-2019, plus renewal potential
o NSF PGRP ($4.6M) 2003-2008, 2016-2009
o NSF DIBBS ($1.5M) 2015-2017
o Industry ($1.75M) 2003-2020
o Land Grant Universities (salaries)
o Community of developers working on open-source Tripal Database modules
Use of an open-source, easy-to-manage generic database platform helps reduce costs and eases concern among funding sources about database transferability and sustainability (to an extent). Significantly leverages co-development by Drupal and Tripal communities.
NRSP 10 Website www.nrsp10.org
www.rosaceae.org
cottongen.org
coolseasonfoodlegume.org
citrusgenomedb.org
vaccinium.org
Data: Host publicly available genomic, genetic and breeding data for 25 crops. Curate genetics and breeding data extracted from publications: use undergraduates with postdoc oversight of the data. Analyze/add information to certain types of genomic data; do not typically curate genomic data but would like to do so.
Tripal.info: Use of an open-source, easy-to-manage system with good support in place
Breeding Information Management System: View information on existing crosses. Developing a comprehensive breeding information management system in Tripal that combines the power of a secure, breeding program-controlled system with seamless access to all the public genomics, genetics and breeding data available in the public side of the database
www.gensas.org
The Team
The dream!!! NRSP10 scientists without up-to-date, comprehensive databases; button-clicking, energized NRSP10 scientists using up-to-date databases to enable their research
Acknowledgements: Mainlab Bioinformatics Team. Project co-PIs/PIs: tfGDR (GDR and Citrus); Cacao Genome Database; Pine Genome Sequencing Project; Genome Database for Vaccinium; Cool Season Food Legume Database; CottonGen. Rosaceae, Citrus, Cacao, Blueberry, Legume, Cotton and Bioinformatics Communities. USDA NIFA SCRI, NSF Plant Genome Program, USDA-ARS, SAAEDS, Mars Inc, Washington Tree Fruit Research Commission, Cotton Incorporated, USA Dry Pea and Lentil Commission, Northern Pulse Growers. US Land Grant University researchers and extension agents
Thanks for your attention and support