Big Data in Agriculture: Challenges
Pascal Neveu, INRA Montpellier

The rise of Big Data in agriculture
- More data produced from heterogeneous sources

The rise of Big Data in agriculture
- More and more data services and datasets on the Web:
  breeding data, crop data, weather data, soil data, environmental data, genomic data, economic data, health data, etc.

The rise of Big Data in agriculture: ontology
- Common conceptualisation: to link data, we need to define concepts and the relations between concepts
- Linked (Open) Data
- Diagram labels: Cultivated Land, Plant, Plot; relations "subclass of", "contains", "is contained"
- Reference thesaurus/ontology: Agrovoc (FAO)
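As a rough illustration of what linking data through shared concepts can look like, the sketch below stores subject-predicate-object triples in plain Python. Which concept is a subclass of which is not recoverable from the slide, so the relations chosen here (Plot as a subclass of Cultivated Land, Plot containing Plant) and the identifiers plot_42 and plant_7 are assumptions made only for the example; it does not use Agrovoc or any real ontology API.

```python
# Hypothetical triples: the class relations and the instance names are
# illustrative assumptions, not taken from Agrovoc or from the slide.
TRIPLES = [
    ("Plot", "subclass_of", "CultivatedLand"),
    ("Plot", "contains", "Plant"),
    ("plot_42", "instance_of", "Plot"),
    ("plant_7", "instance_of", "Plant"),
    ("plot_42", "contains", "plant_7"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object o such that (subject, predicate, o) is a triple."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj, triples=TRIPLES):
    """Return every subject s such that (s, predicate, obj) is a triple."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def classes_of(entity, triples=TRIPLES):
    """Classes of an entity, following subclass_of links transitively."""
    found = []
    todo = objects(entity, "instance_of", triples)
    while todo:
        cls = todo.pop()
        if cls not in found:
            found.append(cls)
            todo.extend(objects(cls, "subclass_of", triples))
    return found


def contained_in(entity, triples=TRIPLES):
    """'is contained' read as the inverse of 'contains'."""
    return subjects("contains", entity, triples)


print(classes_of("plot_42"))
print(contained_in("plant_7"))
```

Because both datasets would refer to the same concept names, a query such as classes_of can combine facts that were recorded separately, which is the point of a common conceptualisation.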

The rise of Big Data in agriculture
[Figure slides; the only recoverable label is "KNOWLEDGE"]

Big Data
- A definition: data sets that grow so large and complex that they become impossible to process with traditional data processing methods (management and analysis)
- Make data valuable (information, knowledge, decision support)

Agronomic Big Data: "V" characteristics
- Volume: massive data of growing size; hard to store, manage and analyze
- Variety and complexity: different sources, scales and disciplines; different semantics, schemas, formats, etc.; hard to understand, combine and integrate
- Velocity: speed of data generation; data have to be processed online
- Veracity, Validity, Vocabulary, Vulnerability, Volatility, Visibility, Visualisation, Vagueness, etc.

Why is Big Data important for agriculture?
- A lot of heterogeneous data is produced to support understanding
- Opens up new insights
- Lets us know:
  - which theories are consistent and which ones are not
  - when data do not quite match what we expect
- Discover patterns and frequent associations
- "Big Data" vs "sample and survey"
- Data-driven decision support (dynamic, integrative and predictive approaches)

Why is Big Data important for agriculture?
- Population treatment: some respond and some don't
- Individualized treatment: all of the targeted population responds

Why is Big Data important for agriculture?
Gather -> Organize -> Analyse -> Understand -> Decide
- Adaptation to climate change
- Taking human health into account (farmer, consumer)
- More efficient use of natural resources (including water or soil) in our farming practices
- Sustainable management, biodiversity and equity
- Food security
- Crop performance (yields are globally decreasing)
- ...

Phenome: French infrastructure for high-throughput plant phenotyping
- 9 multi-species platforms:
  - 2 greenhouse platforms
  - 5 field platforms
  - 2 omics platforms

5 field platforms
- Various scales and data types
- Scales: cell, organ, plant, population; time
- Data: images, hyperspectral, spectral, sensors, human readings...
- Thousands of micro-plots

2 greenhouse platforms
- Various scales and data types
[Figure: plant biomass (g) plotted against time (d), 0 to 100 days]

2 "omics" platforms
- Various complex data types
- Composition and structure
- Quantification of metabolites and enzyme activities
- Steps: grinding, weighing (-80 °C), extraction, fractionation, pipetting, incubation, reading

Data management challenges in Phenome: Volume
- Growth: 40 Tbytes in 2013, 100 Tbytes in 2014, ...
- Volume is a relative concept
- Exponential growth makes storage, management and analysis hard
- Phenome approaches:
  - HPC and storage
  - Cloud and grid computing (France Grilles, EGI)
  - On-demand infrastructure and elasticity
  - Virtualization technologies
  - Data-based parallelism (same operation on different data)
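The two volume figures and the idea of data-based parallelism can be made concrete with a small sketch. The growth factor is computed from the 40 and 100 Tbyte figures above; the projection simply assumes that factor stays constant, which is an extrapolation for illustration and not a claim from the slides. The per-image operation (mean pixel value) and the use of a thread pool are also assumptions, standing in for whatever operation and infrastructure a platform actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def growth_factor(v_start, v_end, years=1):
    """Average yearly multiplication factor between two volumes."""
    return (v_end / v_start) ** (1 / years)


def project(v_last, factor, years_ahead):
    """Volume after years_ahead years IF the factor stayed constant (assumption)."""
    return v_last * factor ** years_ahead


def mean_intensity(image):
    """The 'same operation': mean pixel value of one image (a list of rows)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def process_all(images, workers=4):
    """Data-based parallelism: apply the same function to different data items."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(mean_intensity, images))


factor = growth_factor(40, 100)      # 40 Tbytes (2013) -> 100 Tbytes (2014)
print(factor, project(100, factor, 2))

toy_images = [[[0, 2], [4, 6]], [[1, 1], [1, 1]]]   # two tiny made-up images
print(process_all(toy_images))
```

Each image is processed independently of the others, so the same pattern scales out by adding workers or machines, which is why it suits grid and cloud infrastructures.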

Data management challenges in Phenome: Variety
- Data are produced by different communities (geneticists, ecophysiologists, farmers, breeders, etc.)
- Data integration needs extensive connections and associations to other types of data (environments, individuals, populations, etc.)
- Different semantics and data schemas; extremely diverse data
- Approaches: Web APIs, ontology sets, NoSQL and Semantic Web methods
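One small part of the variety problem, records that describe the same thing under different field names and units, can be sketched as follows. The source names, field names and the millimetre-to-centimetre conversion are all hypothetical; a real integration would rely on the ontologies and Web APIs mentioned above rather than on a hand-written mapping table.

```python
# Hypothetical mappings from source-specific field names to a shared vocabulary.
FIELD_MAPS = {
    "field_platform": {"plot": "unit_id", "geno": "genotype", "ht_cm": "plant_height_cm"},
    "greenhouse": {"pot": "unit_id", "variety": "genotype", "height_mm": "plant_height_mm"},
}


def harmonise(record, source):
    """Rename fields to the shared vocabulary and normalise height to cm."""
    mapping = FIELD_MAPS[source]
    out = {mapping.get(key, key): value for key, value in record.items()}
    if "plant_height_mm" in out:
        # Unit conversion so that both sources report centimetres.
        out["plant_height_cm"] = out.pop("plant_height_mm") / 10
    out["source"] = source  # keep track of where the record came from
    return out


field_rec = {"plot": "F12", "geno": "B73", "ht_cm": 118.0}
greenhouse_rec = {"pot": "P1", "variety": "B73", "height_mm": 1250}

for rec, src in [(field_rec, "field_platform"), (greenhouse_rec, "greenhouse")]:
    print(harmonise(rec, src))
```

After harmonisation the two records share the same keys and units, so they can be compared or joined on genotype.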

Data management in Phenome: Velocity
- Greenhouse platforms produce tens of thousands of images/day (200 days/year)
- Field platforms produce tens of thousands of images/day (100 days/year)
- Omics platforms produce tens of Gbytes/day (300 days/year)
- Approaches: scientific workflows
  - OpenAlea / provenance module (Virtual Plants INRIA team)
  - Scifloware (Zenith INRIA team)
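The daily rates above translate into yearly totals with simple arithmetic. The sketch below takes the lowest value compatible with "tens of thousands" (10,000 images/day) and "tens of Gbytes" (10 Gbytes/day); these lower bounds are assumptions, so the results are order-of-magnitude floors rather than measured figures.

```python
# Lower-bound daily rates (assumed) and operating days per year (from the slide).
PLATFORMS = {
    "greenhouse": {"per_day": 10_000, "days": 200, "unit": "images"},
    "field": {"per_day": 10_000, "days": 100, "unit": "images"},
    "omics": {"per_day": 10, "days": 300, "unit": "Gbytes"},
}


def yearly_totals(platforms=PLATFORMS):
    """Yearly production per platform type: daily rate times operating days."""
    return {name: p["per_day"] * p["days"] for name, p in platforms.items()}


for name, total in yearly_totals().items():
    print(f"{name}: at least {total:,} {PLATFORMS[name]['unit']}/year")
```

Even with these floors, the greenhouse and field platforms together yield millions of images per year, which is why automated workflows are needed.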

Data management challenges in Phenome: Validity
- Data cleaning: automatically diagnose and manage
  - Consistency?
  - Duplicates?
  - Wrong values?
  - Annotation consistency?
  - Outliers?
  - Disguised missing data?
  - ...
- Approaches (dynamic):
  - Unsupervised curve clustering (Zenith INRIA team)
  - Curve fitting over dynamic constraints
  - Image clustering
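Three of the checks listed above (duplicates, disguised missing data, outliers) can be sketched with the standard library alone. This is not the curve clustering or constrained curve fitting named on the slide; it swaps in much simpler techniques (exact-match duplicate detection, a list of sentinel codes, and a median/MAD score) purely to show what automatic diagnosis means. The sentinel codes and the 3.5 threshold are assumptions.

```python
from statistics import median

# Hypothetical placeholder codes that sometimes stand in for "no measurement".
SENTINELS = {-999, -9999, 9999}


def find_duplicates(records):
    """Indices of records that exactly repeat an earlier record."""
    seen, duplicates = set(), []
    for index, record in enumerate(records):
        key = tuple(sorted(record.items()))
        if key in seen:
            duplicates.append(index)
        seen.add(key)
    return duplicates


def disguised_missing(values, sentinels=SENTINELS):
    """Indices of values that look like data but are placeholder codes."""
    return [i for i, v in enumerate(values) if v in sentinels]


def mad_outliers(values, threshold=3.5):
    """Indices whose modified z-score (median/MAD based) exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread: the score is undefined, flag nothing
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


heights = [10.1, 10.4, 9.8, 10.0, 55.0, 10.2]   # made-up measurements
print(mad_outliers(heights))
print(disguised_missing([12.0, -999, 11.5]))
print(find_duplicates([{"plot": 1, "h": 2.0}, {"plot": 2, "h": 3.0}, {"plot": 1, "h": 2.0}]))
```

Checks like these treat each value in isolation; the dynamic approaches on the slide go further by using the shape of a whole growth curve to decide what is plausible.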

Conclusion
- Make agricultural data Findable, Accessible, Interoperable and Reusable (across disciplines)
- Discover knowledge about the world described by the data (data mining, integrative methods, etc.)
- E-infrastructure
- Big Data is both a cultural and a technical challenge