ELIXIR: data for molecular biology and points of entry for marine scientists

Guy Cochrane, EMBL-EBI
EuroMarine 2018 General Assembly meeting, 17-18 January 2018
www.elixir-europe.org

Scales of molecular biology data
Image credits: Applied Biosystems; Oxford Nanopore Technologies; https://universe-review.ca/r11-16-dnasequencing.htm
Sources: Charles E. Cook et al. Nucl. Acids Res. 2016;44:D20-D26; http://omicsmaps.com

Data resources in the life sciences
- 1800 databases
- Diverse
- Plentiful
- Dispersed
Source: MY Galperin, GR Cochrane. Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2012. Nucleic Acids Research, 2011

ELIXIR Membership

ELIXIR: European infrastructure for biological information
Data infrastructure for Europe's life-science research:
- Data
- Interoperability
- Tools
- Compute
- Marine metagenomics
- Crop and forest plants
- Human data
- Rare diseases
- Training
@ELIXIREurope
www.elixir-europe.org

ELIXIR Services
- Data deposition: ENA, EGA, PDBe, Europe PMC, ...
- Data management: genome annotation, data management plans
- Added value data: UniProt, Ensembl, OrphaNet, ...
- Data interoperability: BioSharing, identifiers.org and OLS
- Compute: secure data transfer, cloud computing, AAI
- Bioinformatics tools: Bio.tools
- Industry: Innovation and SME programme, bespoke collaborations
- Training: TeSS, Data Carpentry, elearning

Core Data Resources: a data infrastructure
Selection criteria:
- Scientific focus and quality of science: curation level, benchmarking
- Community served by the resource: web statistics
- Quality of service: uptime, user support and training
- Legal and funding infrastructure: institutional support, use policy
- Impact and translational stories: foundational role
Initial set of Core Data Resources:
- ArrayExpress
- ChEBI
- ChEMBL
- EGA
- ENA
- Ensembl
- Ensembl Genomes
- Europe PMC
- Human Protein Atlas
- The IMEx Consortium (IntAct and MINT)
- InterPro
- PDBe
- PRIDE
- STRING db
- UniProt
Source: Durinx C, McEntyre J, Appel R et al. Identifying ELIXIR Core Data Resources. F1000Research 2016, 5(ELIXIR):2422

Core resource example: European Nucleotide Archive (ENA)
- Globally comprehensive scientific record and European node of INSDC
- A broad platform for the management, sharing, integration and dissemination of sequence data
- Established in the early 1980s, extended for new technologies and applications
- Connectivity with broader EMBL-EBI resources
- Sequence data foundation
- Sustained within EMBL-EBI under EMBL funding with additional support from EC, UK research councils, Wellcome Trust, etc.
- Substantial scale: 1 submission every 6 minutes, 1.3 petabase pairs across 1.5 million taxa, 2,000-5,000 active data providers, global consumer userbase
- Rich submission, discovery and retrieval software, tools and services
http://www.ebi.ac.uk/ena/
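
To put the "substantial scale" figures in perspective, the short Python sketch below converts the quoted submission rate into daily and yearly counts and computes the mean number of base pairs per taxon. It uses only the numbers on the slide; the variable and function names are illustrative, a 365-day year is assumed, and the per-taxon mean is a crude average that says nothing about how unevenly data are distributed across taxa.

```python
# Back-of-envelope arithmetic on the ENA scale figures quoted on the slide.
MINUTES_PER_SUBMISSION = 6   # slide: "1 submission every 6 minutes"
TOTAL_BASE_PAIRS = 1.3e15    # slide: "1.3 petabase pairs"
TAXA = 1.5e6                 # slide: "1.5 million taxa"


def submissions_per_day(minutes_per_submission: float) -> float:
    """Number of submissions in 24 hours at a constant rate."""
    return 24 * 60 / minutes_per_submission


def submissions_per_year(minutes_per_submission: float, days: int = 365) -> float:
    """Number of submissions in a year (365 days assumed)."""
    return submissions_per_day(minutes_per_submission) * days


per_day = submissions_per_day(MINUTES_PER_SUBMISSION)
per_year = submissions_per_year(MINUTES_PER_SUBMISSION)
mean_bp_per_taxon = TOTAL_BASE_PAIRS / TAXA

print(f"{per_day:.0f} submissions per day")
print(f"{per_year:.0f} submissions per year")
print(f"{mean_bp_per_taxon:.3g} base pairs per taxon on average")
```

At a constant rate, one submission every 6 minutes corresponds to 240 submissions per day.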

Use Case: Marine metagenomics
Mission: develop sustainable metagenomics infrastructure to enhance research and innovation
Actions:
- Develop standards for the marine domain
- Develop databases specific to marine metagenomics
- Implement tools and pipelines for metagenomics analysis
- Develop search engine for interrogation of marine metagenomics datasets
- Develop and deliver training for end users
https://www.elixir-europe.org/use-cases/marine-metagenomics
Nils P. Willassen, NO & Rob Finn, EBI

Tara Oceans expeditions: facts and figures
- 35,000 samples: Scientists on the expedition helped collect, pack and ship around 35,000 samples of plankton and water.
- 7012 datasets: One of the richest molecular sample collections in the public domain, freely available to everyone.
- 11,535 gigabytes: Size of the Tara datasets in the European Nucleotide Archive as of May 2015. This represents 12,581 gigabases, roughly equivalent to 135 fully sequenced human genomes.
- Unlimited: Potential to discover new things about life in the world's oceans.
- 40 million genes: Largest-ever DNA sequencing effort for ocean science. Genetic sequences collected could represent tens of thousands of new species and ecosystem interactions. Considering the size of the world's oceans, there is much, much more to discover.
Tara data: www.ebi.ac.uk/services/tara-oceans-data
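
The comparison with 135 human genomes can be sanity-checked with a few lines of Python. The sketch below divides the quoted 12,581 gigabases by 135 and compares the result with a haploid human genome size of about 3.1 gigabases. That genome size is an assumption introduced here, not a figure from the slide. On this reading, "fully sequenced" appears to mean sequenced at roughly 30-fold depth, which is an inference rather than something the slide states.

```python
# Sanity check of the Tara Oceans figures quoted on the slide.
TARA_GIGABYTES = 11_535         # slide: size in ENA as of May 2015
TARA_GIGABASES = 12_581         # slide: sequence volume
HUMAN_GENOME_EQUIVALENTS = 135  # slide: "135 fully sequenced human genomes"

# Assumption (not from the slide): haploid human genome of ~3.1 gigabases.
HAPLOID_HUMAN_GENOME_GIGABASES = 3.1

# Gigabases attributed to each "fully sequenced" genome in the comparison.
gigabases_per_genome = TARA_GIGABASES / HUMAN_GENOME_EQUIVALENTS

# Sequencing depth implied if each genome is ~3.1 gigabases long.
implied_depth = gigabases_per_genome / HAPLOID_HUMAN_GENOME_GIGABASES

# Average storage cost per base in the archive.
bytes_per_base = TARA_GIGABYTES / TARA_GIGABASES

print(f"{gigabases_per_genome:.1f} gigabases per genome equivalent")
print(f"about {implied_depth:.0f}x implied sequencing depth")
print(f"{bytes_per_base:.2f} bytes stored per base")
```

The ratio of under one byte per base is consistent with the data being stored in compressed form, although the slide does not say how the 11,535 gigabytes were measured.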

EMBRIC: European Marine Biological Research Infrastructure Cluster to promote the Blue Bioeconomy
Objectives:
- Coherent chains of high quality services across RIs
- Strengthen connection between science and industry
- Defragment communities
TNA: 15 February 2018 opening of the 2nd call (http://www.embric.eu/access/ta)
Consortium:
- Led by UPMC (Bernard Kloareg)
- 27 partners
- INFRADEV-4 RI cluster call

Marine data consultancy: the EMBRIC Configurator
CCMAR, MBA, SZN
Do you have marine data, metadata and/or samples with a biomolecular focus? Do you need help with management, analysis and sharing of your data?
The Configurator provides free data management consultancy to assist marine scientists with their (data) needs around biotechnology and biomolecular work.
The Configurator offers help on:
- data management
- data QC & curation
- standards & compliance
- data analysis
- cloud compute
- sustainability
- and much more!
Process diagram labels: Online Configurator Form (Omics, Samples, Standards, Other); Case officer and configurator team; External case consultant advice; Recommendations and tools; Call presentation and final case report; Guidance on data management
Status:
- 22 registered consultants
- 2 ongoing cases
- 4 complete cases
Call for:
- More consultants
- More cases
Do you have doubts about whether the Configurator consultancy suits your needs? Get in touch and we will get back to you: annalisa@ebi.ac.uk
For more information visit: www.embric.eu/configurator
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 654008

EMBRIC Configurator examples
- Sustainability: Long-term operation of domain-specific data resource on fish species
- Data dissemination: Pseudo-nitzschia multistriata genomics, transcriptomics and gene expression
- Data access: Small molecule activities with respect to strains and genes

Acknowledgements
- ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559. www.elixir-europe.org
- EMBRIC has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 654008. www.embric.eu/ and www.embric.eu/configurator
- www.ebi.ac.uk/ena/
- www.ebi.ac.uk/services/tara-oceans-data

Talk to us
- Annalisa Milano, Data Coordination Biocurator
- Jeena Rajan, Sequence Data Biocurator
- Guy Cochrane, Team Leader