Towards standard, accessible and reproducible Metabolomics


1 Towards standard, accessible and reproducible Metabolomics. Reza Salek, PhD, Metabolism and Molecular Informatics, The European Bioinformatics Institute (EMBL-EBI). The 1st International Electronic Conference on Metabolomics

2 EBI Databases and services. Genomes: Ensembl, Ensembl Genomes, EGA; Nucleotide sequence: ENA; Protein activity: IntAct, PRIDE; Functional genomics: ArrayExpress, Expression Atlas; Protein sequences: UniProt; Literature and ontologies: PubMC, GO; Protein families, motifs and domains: InterPro; Macromolecular: PDBe; Pathways: Reactome; Cheminformatics & Metabolism: MetaboLights, ChEBI; Chemogenomics: ChEMBL; Systems: BioModels, BioSamples

3 Is data growth FAIR?

4 Metabolomics Standards Initiative (WG). Lives at [ ]. 5 Workgroups: Biological context metadata WG; Chemical analysis WG; Data processing WG; Ontology WG; Exchange format WG. Roy Goodacre, Metabolomics (2014) 10:5-7

5 Data sharing repositories

6 OmicsDI: collection of omics (TX, PX, EGA, MX)

7 Leading to data discovery

8 OmicsDI

9 Capturing Metadata: ISA-Tab format. Developed a user-friendly way to capture standards-compliant metadata

10 ISAcreator Using Ontologies

11 Data Standards: What is XML? XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to carry data, not to display data. XML is designed to be self-descriptive. NMR analysis: All spectra were recorded on a <Varian NMR Instrument>Varian VNMRS 600 NMR Spectrometer</Varian NMR Instrument> operating at a proton NMR frequency of <Irradiation frequency> <Megahertz>MHz</Megahertz></Irradiation frequency> using a <cryoprobe>5 mm inverse detection cryoprobe</cryoprobe>. <acquisition nucleus>1H</acquisition nucleus> NMR spectra were recorded [ ].
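The tagged sentence on this slide illustrates the idea rather than valid syntax: well-formed XML does not allow spaces inside element names. As a minimal sketch, assuming hypothetical element names (`nmr_analysis`, `instrument`, `probe`, `acquisition_nucleus`) that are not taken from any published schema, the same information could be carried and read back like this. The irradiation frequency is left out because its value does not survive in the transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed rewrite of the slide's NMR example.
# Element names are illustrative assumptions, not a published schema.
NMR_XML = """\
<nmr_analysis>
  <instrument vendor="Varian">Varian VNMRS 600 NMR Spectrometer</instrument>
  <probe>5 mm inverse detection cryoprobe</probe>
  <acquisition_nucleus>1H</acquisition_nucleus>
</nmr_analysis>
"""


def extract_nmr_metadata(xml_text):
    """Return a dict mapping each child element name to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


metadata = extract_nmr_metadata(NMR_XML)
for name, value in metadata.items():
    print(f"{name}: {value}")
```

Because each value is wrapped in a named element, a program can pull out the instrument or the nucleus without parsing free text, which is the "self-descriptive" property the slide refers to.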

12 Generating ISA-Tab metadata files from metabolomics XML data
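The conversion named on this slide can be pictured as mapping XML elements onto tab-delimited columns. The sketch below is a simplified illustration, not the tooling shown in the talk: the input element names, the `sample` attribute, the `COLUMN_MAP` dictionary, and the function `xml_to_assay_table` are all hypothetical, and the output is only an ISA-Tab style table, not a complete or validated ISA-Tab assay file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input: element and attribute names are assumptions.
RUNS_XML = """\
<runs>
  <run sample="C5">
    <instrument>Varian VNMRS 600 NMR Spectrometer</instrument>
    <acquisition_nucleus>1H</acquisition_nucleus>
  </run>
  <run sample="S3">
    <instrument>Varian VNMRS 600 NMR Spectrometer</instrument>
    <acquisition_nucleus>1H</acquisition_nucleus>
  </run>
</runs>
"""

# Assumed mapping from XML element names to ISA-Tab style column headers.
COLUMN_MAP = {
    "instrument": "Parameter Value[Instrument]",
    "acquisition_nucleus": "Parameter Value[Acquisition nucleus]",
}


def xml_to_assay_table(xml_text):
    """Convert each <run> element into one tab-delimited row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample Name"] + list(COLUMN_MAP.values()))
    for run in root.findall("run"):
        row = [run.get("sample", "")]
        for tag in COLUMN_MAP:
            # A missing element becomes an empty cell rather than an error.
            row.append((run.findtext(tag) or "").strip())
        writer.writerow(row)
    return out.getvalue()


table = xml_to_assay_table(RUNS_XML)
print(table)
```

Keeping the element-to-column mapping in one dictionary means a new instrument parameter can be added without touching the conversion logic.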

13 MetaboLights Study Validation Status MetaboLights - an open-access general-purpose repository for metabolomics studies and associated meta-data. Nucl. Acids Res. (2012) [ doi: /nar/gks1004

14 MetaboLights Study Validation details

15 Tools: the way forward!

16 Current way and ideal

17 Complex analysis pipelines (figure: DIMS data-processing workflow). Samples are run as technical triplicates interleaved with QCs (e.g. QC C5 S3 S7 C1 C10 QC S1 C3 S5 C7 S6 QC ..). Stages and their outputs: DIMS data collection gives instrument .RAW files (IRF); averaged transients (AT); apodisation, zero-filling and FFT, then TIC filtering, give frequency spectra (FS); mass calibration (using a calibrant list) and SIM-stitching give stitched peak lists (SPL); replicate filtering gives replicate filtered peak lists (RFPL), including an RFPL for the blank; sample filtering and blank filtering give the sample filtered peak matrix (SFPM); missing-value filtering; PQN normalisation, optionally with batch correction and spectral cleaning (SFPM PQN, SFPM PQN + BATCH, SFPM PQN + BATCH + CLEAN); imputation of missing values using KNN; glog transformation (SFPM PQN + KNN + GLOG, SFPM PQN + BATCH + KNN + GLOG, SFPM PQN + BATCH + CLEAN + KNN + GLOG).
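To make one stage of this pipeline concrete, here is a minimal sketch of PQN (probabilistic quotient normalisation) applied to a small peak matrix. It is an illustration under stated assumptions, not the implementation used in the workflow on the slide: the reference spectrum is taken as the per-peak median across all samples, zero intensities are treated as missing and skipped when computing quotients, and the function name `pqn_normalise` and the toy data are hypothetical.

```python
from statistics import median


def pqn_normalise(matrix):
    """Scale each sample by its median quotient against a reference spectrum.

    matrix: list of samples, each a list of intensities for the same peaks.
    """
    n_peaks = len(matrix[0])
    # Assumed reference: per-peak median across all samples.
    reference = [median(sample[j] for sample in matrix) for j in range(n_peaks)]
    normalised = []
    for sample in matrix:
        # Quotient of each peak against the reference, skipping zeros
        # (treated here as missing values).
        quotients = [
            sample[j] / reference[j]
            for j in range(n_peaks)
            if sample[j] > 0 and reference[j] > 0
        ]
        factor = median(quotients)  # most probable dilution factor
        normalised.append([value / factor for value in sample])
    return normalised


# Toy peak matrix: three samples with the same profile at different dilutions.
peaks = [
    [10.0, 20.0, 30.0],
    [20.0, 40.0, 60.0],
    [5.0, 10.0, 15.0],
]
result = pqn_normalise(peaks)
for row in result:
    print(row)
```

In this toy case the three samples differ only by dilution, so after normalisation they coincide. Using the median quotient rather than the total intensity keeps a few strongly changing peaks from dominating the scaling factor.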

18 PhenoMeNal - Goal. Data Producer, Tool maker, Infrastructure provider; Data container, Packaged tool, Compute Infrastructure; PhenoMeNal VRE Portal

19 Key objectives: Understand the computational needs of the Metabolomics Community. Integrate and scale existing Open Source tools into a well-tested e-infrastructure.

20 Major revolution

21 Same in software: Developers, PIs, Cluster, Cloud, Collaborators

22 VRE Portal - Three usability rounds - 80% functionality running. - Public instance access. - App Library, hooked to EGI AppDB. - Documentation.

23 MetaboLights: The team. Kenneth Haug, Reza Salek, Kalai Jayaseelan, Mark Williams, Venkata Chandrasekhar, Jose Ramon Macias Gonzalez, Keeva Cochrane, Xuefei Li (MRC), Christoph Steinbeck, Jules Griffin (UC/MRC). Previous: Paula de Matos, Mark Rijnbeek, Tejasvi Mahendraker, Pablo Conesa

24 EBI PhenoMeNal: The team. Kenneth Haug, Reza Salek, Pablo Moreno, Sijin He, Christoph Steinbeck, Namrata Kale

25 COSMOS consortium

26 PhenoMeNal consortium

27 Funding and Collaborators