Agenda: Metabolomics Society September 18, 2012

1 Agenda: Applications of GC/Q-TOF for Metabolomics (Jennifer Gushue, Ph.D.); Data-Directed Multi-Omics of Biological Pathways (Theodore Sana, Ph.D.); Integrated Biology is a Software Challenge (Norton Kitagawa, Ph.D.). Metabolomics Society, September 18, 2012

2 Diversity

3 Diversity

4 Diversity: Agilent LC/MS Instrument Portfolio

5 Olive Oil Classification using the Agilent 7200 Series GC/Q-TOF System. UC Davis Olive Center; Stephan Baumann, Agilent Technologies

6 Olive Oil Demand is Growing. The United States market is expected to surpass $1.8 billion by 2013.* Increased interest in Mediterranean foods. Health benefits associated with olive oil: reduced risk of coronary heart disease; rich in antioxidants and anti-inflammatory compounds.** * Packaged Facts Market Report ** US FDA

7 Olive Oil Standards International Olive Council and USDA Standards: Chemical tests Tasting panel sensory test

8 Most Common Olive Oil Defects. Rancidity: a flavor in the olive oil usually accompanied by a greasy mouth feel. In sensory tests, this greasiness is often noticed first. Fusty: caused by fermentation in the absence of oxygen, which occurs within the olives before they are milled. A fusty smell has been compared to sweaty socks, swampy vegetation, or a too-wet compost heap. Winey-vinegary: caused by fermentation with oxygen, and can be reminiscent of vinegar or nail polish. Musty: caused by moldy olives; it tastes of dusty, musty old clothes, or the basement floor. Understanding of organoleptic smell and taste is limited.

9 More Demand than Supply. International Olive Council and USDA Standards: chemical tests; tasting panel sensory test. Sensory tests are expensive and subjective, and oils often fail the EVOO sensory test. Imported Extra Virgin Olive Oils account for 99% of the US supply.* * UC Davis Olive Center

10 Possible Solution: Chemical Screening. Develop a chemical screen to predict whether an olive oil will pass the sensory test. This allows producers to submit for sensory testing only those olive oils that have a high probability of passing; reduces certification costs; and increases the quality of the EVOO available in the marketplace.

11 Olive Oil Characterization 1. To create a model that could predict whether an olive oil sample would pass or fail the sensory test 2. To find statistically significant olive oil components that are present at distinct levels depending on whether the sample passed or failed the sensory test

12 7200 Series GC/Q-TOF. High-resolution full-acquisition spectra: accurate mass measurements; fast acquisition of full spectra. MS/MS mode: full spectrum of product ions with high resolution and accurate mass; high-sensitivity structural elucidation tool. Ideal tool for solving complex analytical problems.

13 New Removable Ion Source includes repeller, ion volume, extraction lens and dual filaments

14 Hot Quartz Monolithic Quadrupole analyzer identical to the 7000 Quadrupole MS/MS

15 Hexapole Collision Cell accelerates ions through the cell to enable faster generation of high-quality MS/MS spectra without cross-talk

16 Dual-Stage Ion Mirror improves second-order time focusing for high mass resolution (figure panels: pre and post dual-stage ion mirror). Proprietary INVAR flight tube, sealed in a vacuum-insulated shell, eliminates thermal mass drift due to temperature changes to maintain excellent mass accuracy, 24/7.

17 4 GHz ADC Electronics enable a high sampling rate (32 Gbit/s) that improves the resolution, mass accuracy, and sensitivity for low-abundance samples. Analog-to-digital (ADC) Detector: Unlike time-to-digital (TDC) detectors, which record single ion events, ADC detection records multiple ion events, allowing very accurate mass assignments over a wide mass range and dynamic range of concentrations. Dual gain amplifiers simultaneously process detector signals through both low-gain and high-gain channels, extending the dynamic range.

18 EXPERIMENTAL DESIGN

19 Experimental Design. Olive oil samples had been subjected to the sensory test and classified as passed or failed. Samples were analyzed on the 7200 GC/Q-TOF. Data were acquired in both EI and PCI modes. Workflow: Pass/Fail samples, EI/CI acquisition, MassHunter Qual and MPP. MassHunter Qual was used for deconvolution and library searches. Mass Profiler Professional (MPP) was used for statistical evaluation of the data, including construction of a class prediction model to predict whether a sample would pass or fail the sensory test.

20 MassHunter Qualitative Analysis: Deconvolution and Library Searches. Identification of Compounds using Library Search

21 Mass Profiler Professional: Data filtering, PCA, ANOVA, Volcano Plot. Mass Profiler Professional (MPP) was used for statistical evaluation of the data, including construction of a class prediction model to predict whether a sample would pass or fail the sensory test.

22 RESULTS

23 EVOOs that Pass Sensory Evaluation

24 EVOOs that Pass and Fail Sensory Evaluation This is why we need powerful data analysis software!

25 Olive Oil Characterization: Data Filtering 442 unique compounds were distinguished by chromatographic deconvolution, most of which occur only once or twice and are filtered out by MPP. The table shows how many of these 442 compounds were actually found in each sample.
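
The frequency filter on this slide can be illustrated with a short Python sketch. It is not MPP's implementation: the function name `filter_by_frequency`, the `min_samples` threshold, and the toy sample data are all assumptions made for illustration.

```python
from collections import Counter

def filter_by_frequency(samples, min_samples=2):
    """Keep only compounds detected in at least `min_samples` samples.

    `samples` maps a sample name to the set of compound IDs found in it.
    (Hypothetical data layout, not an MPP file format.)
    """
    counts = Counter(cpd for found in samples.values() for cpd in set(found))
    kept = {cpd for cpd, n in counts.items() if n >= min_samples}
    return {name: set(found) & kept for name, found in samples.items()}

# Hypothetical toy data: three samples, four deconvoluted compounds.
samples = {
    "oil_A": {"cpd1", "cpd2", "cpd3"},
    "oil_B": {"cpd1", "cpd2"},
    "oil_C": {"cpd1", "cpd4"},
}
filtered = filter_by_frequency(samples, min_samples=2)
```

Compounds seen in a single sample (`cpd3`, `cpd4` here) drop out, which mirrors the removal of compounds that "occur only once or twice".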

26 Principal Component Analysis is used to Visualize. PCA shows how the pass/fail data cluster. The samples that failed the sensory test are marked in red and the ones that passed are blue.

27 Fold Change Analysis. Compounds accumulated in the samples that failed the sensory test. The Volcano Plot (on the right) shows fold-change for each entity on the x-axis and significance on the y-axis.
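
As a rough illustration of how one point on a volcano plot is obtained, the sketch below computes a log2 fold change (x-axis) and a -log10 p-value (y-axis) for a single compound. The slides do not say which significance test MPP applied; an exact permutation test on the difference of group means is swapped in here because it needs only the standard library. The abundance values are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean

def log2_fold_change(failed, passed):
    """x-axis: log2 ratio of mean abundance, failed over passed."""
    return math.log2(mean(failed) / mean(passed))

def permutation_p_value(failed, passed):
    """Two-sided exact permutation test on the difference of group means.

    Stand-in for whatever test the real software uses (an assumption).
    """
    pooled = list(failed) + list(passed)
    n, m = len(failed), len(passed)
    observed = abs(mean(failed) - mean(passed))
    total = sum(pooled)
    hits = count = 0
    for idx in combinations(range(n + m), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n - (total - group_sum) / m)
        hits += diff >= observed - 1e-12  # tolerance for float rounding
        count += 1
    return hits / count

def volcano_point(failed, passed):
    """Return (log2 fold change, -log10 p) for one compound."""
    p = permutation_p_value(failed, passed)
    return log2_fold_change(failed, passed), -math.log10(p)

# Hypothetical abundances of one compound in four failed and four passed oils.
failed = [8.0, 9.0, 10.0, 9.0]
passed = [2.0, 3.0, 2.0, 2.0]
x, y = volcano_point(failed, passed)
```

Compounds that land far to the right and high up are the ones accumulated in failed samples with strong statistical support.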

28 Raw Data Verification It pays to go back to the raw data to visually inspect MPP (multivariate statistical) results. Here we see the raw data verification of the peak at minutes.

29 Building the Classification Model. Training the model with data: two data classes were established with the compounds (markers) increased in the failed samples. Create a model that predicts whether an olive oil sample will pass the sensory test.

30 Testing the Model

31 Testing the Model. All samples correctly predicted. The samples that were not used for building the prediction model are listed with the training parameter set as None.
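
The train-then-test procedure on the preceding slides can be sketched as follows. The slides do not name the class prediction algorithm used in MPP, so a nearest-centroid classifier is used here purely as a stand-in; the marker abundances and function names are hypothetical.

```python
from statistics import mean

def train_centroids(training):
    """training: list of (class_label, [marker abundances]).

    Returns the mean marker profile (centroid) of each class.
    """
    by_class = {}
    for label, features in training:
        by_class.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_class.items()
    }

def predict(centroids, features):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical training set: two marker compounds per sample.
training = [
    ("pass", [1.0, 0.5]),
    ("pass", [1.2, 0.4]),
    ("fail", [3.0, 2.5]),
    ("fail", [2.8, 2.9]),
]
centroids = train_centroids(training)

# Held-out samples (the ones whose training parameter would be None).
held_out = {"oil_X": [3.1, 2.6], "oil_Y": [0.9, 0.6]}
predictions = {name: predict(centroids, f) for name, f in held_out.items()}
```

Keeping the held-out samples out of `training` is what makes the "all samples correctly predicted" check meaningful.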

32 Library Searching. Figure panels: compound spectrum (accurate mass); compound spectrum compared with the NIST library EI spectrum. Commercial unit-mass EI spectral libraries can be searched using accurate-mass EI GC/Q-TOF data to identify compounds.

33 MS/MS Analysis. α-cubebene, full scan C 15 H (replib). α-cubebene MS/MS: precursor 204, CE 10 eV. Fragments: C 8 H ppm, C 9 H ppm, C 10 H ppm, C 12 H ppm. Accurate masses of ion fragments are consistent with the molecular formula.
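
The ppm figures on this slide express the difference between a measured and a theoretical ion mass. A minimal sketch of that arithmetic follows. The formula C15H24 is the textbook formula for a sesquiterpene hydrocarbon of nominal mass 204 and is supplied here as an assumption, not read from the slide; the measured m/z is hypothetical.

```python
# Monoisotopic masses in Da (standard values, rounded).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.99491462}
ELECTRON = 0.00054858

def ion_mass(formula_counts, charge=1):
    """m/z of a positive radical cation (EI): neutral mass minus electron(s)."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in formula_counts.items())
    return (neutral - charge * ELECTRON) / charge

def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

# Assumed formula for the m/z 204 precursor; hypothetical measurement.
theoretical = ion_mass({"C": 15, "H": 24})
measured = 204.1880
error = ppm_error(measured, theoretical)
```

A fragment formula is considered consistent with the data when its error stays within a few ppm, which is the check the slide reports for each fragment.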

34 Molecular Structure Correlator

35 Odor Profile of Compounds Increased in Failed Samples
Proposed NIST ID | Formula | CAS | Odor
n-hexadecanoic acid | C 16 H 32 O | | Faint, oily (1)
Octadecanoic acid, ethyl ester | C 20 H 40 O | | Waxy (2)
Squalene | C 30 H | | Floral (2)
α-cubebene | C 15 H | | Herbal (2)
Searching the flavor company catalogs by CAS number provided odor profile information on the up-regulated compounds. Sources: (1) Bedoukian Research; (2) The Good Scents Company

36 Summary: A model that predicts the classification of extra virgin olive oils was constructed using data from the 7200 GC/Q-TOF and the MassHunter software suite. High resolution provides the required selectivity. Mass accuracy is essential for defining compounds. MS/MS is critical for structural confirmation of unknowns. Comprehensive software is vital for turning MS data into pertinent and relevant results.

37 Data-Directed Multi-Omics of Biological Pathways Theodore Sana, PhD Senior Scientist Life Sciences Group, Agilent Technologies

38 Metabolomics and Integrated Biology Workflow 6550 QTOF Ion Funnel Technology: General HW overview Untargeted, unlabeled metabolite workflow Metabolite Identification New Software Tools: Personal Compound Databases & Library (PCDL) Pathway Metabolite Database Creator (PMDC) Molecular Structure Correlator (MSC) MS/MS Library Data driven multi-omics analysis for Pathway Mapping and Targeted Proteomics: GeneSpring Pathway Analysis Capabilities

39 6550 Q-TOF: NEW TECHNOLOGY IN ION SAMPLING IMPROVES SENSITIVITY Jet Stream Technology Hexabore Inlet Capillary Dual Ion Funnel Generating more ions Sampling more ions Focusing more ions

40 Drastically Improved Metabolite Coverage of Central Carbon Metabolome Map Metabolites Detected (Courtesy of Prof. Nicola Zamboni, ETH, Zurich) 8-fold increase in coverage

41 Increasing Your Confidence in Compound Identification (approaches ranked by confidence)
1. MS/MS library and retention time matching
2. MS/MS library matching
3. Molecular Structure Correlation (MSC): a tool that uses compound MS/MS spectral information to predict bonds/compound structure
4. Accurate Mass + RT: database matching of accurate mass with isotope pattern matching and retention time
5. Database matching, with isotope pattern matching
6. Database matching using accurate mass measurement: untargeted and/or targeted (PMDC) mining of data

42 Metabolomics Untargeted & Targeted Data Mining. Untargeted data acquisition (TOF / QTOF): profiling of unknowns; all detectable metabolites; relative quantitation. Untargeted data mining: samples are typically unlabeled; naïve or discovery-based analysis (novel biomarkers/signatures); finds hundreds or thousands of metabolites; metabolites need to be identified; map metabolites onto biological pathways. Targeted data mining: analyze data files using a database of compound names and target formulas (PCDL); map results onto pathways.

43 METLIN Personal Compound Database (PCD). An accurate mass LC-MS database; based on the public METLIN database; metabolomics-specific database. Contains > 50,000 compounds; ~8000 lipids from LipidMaps; 679 metabolites have an Agilent-provided retention time. Customizable by user. Works with other Agilent software: MassHunter Qual, ID Browser, PCDL Manager.

44 Untargeted Data Mining: MassHunter Molecular Feature Extraction (MFE), a Naïve Peak Finding algorithm. Find compound signals. Find co-eluting ions that are related: include isotopes (13C, 15N, 2H, 18O); include adducts, such as Na+ or K+; include dimers, such as (2M+H)+. Create a compound chromatogram (ECC). Sum all ion signals into one value (Feature). Create a compound spectrum. Report results as retention time and neutral mass. Fully automated processing. Create data file for export.
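
The step of collapsing related adduct and dimer ions into one feature reported by neutral mass can be sketched as below. This is not the MFE algorithm: the adduct table, the tolerance, the function names, and the ion list are assumptions, and isotope grouping and retention-time alignment are left out.

```python
PROTON = 1.00727647  # Da

# adduct label -> (number of molecules M in the ion, mass added by the adduct ion)
ADDUCTS = {
    "[M+H]+": (1, PROTON),
    "[M+Na]+": (1, 22.98922070),
    "[M+K]+": (1, 38.96315791),
    "[2M+H]+": (2, PROTON),
}

def neutral_mass(mz, adduct):
    """Back-calculate the neutral monoisotopic mass from a singly charged ion."""
    n_molecules, adduct_mass = ADDUCTS[adduct]
    return (mz - adduct_mass) / n_molecules

def build_feature(ions, tol=0.002):
    """ions: list of (m/z, adduct label, intensity) for co-eluting signals.

    Checks that all ions agree on one neutral mass, then sums their
    intensities into a single abundance value.
    """
    masses = [neutral_mass(mz, adduct) for mz, adduct, _ in ions]
    if any(abs(m - masses[0]) > tol for m in masses):
        raise ValueError("ions do not agree on a neutral mass")
    return {
        "neutral_mass": sum(masses) / len(masses),
        "abundance": sum(intensity for _, _, intensity in ions),
    }

# Hypothetical co-eluting ions of a compound with neutral mass near 180.0634.
ions = [
    (181.07066, "[M+H]+", 1000.0),
    (203.05261, "[M+Na]+", 400.0),
    (361.13405, "[2M+H]+", 100.0),
]
feature = build_feature(ions)
```

Reporting the neutral mass rather than individual m/z values is what lets the same compound be matched across samples regardless of which adduct dominates.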

45 Targeted Data Mining of data acquired in Untargeted mode: Pathway to Metabolite Database Creator (PMDC) Convert pathway metabolite information into Agilent personal compound database Select one or more pathways Remove redundant metabolites

46 PMDC: Search Pathway Database. Search Pathway: text search using name in pathway, reaction partners, or compounds related to the typed compound. Matching pathways are listed.

47 Add Pathways to Create New Database Add Pathways Select pathways from matching pathway list Press add pathway Create database Redundant compounds are removed
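
Merging selected pathways into one database while dropping redundant metabolites amounts to de-duplicating on a compound identifier. A minimal sketch, assuming each pathway is a list of (compound ID, name) pairs; the pathway contents and the function name `build_database` are illustrative and not PMDC's data model.

```python
def build_database(pathways, selected):
    """Merge the selected pathways, keeping each compound ID once.

    Insertion order is preserved, so compounds appear in the order
    in which the pathways were added.
    """
    database = {}
    for pathway_name in selected:
        for compound_id, compound_name in pathways[pathway_name]:
            database.setdefault(compound_id, compound_name)
    return database

# Illustrative pathway contents; pyruvate is shared by both pathways.
pathways = {
    "Glycolysis": [("C00031", "D-Glucose"), ("C00022", "Pyruvate")],
    "TCA cycle": [("C00022", "Pyruvate"), ("C00158", "Citrate")],
}
database = build_database(pathways, ["Glycolysis", "TCA cycle"])
```

The shared metabolite is stored once, so a later targeted search does not report it twice.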

48 METLIN Personal Compound Database & Library (PCDL). METLIN PCD plus an accurate mass LC-MS/MS library (QTOF). MS/MS spectra from the mono-isotopic ion. MS/MS spectra are collected in ESI positive and negative ion mode. Fragmentation data are collected at three collision energies: 10, 20 and 40 eV. MS/MS spectra are curated for quality: fragment ions are confirmed; fragment ions are mass corrected; noise ions are removed; spectra are manually reviewed. MS/MS searches use MassHunter Qual. MS/MS Library contains > 2000 compounds.

49 MS/MS spectral difference matching for Sample vs Library. [Figure: forward and reverse match of Cpd 1 (-ESI product ion, 4 scans, Frag=140.0V, CID@20.0) against the Metlin_AM library spectrum of N1-(5-Phospho-D-ribosyl)-AMP, C15H23N5O14P2; axes: counts vs. mass-to-charge (m/z).]
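
One common way to think about forward and reverse matching is sketched below: the reverse score considers only the library peaks, while the forward score also penalizes sample peaks that the library does not explain. The cosine similarity, the peak-pairing rule, and the peak lists are assumptions for illustration and do not reproduce MassHunter's scoring.

```python
import math

def _align(sample, library, tol=0.01):
    """Pair each library peak with the closest unused sample peak within `tol` Da."""
    pairs, used = [], set()
    for lib_mz, lib_int in library:
        best = None
        for i, (mz, _) in enumerate(sample):
            if i in used or abs(mz - lib_mz) > tol:
                continue
            if best is None or abs(mz - lib_mz) < abs(sample[best][0] - lib_mz):
                best = i
        if best is None:
            pairs.append((0.0, lib_int))  # library peak missing from sample
        else:
            used.add(best)
            pairs.append((sample[best][1], lib_int))
    unmatched = [inten for i, (_, inten) in enumerate(sample) if i not in used]
    return pairs, unmatched

def _cosine(pairs):
    dot = sum(a * b for a, b in pairs)
    norm = math.sqrt(sum(a * a for a, _ in pairs)) * math.sqrt(sum(b * b for _, b in pairs))
    return dot / norm if norm else 0.0

def match_scores(sample, library, tol=0.01):
    """Return (forward, reverse) similarity scores between 0 and 1."""
    pairs, unmatched = _align(sample, library, tol)
    reverse = _cosine(pairs)
    forward = _cosine(pairs + [(inten, 0.0) for inten in unmatched])
    return forward, reverse

# Hypothetical peak lists as (m/z, intensity); the sample has one extra peak.
library = [(79.0, 100.0), (134.0, 50.0)]
sample = [(79.001, 100.0), (134.002, 50.0), (150.0, 80.0)]
forward, reverse = match_scores(sample, library)
```

A high reverse score with a lower forward score, as in this toy case, suggests the library compound is present but the sample spectrum also contains ions from something else.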

50 MassHunter MS/MS Structural Correlation (MSC) Structure Explained MS/MS Fragmentation Search database of known compounds using empirical formula or mass Database matches must have compound structures Assign fragment ions to substructures of the proposed parent structure Assign probability to fragment forming Calculate a probability the proposed structure fits the MS/MS data Need to confirm via standards Search ChemSpider 30,000,000 entries

51 GeneSpring: A Bioinformatics Suite of Integrated Modules. GX: mRNA, Alternative Splicing, microRNA, Genome-wide association, Copy Number Variation. NGS: SureSelect Target Enrichment, Whole Genome Sequencing, DNA Variation, Chromosomal Rearrangements, RNA-Seq, Gene Fusion Detection, Alternative Splicing. MPP: MS-Proteomics, MS-Metabolomics. Integrated Biology: Joint Pathway Analysis, Computational Network Discovery.

52 Data-Driven Multi-Omics Technologies & Pathway Mapping. LC/MS and GC/MS: MassHunter Qual/Quant, ChemStation, AMDIS. Microarrays: Feature Extraction. NGS: Alignment to Reference Genome. All feed the GeneSpring Platform, which maps results onto Biological Pathways.

53 MOA Results: Venn Diagram of Enriched Pathways

54 Multi-Dataset Visualization & Analysis. Supported Metabolite Databases: 1. KEGG 2. HMDB 3. LMP 4. ChEBI 5. CAS. Protein Databases: 1. Swiss-Prot 2. UniProt 3. UniProt/TrEMBL. Gene Databases: Entrez Gene, GenBank, Ensembl, EC Number, RefSeq, UniGene, HUGO, HGNC, EMBL. BridgeDb resolves the mapping problem between databases for small molecules, genes, or protein identifiers. [Diagram: a table of compounds with mixed identifiers (KEGG, ChEBI, CAS) is mapped through BridgeDb (gene, protein, and metabolite IDs) onto pathway database entities.]

55 Joint Pathway Analysis Tyrosine Metabolism Microarray and Metabolite Data Overlay Active tab Genes Microarray or Metabolite Data Results Heatmap of all pathway entities, dynamically linked to pathway selection for comparative analysis

56 Pathway-directed re-mining of metabolite data. Propose new experiments based on pathway analysis: 1. Re-mine originally acquired (or legacy) untargeted metabolomics data based on pathway analysis (create db) 2. Design new experiments (metabolite, protein or genes) based on interpretation of pathway results. PCDL: build custom metabolite database. Spectrum Mill: export protein IDs to Peptide Selector for targeted MS/MS. eArray: upload select pathway genes for custom microarray or NGS design.

57 Acknowledgements: Steve Fischer (Metabolomics & Proteomics Marketing Manager); Norton Kitagawa (ID Browser, LC/MS: MPP software); D. Benjamin Gordon (GeneSpring-IB architecture/PMDC); Joe Roark and the MH Qual software engineering team; Christine Miller (Proteomics Applications); Dave Peterson (GC/MS: MPP software); Michael Janis, Michael Rosenberg (GeneSpring Technical Marketing); Jayati Ghosh and Ashutosh (GeneSpring R&D); Allan Kuchinsky for Cytoscape Plug-in, plus Agilent Labs team; Kyu Rhee (Weill Cornell Medical Center)