Agenda: Metabolomics Society September 18, 2012

Agenda:
- Applications of GC/Q-TOF for Metabolomics: Jennifer Gushue, Ph.D.
- Data-Directed Multi-Omics of Biological Pathways: Theodore Sana, Ph.D.
- Integrated Biology is a Software Challenge: Norton Kitagawa, Ph.D.

Diversity

Diversity: Agilent LC/MS Instrument Portfolio

Olive Oil Classification using the Agilent 7200 Series GC/Q-TOF System
UC Davis Olive Center
Stephan Baumann, Agilent Technologies

Olive Oil Demand is Growing
- The United States market is expected to surpass $1.8 billion by 2013.*
- Increased interest in Mediterranean foods
- Health benefits associated with olive oil:
  - Reduced risk of coronary heart disease
  - Rich in antioxidants and anti-inflammatory compounds**
* Packaged Facts Market Report
** US FDA

Olive Oil Standards
International Olive Council and USDA Standards:
- Chemical tests
- Tasting panel sensory test

Most Common Olive Oil Defects
- Rancidity: A flavor in the olive oil usually accompanied by a greasy mouthfeel. In sensory tests, this greasiness is often noticed first.
- Fusty: Caused by fermentation in the absence of oxygen, which occurs within the olives before they are milled. A fusty smell has been compared to sweaty socks, swampy vegetation, or a too-wet compost heap.
- Winey-vinegary: Caused by fermentation with oxygen; it can be reminiscent of vinegar or nail polish.
- Musty: Caused by moldy olives; it tastes of dusty, musty old clothes or a basement floor.
Limited understanding of organoleptic smell and taste.

More Demand than Supply
International Olive Council and USDA Standards:
- Chemical tests
- Tasting panel sensory test
Sensory tests are expensive and subjective.
Imported extra virgin olive oils (EVOO) account for 99% of the US supply,* and they often fail the EVOO sensory test.
* UC Davis Olive Center

Possible Solution: Chemical Screening
Develop a chemical screen to predict whether an olive oil will pass the sensory test.
- Allows producers to submit for sensory testing only those olive oils that have a high probability of passing
- Reduces certification costs
- Increases the quality of the EVOO available in the marketplace

Olive Oil Characterization
1. To create a model that could predict whether an olive oil sample would pass or fail the sensory test
2. To find statistically significant olive oil components that are present at distinct levels depending on whether the sample passed or failed the sensory test

7200 Series GC/Q-TOF
- High-resolution full acquisition spectra
- Accurate mass measurements
- Fast acquisition of full spectra
- MS/MS mode: full spectrum of product ions with high resolution and accurate mass
- High-sensitivity structural elucidation tool
- Ideal tool for solving complex analytical problems

New Removable Ion Source includes repeller, ion volume, extraction lens and dual filaments

Hot Quartz Monolithic Quadrupole analyzer identical to the 7000 Quadrupole MS/MS

Hexapole Collision Cell accelerates ions through the cell to enable faster generation of high-quality MS/MS spectra without cross-talk

Dual-Stage Ion Mirror improves second-order time focusing for high mass resolution
[Figure labels: pre dual-stage ion mirror, post dual-stage ion mirror]
Proprietary INVAR flight tube sealed in a vacuum-insulated shell eliminates mass drift caused by temperature changes, maintaining excellent mass accuracy 24/7.

4 GHz ADC Electronics
Analog-to-digital (ADC) detector: Unlike time-to-digital (TDC) detectors, which record single ion events, ADC detection records multiple ion events, allowing very accurate mass assignments over a wide mass range and a wide dynamic range of concentrations.
- 4 GHz ADC electronics enable a high sampling rate (32 Gbit/s) that improves the resolution, mass accuracy, and sensitivity for low-abundance samples.
- Dual gain amplifiers simultaneously process detector signals through both low-gain and high-gain channels, extending the dynamic range to 10^5.
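The two ADC figures quoted on this slide can be related with simple arithmetic. The sketch below assumes that "4 GHz" means 4 billion samples per second and that the 32 Gbit/s figure is the raw digitizer output; the resulting bits per sample is an inference from those two numbers, not a specification stated on the slide.

```python
# Back-of-the-envelope check of the ADC figures quoted on the slide.
# Assumption: "4 GHz" is the sample rate and "32 Gbit/s" is the raw data rate.
SAMPLE_RATE_HZ = 4e9          # samples per second (from the slide)
DATA_RATE_BITS_PER_S = 32e9   # bits per second (from the slide)


def bits_per_sample(data_rate_bits_per_s: float, sample_rate_hz: float) -> float:
    """Bits produced per digitized sample, given a data rate and a sample rate."""
    return data_rate_bits_per_s / sample_rate_hz


def sample_spacing_ps(sample_rate_hz: float) -> float:
    """Time between consecutive samples, in picoseconds."""
    return 1e12 / sample_rate_hz


print(bits_per_sample(DATA_RATE_BITS_PER_S, SAMPLE_RATE_HZ))  # 8.0
print(sample_spacing_ps(SAMPLE_RATE_HZ))                      # 250.0
```

Under these assumptions each flight-time point is sampled every 250 ps, which is the time resolution available for locating a peak centroid.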

EXPERIMENTAL DESIGN

Experimental Design
- Olive oil samples had been subjected to the sensory test and classified as passed or failed.
- Samples were analyzed on the 7200 GC/Q-TOF. Data were acquired in both EI and PCI modes.
- MassHunter Qual was used for deconvolution and library searches.
- Mass Profiler Professional (MPP) was used for statistical evaluation of the data, including construction of a class prediction model to predict whether a sample would pass or fail the sensory test.
[Workflow diagram labels: Pass/Fail, EI/CI, MassHunter Qual and MPP]

MassHunter Qualitative Analysis: Deconvolution and Library Searches
Identification of compounds using library search

Mass Profiler Professional: Data Filtering, PCA, ANOVA, Volcano Plot
Mass Profiler Professional (MPP) was used for statistical evaluation of the data, including construction of a class prediction model to predict whether a sample would pass or fail the sensory test.

RESULTS

EVOOs that Pass Sensory Evaluation

EVOOs that Pass and Fail Sensory Evaluation
This is why we need powerful data analysis software!

Olive Oil Characterization: Data Filtering
442 unique compounds were distinguished by chromatographic deconvolution, most of which occur only once or twice and are filtered out by MPP. The table shows how many of these 442 compounds were actually found in each sample.
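The filtering idea on this slide (discard compounds that show up in only one or two samples) can be illustrated with a short frequency filter. This is a minimal sketch, not MPP's implementation: the function name, the threshold, and the sample and compound labels are all hypothetical.

```python
from collections import Counter


def filter_by_frequency(samples: dict, min_samples: int) -> set:
    """Return the compounds detected in at least `min_samples` samples.

    `samples` maps a sample name to the list of compounds found in it.
    """
    counts = Counter()
    for compounds in samples.values():
        # Count each compound once per sample, even if it was listed twice.
        counts.update(set(compounds))
    return {compound for compound, n in counts.items() if n >= min_samples}


# Hypothetical deconvolution results for three oils.
samples = {
    "oil_A": ["cpd_1", "cpd_2", "cpd_3"],
    "oil_B": ["cpd_1", "cpd_2"],
    "oil_C": ["cpd_1", "cpd_4"],
}

kept = filter_by_frequency(samples, min_samples=2)
print(sorted(kept))  # ['cpd_1', 'cpd_2']
```

Compounds seen in a single sample (cpd_3, cpd_4 here) carry no information about a pass/fail trend, which is why a filter of this kind is applied before the statistics.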

Principal Component Analysis is Used to Visualize
PCA shows how the pass/fail data cluster. The samples that failed the sensory test are marked in red and the ones that passed are marked in blue.

Fold Change Analysis
Compounds accumulated in the samples that failed the sensory test. The Volcano Plot (on the right) shows fold change for each entity on the x-axis and significance on the y-axis.
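The two volcano plot coordinates can be computed for one compound as follows. This is a sketch under stated assumptions: the slide does not say which significance test MPP applied, so an exact permutation test on the difference of means stands in for it here, and the abundance values are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def log2_fold_change(failed, passed):
    """x-axis value: log2 of the ratio of mean abundances (failed / passed)."""
    return math.log2(mean(failed) / mean(passed))


def permutation_p_value(a, b):
    """Exact two-sided permutation p-value for the difference of group means.

    Stand-in significance test; practical only for small groups because it
    enumerates every way of splitting the pooled values.
    """
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    total = 0
    extreme = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so the observed split itself is always counted.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def volcano_point(failed, passed):
    """Return (log2 fold change, -log10 p) for one compound."""
    p = permutation_p_value(failed, passed)
    return log2_fold_change(failed, passed), -math.log10(p)


# Hypothetical abundances of one compound in four failed and four passed oils.
failed = [400, 420, 380, 410]
passed = [100, 110, 90, 105]
x, y = volcano_point(failed, passed)
print(f"log2 fold change = {x:.2f}, -log10(p) = {y:.2f}")
```

Compounds that land far to the right and high up in such a plot are the ones that are both strongly increased in failed samples and unlikely to differ by chance.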

Raw Data Verification
It pays to go back to the raw data to visually inspect MPP (multivariate statistical) results. Here we see the raw data verification of the peak at 27.54 minutes.

Building the Classification Model
Training the model with data:
- Two data classes were established with the compounds (markers) increased in the failed samples
- Create a model that predicts whether an olive oil sample will pass the sensory test

Testing the Model
All samples were correctly predicted. The samples that were not used for building the prediction model are listed with the training parameter set to None.

Library Searching
Commercial unit-mass EI spectral libraries can be searched using accurate-mass EI GC/Q-TOF data to identify compounds.
[Figure: compound spectrum (accurate mass) compared with the NIST EI library spectrum]

MS/MS Analysis
α-cubebene (C15H24), full scan: the acquired spectrum is shown against the NIST (replib) library spectrum of α-cubebene.
α-cubebene MS/MS: precursor m/z 204, CE 10 eV
Fragment ion formulas and mass errors:
- C8H9: -2.63 ppm
- C9H11: -3.58 ppm
- C10H13: 0.93 ppm
- C12H17: 5.11 ppm
Accurate masses of ion fragments are consistent with the molecular formula.
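The ppm values quoted for the fragment ions are relative mass errors: (measured m/z - theoretical m/z) / theoretical m/z x 10^6. The sketch below computes the theoretical m/z of a singly charged hydrocarbon cation and the error for a measured value. The measured m/z is hypothetical (the slide reports only the errors, not the readings), and the function names are illustrative.

```python
# Monoisotopic atomic masses in unified atomic mass units (standard reference values).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503207}
ELECTRON_MASS = 0.000548579909


def cation_mz(atom_counts: dict) -> float:
    """m/z of a singly charged cation: sum of atomic masses minus one electron."""
    neutral = sum(MONOISOTOPIC_MASS[element] * n for element, n in atom_counts.items())
    return neutral - ELECTRON_MASS


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


theoretical = cation_mz({"C": 8, "H": 9})  # C8H9+ fragment
measured = 105.0696                        # hypothetical reading, not from the slide
print(f"theoretical m/z = {theoretical:.5f}")
print(f"mass error = {ppm_error(measured, theoretical):.2f} ppm")
```

Errors of a few ppm, as on the slide, restrict each fragment to very few candidate elemental formulas, which is what makes the accurate-mass fragments usable as a consistency check on the proposed C15H24 parent.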

Molecular Structure Correlator

Odor Profile of Compounds Increased in Failed Samples
Proposed NIST ID                 Formula    CAS          Odor
n-hexadecanoic acid              C16H32O2   57-10-3      Faint oily [1]
Octadecanoic acid, ethyl ester   C20H40O2   111-61-5     Waxy [2]
Squalene                         C30H50     111-02-4     Floral [2]
α-cubebene                       C15H24     17699-14-8   Herbal [2]
Searching the flavor company catalogs by CAS number provided odor profile information on the up-regulated compounds.
[1] Bedoukian Research
[2] The Good Scents Company

Summary
A model that predicts the classification of extra virgin olive oils was constructed using data from the 7200 GC/Q-TOF and the MassHunter software suite.
- High resolution provides the required selectivity
- Mass accuracy is essential for defining compounds
- MS/MS is critical for structural confirmation of unknowns
- Comprehensive software is vital for turning MS data into pertinent and relevant results

Data-Directed Multi-Omics of Biological Pathways
Theodore Sana, PhD
Senior Scientist, Life Sciences Group, Agilent Technologies

Metabolomics and Integrated Biology Workflow
- 6550 QTOF Ion Funnel Technology: general HW overview
- Untargeted, unlabeled metabolite workflow
- Metabolite identification
- New software tools:
  - Personal Compound Databases & Library (PCDL)
  - Pathway Metabolite Database Creator (PMDC)
  - Molecular Structure Correlator (MSC)
  - MS/MS Library
- Data-driven multi-omics analysis for pathway mapping and targeted proteomics: GeneSpring Pathway Analysis capabilities

6550 Q-TOF: NEW TECHNOLOGY IN ION SAMPLING IMPROVES SENSITIVITY
- Jet Stream Technology: generating more ions
- Hexabore Inlet Capillary: sampling more ions
- Dual Ion Funnel: focusing more ions

Drastically Improved Metabolite Coverage of Central Carbon Metabolome Map
[Figure: metabolites detected on the 6520 vs. the 6550; courtesy of Prof. Nicola Zamboni, ETH Zurich]
8-fold increase in coverage

Increasing Your Confidence in Compound Identification
Approaches, listed from highest to lowest confidence:
1. MS/MS library and retention time matching
2. MS/MS library matching
3. Molecular Structure Correlation (MSC): a tool that uses compound MS/MS spectral information to predict bonds/compound structure
4. Accurate mass + RT: database matching using accurate mass, isotope pattern matching, and retention time
5. Database matching with isotope pattern matching
6. Database matching using accurate mass measurement: untargeted and/or targeted (PMDC) mining of data

Metabolomics Untargeted & Targeted Data Mining
Untargeted data acquisition (TOF/QTOF): profiling of unknowns; all detectable metabolites; relative quantitation
Untargeted data mining:
- Samples are typically unlabeled
- Naïve or discovery-based analysis (novel biomarkers/signatures)
- Finds hundreds/thousands of metabolites
- Metabolites need to be identified
- Map metabolites onto biological pathways
Targeted data mining:
- Analyze data files using a database of compound names and target formulas (PCDL)
- Map results onto pathways

METLIN Personal Compound Database (PCD)
An accurate mass LC-MS database
- Based on the public METLIN database
- Metabolomics-specific database
- Contains > 50,000 compounds
- ~8000 lipids from LipidMaps
- 679 metabolites have an Agilent-provided retention time
- Customizable by user
- Works with other Agilent software: MassHunter Qual, ID Browser, PCDL Manager

Untargeted Data Mining: MassHunter Molecular Feature Extraction (MFE), a Naïve Peak-Finding Algorithm
- Find compound signals
- Find co-eluting ions that are related:
  - Include isotopes (13C, 15N, 2H, 18O)
  - Include adducts, such as Na+ or K+
  - Include dimers, such as (2M+H)+
- Create a compound chromatogram (ECC)
- Sum all ion signals into one value (feature)
- Create a compound spectrum
- Report results as retention time and neutral mass
- Fully automated processing
- Create data file for export
Example: Neutral Mass = 113.0589
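Reporting a feature by neutral mass means that related ions (protonated molecule, sodium or potassium adducts, dimers) must all collapse to the same value. The sketch below shows that conversion for singly charged positive ions. It is an illustration of the arithmetic only, not the MFE algorithm; the adduct table, function name, and observed m/z values are assumptions chosen to be consistent with the neutral mass 113.0589 shown on the slide.

```python
# Mass added by each ion species, and how many neutral molecules it contains.
# Values are cation masses (atomic mass minus one electron), in u.
PROTON = 1.007276
ADDUCTS = {
    "[M+H]+": (1, PROTON),
    "[M+Na]+": (1, 22.989221),
    "[M+K]+": (1, 38.963158),
    "[2M+H]+": (2, PROTON),
}


def neutral_mass(mz: float, adduct: str) -> float:
    """Neutral monoisotopic mass implied by a singly charged ion of the given type."""
    n_molecules, adduct_mass = ADDUCTS[adduct]
    return (mz - adduct_mass) / n_molecules


# Hypothetical co-eluting ions that should belong to one compound.
observed = [
    (114.0662, "[M+H]+"),
    (136.0481, "[M+Na]+"),
    (227.1251, "[2M+H]+"),
]

for mz, adduct in observed:
    print(f"{adduct:8s} m/z {mz:.4f} -> neutral mass {neutral_mass(mz, adduct):.4f}")
```

When co-eluting ions agree on the neutral mass within the instrument's mass accuracy, they can be grouped and their signals summed into a single feature.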

Targeted Data Mining of Data Acquired in Untargeted Mode: Pathway to Metabolite Database Creator (PMDC)
- Convert pathway metabolite information into an Agilent personal compound database
- Select one or more pathways
- Remove redundant metabolites

PMDC: Search Pathway Database
Search pathways by text search. Pathways are matched using:
- Name in pathway
- Reaction partners
- Compounds related to the typed compound

Add Pathways to Create New Database
Add pathways:
- Select pathways from the matching pathway list
- Press "Add Pathway"
Create database:
- Redundant compounds are removed
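The PMDC steps (select pathways, pool their metabolites, drop redundant compounds) amount to an order-preserving merge with de-duplication. The following is a minimal sketch of that idea, not the PMDC code; the function name, pathway contents, and compound identifiers are hypothetical.

```python
def build_compound_database(pathways: dict, selected: list) -> list:
    """Merge the metabolites of the selected pathways, keeping each compound once.

    `pathways` maps a pathway name to a list of (compound_id, compound_name) pairs.
    Compounds are de-duplicated by identifier, in first-seen order.
    """
    seen = set()
    database = []
    for pathway_name in selected:
        for compound_id, compound_name in pathways[pathway_name]:
            if compound_id not in seen:
                seen.add(compound_id)
                database.append((compound_id, compound_name))
    return database


# Hypothetical pathway contents; identifiers are placeholders, not real database IDs.
pathways = {
    "pathway_1": [("ID_001", "glucose"), ("ID_002", "pyruvate")],
    "pathway_2": [("ID_002", "pyruvate"), ("ID_003", "citrate")],
}

database = build_compound_database(pathways, ["pathway_1", "pathway_2"])
print(database)
```

De-duplicating by identifier rather than by name avoids keeping two entries for the same metabolite when pathways spell its name differently.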

METLIN Personal Compound Database & Library (PCDL)
METLIN PCD plus an accurate mass LC-MS/MS library (QTOF)
- MS/MS spectra from the monoisotopic ion
- MS/MS spectra are collected in ESI positive and negative ion modes
- Fragmentation data are collected at three collision energies: 10, 20, and 40 eV
- MS/MS spectra are curated for quality:
  - Fragment ions are confirmed
  - Fragment ions are mass corrected
  - Noise ions are removed
  - Manually reviewed
- MS/MS searches use MassHunter Qual
- MS/MS library contains > 2000 compounds

MS/MS Spectral Difference Matching for Sample vs. Library
[Figure: sample spectrum, difference spectrum, and library spectrum, with Forward and Reverse labels; axes: counts vs. mass-to-charge (m/z), 50-600.
Sample: Cpd 1, -ESI product ion (0.444-0.528 min, 4 scans), Frag=140.0 V, CID@20.0, precursor m/z 558.0644 [z=1]; labelled ions at m/z 78.9593, 210.9986, 290.9646, 346.0542, 408.0103, 522.0420, 558.0635.
Library (Metlin_AM): N1-(5-Phospho-D-ribosyl)-AMP, C15H23N5O14P2, product ion, Frag=140.0 V, CID@20.0; labelled ions at m/z 78.9591, 210.9997, 290.9676, 346.0545, 408.0099, 522.0433, 558.0644.]

MassHunter MS/MS Structural Correlation (MSC): Structure-Explained MS/MS Fragmentation
- Search a database of known compounds using empirical formula or mass
- Database matches must have compound structures
- Assign fragment ions to substructures of the proposed parent structure
- Assign a probability to each fragment forming
- Calculate a probability that the proposed structure fits the MS/MS data
- Need to confirm via standards
- Search ChemSpider: 30,000,000 entries

GeneSpring: A Bioinformatics Suite of Integrated Modules
- GX: mRNA, alternative splicing, microRNA, genome-wide association, copy number variation
- NGS: SureSelect Target Enrichment, whole genome sequencing, DNA variation, chromosomal rearrangements, RNA-Seq, gene fusion detection, alternative splicing
- MPP: MS-Proteomics, MS-Metabolomics
- Integrated Biology: Joint Pathway Analysis, Computational Network Discovery

Data-Driven Multi-Omics Technologies & Pathway Mapping
- LC/MS, GC/MS: MassHunter Qual/Quant, ChemStation, AMDIS
- Microarrays: Feature Extraction
- NGS: alignment to reference genome
All data types feed into the GeneSpring platform for mapping onto biological pathways.

MOA Results: Venn Diagram of Enriched Pathways

Multi-Dataset Visualization & Analysis
Supported metabolite databases: 1. KEGG 2. HMDB 3. LMP 4. ChEBI 5. CAS
Protein databases: 1. Swiss-Prot 2. UniProt 3. UniProt/TrEMBL
Gene databases: Entrez Gene, GenBank, Ensembl, EC Number, RefSeq, UniGene, HUGO, HGNC, EMBL
BridgeDb resolves the mapping problem between databases for small-molecule, gene, or protein identifiers.
[Diagram: a table of compounds with mixed identifiers (Met1: CAS, KEGG1; Met2: ChEBI2; Met3: KEGG3) is passed through BridgeDb (gene, protein, and metabolite IDs) to map entities onto the pathway database.]

Joint Pathway Analysis
Tyrosine metabolism: microarray and metabolite data overlay
[Screenshot labels: active tab; genes; microarray or metabolite data results]
Heatmap of all pathway entities, dynamically linked to pathway selection for comparative analysis

Pathway-Directed Re-Mining of Metabolite Data
Propose new experiments based on pathway analysis:
1. Re-mine originally acquired (or legacy) untargeted metabolomics data based on pathway analysis (create a database)
   - PCDL: build a custom metabolite database
2. Design new experiments (metabolites, proteins, or genes) based on interpretation of the pathway results
   - Spectrum Mill: export protein IDs to Peptide Selector for targeted MS/MS
   - eArray: upload selected pathway genes for custom microarray or NGS design

Acknowledgements
- Steve Fischer (Metabolomics & Proteomics Marketing Manager)
- Norton Kitagawa (ID Browser, LC/MS: MPP software)
- D. Benjamin Gordon (GeneSpring-IB architecture/PMDC)
- Joe Roark and the MH Qual software engineering team
- Christine Miller (Proteomics Applications)
- Dave Peterson (GC/MS: MPP software)
- Michael Janis, Michael Rosenberg (GeneSpring Technical Marketing)
- Jayati Ghosh and Ashutosh (GeneSpring R&D)
- Allan Kuchinsky for the Cytoscape plug-in, plus the Agilent Labs team
- Kyu Rhee (Weill Cornell Medical Center)