Gas Chromatography-Mass Spectrometry


1 Standardizing Gas Chromatography-Mass Spectrometry Metabolomics. Maria I. Klapa, Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering and High Temperature Chemical Processes, Foundation for Research and Technology-Hellas (FORTH), Patras, Greece

2 Metabolomic Profiling: a multi-step procedure. Free Metabolite Pool Extraction (Polar, Non-Polar) -> Data Acquisition (NMR, GC-MS, LC-MS, CE-MS) -> Metabolite Identification and Pool Quantification -> Data Analysis (Multivariate Statistical Analysis)

3 Schematic Diagram of Metabolomic Analysis: Biological Sample -> Dried Metabolite Mixture -> Peak Area Profile -> List of Marker Ion Peak Areas -> Biological Conclusions

4 Schematic Diagram of Metabolomic Analysis: Biological Sample -> Dried Metabolite Mixture -> Peak Area Profile -> List of Marker Ion Peak Areas -> Biological Conclusions. Original Metabolite j Concentration = $RF_j^{MI_i}$ × (Measured Marker Ion i Peak Area), where $RF_j^{MI_i}$ is the response factor of metabolite j for its marker ion $MI_i$.

5 Internal Standard Normalization. Assumptions: only biases that change RF to the same extent for all metabolites (Type A) might be present, e.g. variation in the injected volumes, variation in drying, variation in replicate division; and the equipment's operating conditions remain constant among runs. Original Metabolite j Concentration = $RF_j^{MI_i}$ × (Measured Marker Ion i Peak Area). Internal Standard (IS) Original Concentration = $RF_{IS}^{MI_k}$ × (Measured IS Marker Ion k Peak Area). Relative peak area: $RPA_j$ = (Measured Marker Ion i Peak Area) / (Measured IS Marker Ion k Peak Area). Therefore: ratio between 2 states (Metabolite j Concentration) = ratio ($RPA_j$).
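
The normalization on this slide can be illustrated with a minimal sketch. It assumes only Type A biases are present, so a bias such as a smaller injected volume scales the metabolite and internal-standard peak areas by the same factor. The function names and all peak-area values are hypothetical and chosen only for illustration.

```python
def relative_peak_area(marker_ion_area: float, is_marker_ion_area: float) -> float:
    """RPA_j: metabolite marker-ion peak area divided by the IS marker-ion peak area."""
    if is_marker_ion_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return marker_ion_area / is_marker_ion_area


def fold_change(rpa_state_a: float, rpa_state_b: float) -> float:
    """Concentration ratio between two states; valid only under Type A biases."""
    return rpa_state_a / rpa_state_b


# Hypothetical peak areas (arbitrary units). The second injection of state A
# delivered half the volume, which halves metabolite and IS areas alike.
rpa_a_full = relative_peak_area(8000.0, 4000.0)
rpa_a_half = relative_peak_area(4000.0, 2000.0)
rpa_b = relative_peak_area(2000.0, 4000.0)

print(rpa_a_full, rpa_a_half)          # identical RPA despite the injection bias
print(fold_change(rpa_a_full, rpa_b))  # ratio of concentrations between states
```

Because the bias cancels in the quotient, the ratio of RPA values between two states equals the ratio of the metabolite concentrations, which is the relation stated on the slide.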

6 Schematic Diagram of Metabolomic Analysis: Biological Sample -> Dried Metabolite Mixture -> Mixture of Metabolite Derivatives -> GC-MS -> Derivative Peak Area Profile -> List of Derivative Marker Ion Peak Areas -> Biological Conclusions

7 From Original Metabolite to Derivative Peak Area: concentration of the original metabolite -> concentration of a derivative of the original metabolite -> measured peak area of the derivative's marker ion(s). Derivative l Concentration = $RF_l^{MI_h}$ × (Measured Marker Ion h Peak Area). Internal Standard (IS) Original Concentration = $RF_{IS}^{MI_k}$ × (Measured IS Marker Ion k Peak Area). The quotient of the two peak areas is $RPA_{\text{deriv } l \text{ of } M_j}$. Open question: is ratio between 2 states ($M_j$ Concentration) equal to ratio ($RPA_{\text{deriv } l \text{ of } M_j}$)?

8 Type B Biases: incomplete derivatization; multiple derivatives for some metabolites; potential change in the equipment's conditions between runs. Need for a NEW data normalization, correction and validation strategy that does not jeopardize the high-throughput nature of metabolomic analysis. H. Kanani and M.I. Klapa (2007) Data Correction Strategy for Metabolomics Analysis using Gas Chromatography-Mass Spectrometry, Metabolic Engineering 9:39-51

9 TMS and MeOX Derivatization. Methoximation: R1-C(=O)-R2 + Methoxyamine HCl -> R1-C(=N-O-CH3)-R2 (syn and anti isomers). Silylation with MSTFA: R-COOH -> R-COO-Si(CH3)3; R-OH -> R-O-Si(CH3)3; R-NH2 -> R-NH-Si(CH3)3

10 Metabolite Category 1: M + MSTFA -> MD (rate constant k). [Plot: concentration versus silylation time.] For silylation time $t > t_M$: $[M] = [MD] = w_{MD} \cdot RPA_{MD}$, with $w_{MD} = RF_{MD} / RF_{IS}$

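11 Metabolite Category 2: M + Methoxyamine -> $MD_{1,ox}$ (rate constant $k_1$) and $MD_{2,ox}$ (rate constant $k_2$); each oxime is then silylated (+MSTFA, rate constant $k_3$) to $MD_1$ and $MD_2$. [Plot: concentration versus silylation time.] $[M] = [MD_1] + [MD_2]$ and $\frac{[MD_1]}{[MD_2]} = \frac{k_1}{k_2} = \frac{w_{MD_1} \cdot RPA_{MD_1}}{w_{MD_2} \cdot RPA_{MD_2}}$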

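12 Metabolite Category 2: M + Methoxyamine -> $MD_{1,ox}$ (rate constant $k_1$) and $MD_{2,ox}$ (rate constant $k_2$); each oxime is then silylated (+MSTFA, rate constant $k_3$) to $MD_1$ and $MD_2$. $[M] = [MD_1] + [MD_2]$ and $\frac{[MD_1]}{[MD_2]} = \frac{k_1}{k_2} = \frac{w_{MD_1} \cdot RPA_{MD_1}}{w_{MD_2} \cdot RPA_{MD_2}}$. Data Validation Criterion: because $k_1/k_2$ is a ratio of rate constants, the ratio of the two derivative peak areas should stay constant across runs acquired under the same equipment conditions.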
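
A minimal sketch of how this validation criterion could be checked in practice: compute the ratio of the two derivative peak areas for every injection and flag the data set when the ratio drifts. The coefficient-of-variation test and the 10% threshold are assumptions made for illustration and are not taken from the slides; the function names and data are hypothetical.

```python
from statistics import mean, pstdev


def ratio_cv(rpa_md1, rpa_md2):
    """Coefficient of variation of RPA_MD1 / RPA_MD2 across injections."""
    if len(rpa_md1) != len(rpa_md2) or not rpa_md1:
        raise ValueError("need two non-empty series of equal length")
    ratios = [a / b for a, b in zip(rpa_md1, rpa_md2)]
    centre = mean(ratios)
    if centre == 0:
        raise ValueError("mean ratio is zero")
    return pstdev(ratios) / centre


def passes_validation(rpa_md1, rpa_md2, max_cv=0.10):
    """True when the derivative ratio is stable (max_cv is a hypothetical cutoff)."""
    return ratio_cv(rpa_md1, rpa_md2) <= max_cv


# Hypothetical relative peak areas of the two derivatives over three injections.
stable_md1, stable_md2 = [2.0, 4.0, 1.0], [1.0, 2.0, 0.5]   # ratio is always 2
drift_md1, drift_md2 = [2.0, 4.0, 1.0], [1.0, 1.0, 1.0]     # ratio changes per run

print(passes_validation(stable_md1, stable_md2))
print(passes_validation(drift_md1, drift_md2))
```

A failed check does not say which injection is wrong; it only signals that the equipment conditions, or the derivatization, were not constant over the set.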

13 Published Metabolomic Analysis based on Metabolomic Data Acquired at Different Equipment Conditions. [Figure: ratio of the two derivative peak areas (MD/MD) for Glucose and Fructose versus Injection Number.]

14 Metabolite Category 3: M + MSTFA -> $M(TMS)_x$ -> $M(TMS)_{x+1}$ -> ... -> $M(TMS)_{x+n}$ (+MSTFA at each step, rate constants k). [Plot: concentration versus silylation time.] $[M] = \sum_{i=1}^{n} [MD_i] = \sum_{i=1}^{n} w_{MD_i} \cdot RPA_{MD_i}$
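
The Category 3 relation states that the metabolite pool equals the weighted sum of its derivatives' relative peak areas. The sketch below illustrates why that sum can stay constant while the individual derivative peaks change with silylation time. The weights and peak areas are hypothetical values chosen so that the arithmetic is easy to follow.

```python
def weighted_derivative_sum(rpas, weights):
    """Sum of w_MDi * RPA_MDi over all derivatives of one metabolite."""
    if len(rpas) != len(weights):
        raise ValueError("one weight is needed per derivative")
    return sum(w * rpa for w, rpa in zip(weights, rpas))


# Hypothetical weights for two derivatives of the same metabolite.
weights = [1.0, 0.5]

# Hypothetical RPAs at an early and a late silylation time: the first
# derivative is converted into the second, so both peaks change.
rpa_early = [4.0, 2.0]
rpa_late = [2.0, 6.0]

print(weighted_derivative_sum(rpa_early, weights))
print(weighted_derivative_sum(rpa_late, weights))
```

Both calls return the same value, even though neither derivative peak on its own is a stable measure of the metabolite.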

15 Raw Data from Standard Amino Acid Mixture. [Figure: Relative Peak Area versus Derivatization Time.] Peak area variation with derivatization time among replicates of the same sample: 5-00%

16 New Normalization Algorithm

$$\begin{bmatrix} RPA_{MD_1}^{t_1} & \cdots & RPA_{MD_N}^{t_1} \\ \vdots & & \vdots \\ RPA_{MD_1}^{t_V} & \cdots & RPA_{MD_N}^{t_V} \end{bmatrix} \begin{bmatrix} w_1^M \\ \vdots \\ w_N^M \end{bmatrix} = \begin{bmatrix} [M]_o / [IS]_o \\ \vdots \\ [M]_o / [IS]_o \end{bmatrix}$$

N: # of derivatives; V: # of timepoints; $RPA_{MD_i}^{t_j}$: relative (with respect to the peak area of the internal standard) peak area corresponding to the i-th derivative of metabolite M at derivatization time $t_j$. H. Kanani and M.I. Klapa (2007) Data Correction Strategy for Metabolomics Analysis using Gas Chromatography-Mass Spectrometry, Metabolic Engineering 9:39-51. US Patent Application No /362,77. Best University of Maryland Invention of the Year 2005 in Information Sciences
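
The slide relates the relative peak areas of N derivatives, measured at V derivatization times, to one weight per derivative. A minimal sketch of estimating those weights follows. It rests on two assumptions that the slides do not state: the target value (the metabolite amount relative to the internal standard) is known, for example from a standard mixture, and the overdetermined system is solved by ordinary least squares through the normal equations. All names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: derivative profiles are not independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def estimate_weights(rpa, target):
    """Least-squares weights for rpa (V rows, N columns) times w = target."""
    n = len(rpa[0])
    # Normal equations: (A^T A) w = A^T b
    ata = [[sum(row[i] * row[j] for row in rpa) for j in range(n)] for i in range(n)]
    atb = [sum(row[i] * t for row, t in zip(rpa, target)) for i in range(n)]
    return solve_linear(ata, atb)


# Hypothetical RPAs of two derivatives at three derivatization times (V=3, N=2),
# with an assumed known target of 5.0 at every timepoint.
rpa_matrix = [[4.0, 2.0], [3.0, 4.0], [2.0, 6.0]]
target = [5.0, 5.0, 5.0]

weights = estimate_weights(rpa_matrix, target)
print([round(w, 6) for w in weights])
```

At least as many timepoints as derivatives are needed (V >= N), and the derivative profiles must actually change between timepoints; otherwise the rows are linearly dependent and the weights cannot be separated.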

17 Normalized Data from Standard Amino Acid Mixture. [Figure, two panels: Relative Peak Area versus Derivatization Time; Effective Peak Area versus Derivatization Time.] Peak area variation with derivatization time among replicates of the same sample dropped from 5-00% to 2-8%

18 Category 1 and 2 Metabolites. [Figure: log2(peak area at silylation time t / peak area at 30 min of silylation) versus Time After Addition of MSTFA (minutes), for citrate TMS, sorbitol TMS, iso-citrate TMS, ribitol 5TMS, threonate TMS, fumarate TMS, glycerol 3TMS, fructose MeOX2 TMS and glucose MeOX 5TMS.] Kanani HH, Chrysanthopoulos P, Klapa MI (2008) Standardizing GC-MS Metabolomics. J Chromatogr B Analyt Technol Biomed Life Sci 871:191-201

19 Matrix Effects Limit the Accuracy of the Measurements, even in the presence of an automated derivatization scheme. Matrix Effect, derivatization time 4hr. [Figure: Ratio (PA MD1 / PA MD2) for Fructose, Glutamate, Threonine and Asparagine versus Plant Sample Number.] Kanani HH, Chrysanthopoulos P, Klapa MI (2008) Standardizing GC-MS Metabolomics. J Chromatogr B Analyt Technol Biomed Life Sci 871:191-201

20 Identification of Unknown Peaks. [Table: for each amino acid (M), its derivatives MD1, MD2, MD3 with their weights w1, w2, w3; the weight values are not legible in this transcript. Asterisks are as marked on the slide.]
Alanine*: Alanine N O; Alanine N N O
Arginine: Ornithine N N N O; Ornithine N N N O; Ornithine N N N N O (n/d)
Asparagine: Asparagine N N O; Asparagine N N N O; Asparagine N N N N O (putative)
Aspartate*: Aspartate O O; Aspartate N O O
Cysteine: Cysteine N O (n/d); Cysteine N S O; Cysteine N N O
Glutamate: Glutamate N O O; Pyroglutamate N O
Glutamine: Glutamine N N O; Glutamine N N N O; Pyroglutamine N N O (putative)
Glycine*: Glycine N O; Glycine N N O
Histidine: Histidine O (putative, n/d); Histidine N O (n/d); Histidine N N O
iso-leucine*: iso-leucine O; iso-leucine N O; iso-leucine N N O (n/d)
Leucine: Leucine O (n/d); Leucine N O; Leucine N N O (n/d)
Lysine*: Lysine N N O (n/d); Lysine N N N O; Lysine N N N N O
Methionine*: Methionine N O; Methionine N N O
Phenylalanine: Phenylalanine O; Phenylalanine N O
Serine: Serine O O; Serine N O O; Serine N N O O
Threonine: Threonine O O; Threonine N O O; Threonine N N O O
Tryptophan: Tryptophan O (putative, n/d); Tryptophan N O; Tryptophan N N O (n/d)
Tyrosine: Tyrosine O (putative); Tyrosine O O; Tyrosine N O O
Valine*: Valine O; Valine N O; Valine N N O (n/d)
Allantoin: Allantoin N N N; Allantoin N N N N; Allantoin N N N N N
Beta-Alanine: B-Alanine O; B-Alanine N O; B-Alanine N N O
Gaba: Gaba N O; Gaba N N O
Dopamine: Dopamine N O O; Dopamine N N O O
Homoserine: Homoserine O O; Homoserine N O O; Homoserine N N O O
Ornithine: Ornithine N N N O (n/d); Ornithine N N N O (n/d); Ornithine N N N N O
n/d: Not detected consistently in all the samples. Footnote categories on the slide: derivatives not present in major public databases; derivatives formed from chemical transformations; derivatives treated as unknowns in public databases.

21 Conclusions. We developed a GC-MS metabolomic data validation, normalization and correction strategy that does NOT jeopardize the high-throughput nature of the analysis. The method is easy to implement and increases the accuracy of measurements by an order of magnitude for some metabolites (NH2-containing compounds). In light of the importance of metabolomics research, this method is expected to provide a valuable tool for the acquisition of accurate metabolomic data.

22 Objective: to analyze stress-induced molecular interaction networks in the context of plant primary metabolism during the first (30) hours of the stress treatment, under a variety of individual or combined perturbations, using integrated time-series transcriptomic & metabolomic analyses. Model System: Arabidopsis thaliana whole-plant liquid cultures (well-controlled growth environment).

23 Experimental Design & Setup: A. thaliana whole-plant liquid cultures in Gamborg Media + Sucrose (58 mM); LIGHT (80-100 µmole/cm2/s), Humidity 60%, TEMP (23 °C). Timeline: Day 0, Day 2, then Harvesting at 1h, 3h, 6h, 9h, 12h, 18h, 24h and 30h (X 4, X 2). Conditions: Control Set (Air, 0.04% CO2); Elevated CO2, Exp 2 (Air, 1% CO2); Exp 3,4,5 (NaCl (50 mM) or Trehalose (2 mM) or ACC); Combined Stresses, Exp 6,7,8 (Air, 1% CO2, with NaCl (50 mM) or Trehalose (2 mM) or ACC). Dutta et al. (2008) Time-series integrated omic analyses to elucidate short-term stress-induced responses in plant liquid cultures. Biotech Bioeng (In Press; E-print Available)

24 GC-MS Metabolomic analysis (Harin Kanani, UMD): 10 Exp * 20 samples * 2 Injections * 550 Peaks = 220,000 Total Measurements. cDNA Microarray Transcriptomic analysis (Bhaskar Dutta, UMD): 8 Exp * 20 samples = 160 Trizol extractions, 160 mRNA amplifications, 640 cDNA syntheses, 320 Dye Injections, 320 Micro-array hybridizations (flip-dye). [Diagram: grid of numbered experiments crossing CO2 level (Control, ambient CO2, versus elevated CO2) with media perturbation (Sucrose (58 mM), NaCl (50 mM), Trehalose (2 mM), ACC, Glucose (58 mM)); arrows mark the Elevated CO2 Effect and the Media Perturbation Effect.]
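
The workload figures on this slide are simple products, and a short sketch makes the bookkeeping explicit. The stated total of 220,000 GC-MS measurements corresponds to 10 experiments (10 x 20 x 2 x 550). The function names are hypothetical.

```python
def gcms_measurements(experiments: int, samples: int, injections: int, peaks: int) -> int:
    """Total peak measurements: every peak in every injection of every sample."""
    return experiments * samples * injections * peaks


def rna_extractions(experiments: int, samples: int) -> int:
    """One Trizol extraction per sample per experiment."""
    return experiments * samples


total_gcms = gcms_measurements(experiments=10, samples=20, injections=2, peaks=550)
total_extractions = rna_extractions(experiments=8, samples=20)

print(total_gcms)
print(total_extractions)
```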

25 GC-MS Metabolomic Data Correction Methodology. Paired-SAM analysis (TIGR MeV v3), delta = 2, FDR (median) = 0%.
Relative Peak Areas With Data Correction (27 + significant): Unknown 022, Unknown 073, Unknown 03, Unknown 95, Unknown 083, Unknown 024, Unknown 6, Unknown 040, Unknown 044, Unknown 048, Unknown 074, Unknown 345, 3,4-Dihydroxybutyrate, Unknown 6, Glyoxylate, Nicotinate, Glycerol 3 P, Unknown 33, Unknown 088, Unknown 285, Unknown 00, 4-Hydroxybutanoate, Unknown 059, Unknown 097, Unknown 089, Unknown 039, Unknown 36, Glycerate.
Relative Peak Areas Without Data Correction (26++ significant): Unknown 022, Unknown 03, Unknown 073, Unknown 38, Unknown 95, Unknown 39, Unknown 37, Unknown 083, Unknown 368, Unknown 024, Unknown 6, Unknown 387, Unknown 040, Unknown 044, Unknown 048, Unknown 376, Unknown 074, Unknown 6, 4-Hydroxybutanoate, Unknown 345, 3,4-Dihydroxybutyrate, Unknown 00, Nicotinate, Glycerol 3 P, Unknown 285, Unknown 45, Unknown 088, Unknown 059, Unknown 390, Glyoxylate, Unknown 33, Unknown 039, Unknown 42, Lysine, Unknown 36, Unknown 097, Unknown 089, Uracil.
Legend: Cat-1, Cat-2, Cat-3. H. Kanani and M.I. Klapa (2007) Data Correction Strategy for Metabolomics Analysis using Gas Chromatography-Mass Spectrometry, Metabolic Engineering 9:39-51. Kanani HH, Chrysanthopoulos P, Klapa MI (2008) Standardizing GC-MS Metabolomics. J Chromatogr B Analyt Technol Biomed Life Sci 871:191-201

26 PCA - Metabolomic Data: Individual Stress Response. [Figure: 3-D PCA score plot (axes PC 1, PC 2, PC 3) of the 1% CO2 and Control samples. Variance explained: PC1 48%, PC2 14%, PC3 9%; Total 71%.]

27 Microarray Time-series Significance Analysis. Consists of 4 modules: (1) Identification of Significant Genes at each time-point; (2) Analysis of Gene Variability in Significance Level Among Time Points; (3) Correlation Analysis between time-points with respect to their common significant genes; (4) Comparison of the significant genes' GO analysis results between time-points. US Patent Application, 2006. Dutta B, Snyder R, Klapa MI (2007) Significance Analysis of Time-Series Transcriptomic Data: A methodology that enables the identification and further exploration of the differentially expressed genes at each time-point. Biotech Bioeng 98:

28 Elevated CO2 stress: Time-profile of No. of Significant Genes. [Figure, two panels over the time points 1 h, 3 h, 6 h, 9 h, 12 h, 18 h, 24 h, 30 h (Paired SAM, % FDR). Metabolomic Analysis: fraction of total number (295) of metabolites (%), shown as Positively Significant, Negatively Significant and Total Significant. Transcriptomic Analysis: fraction of total number (23) of genes (%).] Dutta et al. (2008) Time-series integrated omic analyses to elucidate short-term stress-induced responses in plant liquid cultures. Biotech Bioeng (In Press; E-print Available)

29 Acknowledgements. Funding: US NSF Grant QSB-03332; UMD Minta Martin Foundation; UMD Department of Chemical & Biomolecular Engineering; FORTH/ICE-HT; Bayer HealthCare LLC
