Metabolomics Gabi Kastenmüller Helmholtz Zentrum München Institute of Bioinformatics and Systems Biology (g.kastenmueller@helmholtz-muenchen.de) Munich, 17.02.16
What is metabolomics? Metabolomics = analysis of metabolomes Metabolome = complete set Metabonomics = the quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification (Wikipedia) of all small-molecules (<1000 Da) found within a biological system ( at a specific time under specific conditions) => Detection and quantitative measurement of (ideally) all small molecules (= metabolites) in a biological system
What is metabolomics? Genomics complete set of genes 20000-30000 genes Transcriptomics complete set of transcripts DNA splicing variants*genes mrna Proteomics complete set of proteins protein modifications*transcripts Metabolomics. complete set of metabolites ~2500 (+~3500 food +~1200 drugs)
Factors Why another omics? Genetic Energy Fatty acids ATP Sugars Regulatory Building Blocks Nucleotides Amino acids Phospholipids Signaling Hormones Neurotransmitter Environmental Xenobiotics Drugs Food KEGG Metabolites are the true end points of most biological processes
How can we measure all these metabolites? Metabolon
EXPERIMENTAL BACKGROUND
SAMPLE COLLECTION
Sample types Blood Urine Further body fluids Cerebrospinal Fluid Peritoneal Fluid Saliva Sweat Tears Feces Breath air/condensate
Sample types Blood Urine Further body fluids Tissue Cell cultures Plant extracts Liver Kidney Muscle Brain Fat
Sample collection Blood Plasma Serum Spots Additives (EDTA, Citrate, Heparin) Storage (N 2, -80 C, -20 C, 4 C, RT) Venous, Capillary, Arterious
Things to think about BEFORE sample collection Is there any established metabolomics method for the sample type ( matrix)? Additives can disturb the measurement (e.g. DNA stabilisors). Reactions go on at room temperature => standard operating procedures (SOPs) to ensure comparability Lab differences might be large => cases/controls from all sites Discuss study design with collaborators for analytics and data analysis!
Sample collection what you loose here you will never see (again)!
METABOLITE DETECTION & QUANTIFICATION
Targeted vs non-targeted approaches Targeted Non-targeted Preselected set of metab. signals All metab. signals that can be detected Pros: Known identity; better quant.; Pros: New/Unknown metabolites Cons: No new metabolites Cons: Difficult metabolite identification; less reliable quant Routine Discovery
Technologies for high-throughput metabolomics Mass spectrometry (MS) Nuclear magnetic resonance (NMR)
NUCLEAR MAGNETIC RESONANCE (NMR)
NMR H He 1 H, 2 H Li 7 Li Na Be 9 Be Mg nuclei with NMR active isotop nuclei with I=1/2 isotop B 11 B Al C 13 C Si N 15 N P O 17 O S F 19 F Cl 3 He Ne 21 Ne Ar 23 Na K 25 Mg Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn 27 Al Ga 29 Si Ge 31 P As 33 S Se 35 Cl Br Kr 39 K Rb 43 Ca Sr 45 Sc Y 49 Ti Zr 50 V Nb 53 Cr Mo 55 Mn Tc 57 Fe Ru 59 Co Rh 61 Ni Pd 63 Cu Ag 67 Zn Cd 71 Ga In 73 Ge Sn 75 As Sb 77 Se Te 81 Br I 83 Kr Xe 87 Rb Cs 87 Sr Ba 89 Y Ln 91 Zr Hf 93 Nb Ta 95 Mo W Re 101 Ru Os 103 Rh Ir 105 Pd Pt 107 Ag Au 113 Cd Hg 115 In Tl 119 Sn Pb 121 Sb Bi 125 Te 127 I 129 Xe Po At Rn 133 Cs 137 Ba 138 Ln Fr Ra Ac 179 Hf 181 Ta Ce 183 W Pr 187 Re Nd 187 Os Pm 193 Ir Sm 195 Pt Eu 197 Au Gd 199 Hg Tb 205 Tl Dy 207 Pb Ho 209 Bi Er Tm Yb Lu Th 141 Pr Pa 143 Nd U Np 147 Sm Pu 153 Eu Am 157 Gd Cm 159 Tb Bk 163 Dy Cf 165 Ho Es 167 Er Fm 169 Tm Md 171 Yb No 175 Lu Lr 235 U
NMR z w m N E DE = g h B o = h n B o S I=-1/2 DE=100 MHz DE=300 MHz DE=500 MHz m w S N 2 4 6 8 10 12 I=+1/2 B 0 [T]
Metabolite fingerprint H O H H H C C O C C H H H H 4 3 2 1 0 ppm 4.12 ppm 2.03 ppm 0.98 ppm
NMR spectrum for a plasma sample
Metabolite identification using spectra library
Quantification 1 H NMR Spectrum H O H H H C H C O C H C H H J coupling Integral 4 3 2 1 0 ppm 4.12 ppm 2.03 ppm 0.98 ppm
Technologies for high-throughput metabolomics Sample preparation Chromatography LC or GC- Mass spectrometry (MS) Nuclear magnetic resonance (NMR)
Sample preparation Homogenized sample Extraction (e.g. solvent extraction with MeOH) => Depending on the extraction method, different metabolite classes may be analyzed Addition of standards (for QC and quantification) Derivatisation
Sample preparation
Separation techniques Liquid chromatography liquid mobile phase; solid stationary phase Gas chromatography carrier gas mobile phase; liquid stationary phase Capillary Electrophoresis => Reduces complexity by introducing a temporal dimension (elution/retention time)
Separation techniques 4.01 5.84 14.43 4.38 10.66 8.46 10.18 4.55 6.52 11.76 6.73 7.74 8.01 9.34 11.03 11.79 9.47 13.05 7.50 5.34 11.21 3.17 12.89 13.30 4 5 6 7 8 9 10 11 12 13 14 Time (min) Chromatogram
MASS SPECTROMETRY (MS)
Mass spectrometry Principle: Separation/Identification of molecules by mass (precisely: mass/charge) 180.1 g/mol 396.7 g/mol 3.1 g 5.4 g
Don t be afraid of abbreviations Separation Ionisation Mass detection HPLC UPLC GC FIA EI ESI MALDI APCI FTICR Triple Quad TOF Ion Trap Orbitrap
Mass spectrometry Possible molecular formulas: C6H12O6 Which molecule is it? Taylor et al., 2005
Fragmentation e.g. by collision with gas molecules 180.1 g/mol 3.1 g 5.4 g
Relative abundance of isotopes The ratio of peaks containing 79 Br and its isotope 81 Br (100/98) confirms the presence of bromine in the compound. http://www.chem.arizona.edu/massspec/intro_html/intro.html
LC-MS spectra for a sample 4.01 5.84 14.43 4.38 10.66 8.46 10.18 4.55 6.52 6.73 7.74 11.76 8.01 9.34 11.03 11.79 5.34 7.50 9.47 13.05 3.17 11.21 12.8913.30 4 5 6 7 8 9 10 11 12 13 14 Time (min) Chromatogram 3.17 min MS Scan MS/MS Fragmentation
Intensity LC-MS spectra for a sample 4.01 5.84 14.43 4.38 10.66 8.46 10.18 4.55 6.52 6.73 7.74 11.76 8.01 9.34 11.03 11.79 5.34 7.50 9.47 13.05 3.17 11.21 12.8913.30 4 5 6 7 8 9 10 11 12 13 14 Time (min) Chromatogram 3.17 min Time MS/MS Fragmentation
Relative Abundance Relative Abundance Metabolite fingerprint 100 RT: 95 90 85 80 75 Retention 2.02 ± 0.05 min 100 80 60 166.1 166.1 MS Mass Profile (m/z) Molecular ion, adducts, multimers, in-source fragments, isotopes 70 65 60 55 50 45 40 40 quant. ion 35 30 25 20 15 10 5 Time 0 1.88 1.90 1.92 1.94 1.96 1.98 2.00 2.02 2.04 2.06 2.08 2.10 2.12 2.14 2.16 2.18 20 0 120.1 330.8 MS/MS 120.1 100 50 148 246 344 442 540 638 736 Fragmentation Spectra (EI or MS/MS) 120.1 120.1 166.1 m/z MS/MS 166.1 MS/MS 330.8 100 100 80 80 80 60 93.1 60 60 40 103.1 40 40 20 0 20 0 146.1 20 0 103.1 93.1 50 61 72 83 94 105 116 127 50 65 80 95 110 125 140 155 170 73 104 135 166 197 228 259
Metabolite identification with spectra library Fingerprints Library automated matching Suggested matching metabolites
Automated matching Forward-Fit Reverse-Fit Matches everything in the component to the library Matches the library entry to the component MS/MS 120.1 exp. MS/MS 120.1 lib. 100 120.1 Fit 100 120.1 80 80 60 60 93.1 40 103.1 40 103.1 71.8 93.1 20 20 0 0 50 61 72 83 94 105 116 127 50 61 72 83 94 105 116 127
Metabolite identification with spectra library Fingerprints Library? automated matching Suggested matching metabolites manual curation manual curation Knowns Known unknowns
Quantification Internal standard (isotope labled) Time RT: 2.02 m/z: 120.14 Area: 120089 RT: 2.02 m/z: 121.15 Area: 40112
DATA ANALYSIS
Raw metabolomics data retention time retention time retention time retention time mass mass mass mass fragmentation spectrum fragmentation spectrum fragmentation spectrum fragmentation spectrum LC or GC-MS NMR
Four steps Raw data processing peak detection, peak alignment, peak integration, identification of metabolites, Primary data analysis (QC): outlier detection, normalization (batch effects, dilution), missing value handling/imputing Statistical analysis: univariate/multivariate hypothesis tests, supervised/unsupervised machine learning (classification/clustering), Bioinformatic analysis: biological context, network analyses, data integration
Raw data analysis tools metlin.scripps.edu/xcms/ www.metaboanalyst.ca Batman (NMR) batman.r-forge.r-project.org/
Raw data analysis Targeted Non-targeted Preselected set of metab. signals All metab. signals that can be detected Ident. metabolites Peak list Sample
Metabolomics Data Metabolite concentrations Sample phenotypes including batch, collection date, etc
Four steps Raw data processing peak detection, peak alignment, peak integration, identification of metabolites, Primary data analysis (QC): outlier detection, normalization (batch effects, dilution), missing value handling/imputing Statistical analysis: univariate/multivariate hypothesis tests, supervised/unsupervised machine learning (classification/clustering), Bioinformatic analysis: biological context, network analyses, data integration
Primary data analysis (exploratory) server metap.helmholtz-muenchen.de/metap2 PCA Quality control Distributions
Four steps Raw data processing peak detection, peak alignment, peak integration, identification of metabolites, Primary data analysis (QC): outlier detection, normalization (batch effects, dilution), missing value handling/imputing Statistical analysis: univariate/multivariate hypothesis tests, supervised/unsupervised machine learning (classification/clustering), Bioinformatic analysis: biological context, network analyses, data integration
Results from statistical analysis e.g. cases vs. control: Fumarate Arginine Citrulline Ornithine Glutamine Urea Aspartate N-acetylglutamate.
Four steps Raw data processing peak detection, peak alignment, peak integration, identification of metabolites, Primary data analysis (QC): outlier detection, normalization (batch effects, dilution), missing value handling/imputing Statistical analysis: univariate/multivariate hypothesis tests, supervised/unsupervised machine learning (classification/clustering), Bioinformatic analysis: biological context, network analyses, data integration
Approach 1: Mapping results onto pathway maps Fumarate Arginine
But: mapping problem Gaps because not all metabolites are measured Measured and map metabolites do not match exactly No mapping of unknown metabolites
Approach 2: Reconstruction of networks from data Metabolomics data Metabolic network Sample
Reconstruction of metabolic networks using correlations
Problem: indirect effects Krumsiek et al., BMC Systems Biology, 2011
Problem: indirect effects
Eliminating indirect effects: partial correlation Krumsiek et al., BMC Systems Biology, 2011
Reconstruction of metabolic networks using partial correlation networks (=GGMs) Krumsiek et al., BMC Systems Biology, 2011
Reconstructions by GGMs: Closer inspection + Metabolic databases Krumsiek et al., PLoS Genet., 2012
APPLICATIONS & AIMS
Applications & Aims of Metabolomics Biomarkers discovery Diagnosis Response on therapy disease healthy Stratification => Personalized Medicine Pathomechanistic insights Preclinical drug testing
Applications & Aims of Metabolomics Biomarkers discovery Diagnosis Response on therapy Stratification => Personalized Medicine Pathomechanistic insights Preclinical drug testing
Metabolites as Diagnostic Biomarkers Diagnostic urine charts were widely used from the Middle Ages onwards These charts linked the colors, smells and tastes of urine to various medical conditions. So what is new? Pinder, Epiphanie medicorum (1506), Universitätsbibliothek München Nicholson & Lindon
Increase of molecular resolution www.biocrates.at
Example: Newborn screening ~40 metabolites (amino acids, carnitines) tested to identify inborn errors of metabolism Phenylketonuria (>1 in 25,000) phenylalanine tyrosine
[Inborn errors of metabolism] are merely extreme examples of variations of chemical behavior which are probably everywhere present in minor degrees. A.E. Garrod, Lancet, 1902 Garrod suggested a link between chemical individuality and predisposition to disease.
THE NORMAL HUMAN METABOLOME (HUMET) Krug et al., FASEB, 2012
HuMet: Studying the normal human metabolome 15 young healthy men:
Metabolome a snapshot of biochemical state 8 am fasting Metabolome 1 9 am after breakfast Metabolome 2 4 pm sports Metabolome 3
HuMet: Studying the normal human metabolome 15 young healthy men: age: 22-33 BMI: 20-25 kg/m 2 Controlled trial over the time course of 4 days 4 nutritional interventions: (fasting, standard meal, OGTT, OLTT) physical exercise stress test
HuMet: Study design 4 weeks 56 blood samples 25 urine samples 32 breath condensate samples breath air... 8:00 24:00 8:00 18:00 8:00 16:00 8:00 18:00
HuMet plasma samples @ metap 15 individuals x 56 plasma samples OH O O H O O N + OH H NH2 OH hexanoylcarnitine 0.05 µm tyrosine 100 µm 163 metabolites Targeted metabolomics per sample O O O N + O P O O O O phosphatidylcholine 20 µm
Metabolites levels largely vary during the day and on response to challenges glucose acetylcarnitine Krug et al., FASEB, 2012
Switching from anabolism to catabolism and back standardized concentrations
pushing volunteers through the metabolic space
Metabolite levels differ between individuals Normal ranges!
Concentration Metabolic response differs between individuals Extended fasting Meals Acetyl- carnitine Time of day Krug et al., FASEB, 2012
Metabolite profiles are individual
Personal metabolomes are stable Short-term (days), challenges plasma, 15 young men, MS-based Krug et al., FASEB, 2012 Metabolic individuality Mid-term (months) urine, 22 subjects, NMR-based Assfalg et al., PNAS, 2008 Assfalg et al., PNAS, 2008 Yousri et al., Metabolomics, 2014 Krug et al., FASEB, 2012 Chua et al., PNAS, 2013
Long-term stability (years)?
Long-term stability of metabolite profiles? n=818 1999/2001 (S4) 7 years 2006/08 (F4) measured in 2010 measured in 2012 Non-targeted metabolomics: 212 metabolites
901 subjects Assessing ranks of self-correlation Baseline 7-year follow-up 212 metabolites M1 M2 M3 S1 1.3 5.3 2.3 S2 1.9 6.6 2.1 S3 S4 S5 S6 M1 M2 M3 S1 1.5 4.9 2.0 S2 2.0 7.1 1.3 S3 S4 S5 S6 Yousri et al., Metabolomics, 2014
High long-term stability of metabolite profiles 40% of subjects: strongest correlation with own profiles in 7y follow-up Conserved metabolome 95% of subjects: self correlation ranked among the 30% strongest Major changes in metabolome Yousri et al., Metabolomics, 2014
Are metabolomes with major changes special? Individuals with major changes in their metabolome Baseline 7-year follow-up Yousri et al., Metabolomics, 2014
Reasons for high conservation? Genetic variation Steve Gschmeissner/SPL Microbiome Lifestyle
Heritability of metabolite levels Estimation of heritability based on twins (n=6000) (ACE model) Which fraction of variance inherited, which due to common environment? Median heritability: 25% Max heritability: 76% (butyrylcarnitine) Shin et al., Nature Genet, 2014
Conservation of metabolites Baseline 7-year follow-up M1 M2 M3 S1 1.3 5.3 2.3 S2 1.9 6.6 2.1 S3 S4 S5 S6 M1 M2 M3 S1 1.5 4.9 2.0 S2 2.0 7.1 1.3 S3 S4 S5 S6 pairwise correlations -> rank correlations -> most stable metabolites? Yousri et al., Metabolomics, in press
Conservation vs heritability C4 carnitine hormones Associated with sex Associated with BMI Associated with age
Can heritability be explained by common genetic variants?
Partnerships with industry and science IGE AME Thank you for your attention IEG! IOEC K. Suhre