Experimental Design and MS Workflows for Omics Applications. David A. Weil, Ph.D. Senior Application Scientist Schaumburg, IL

Similar documents
Expand Your Research with Metabolomics and Proteomics. Christine Miller Omics Market Manager ASMS 2017

Metabolomics: From Samples to Softwares

Instrumental Solutions for Metabolomics

Agilent Software Tools for Mass Spectrometry Based Multi-omics Studies

Agilent s NEW MassHunter Profinder

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE

MassHunter Profinder: Batch Processing Software for High Quality Feature Extraction of Mass Spectrometry Data

Agilent Solutions for Metabolomics EMERGING INSIGHTS

Agilent Solutions for Metabolomics YOUR PATH TO SUCCESS

Research Powered by Agilent s GeneSpring

The Agilent Metabolomics Dynamic MRM Database and Method

FROM DISCOVERY TO INSIGHT

AGILENT SOLUTIONS FOR METABOLOMICS

A Highly Accurate Mass Profiling Approach to Protein Biomarker Discovery Using HPLC-Chip/ MS-Enabled ESI-TOF MS

ProteinPilot Report for ProteinPilot Software

faster, confident more path to all the information in your samples. Agilent MassHunter Workstation Software Our measure is your success.

Metabolomics Workflow: From Data Acquisition to Data Analysis

Metabolomics Workflow: From Data Acquisition to Data Analysis

Daniel Cuthbertson Agilent Technologies Denver, Colorado

MS data: Understand it. Don t just process it. Agilent Mass Profiler Professional Software. Our measure is your success.

Xevo G2-S QTof and TransOmics: A Multi-Omics System for the Differential LC/MS Analysis of Proteins, Metabolites, and Lipids

Estoril Education Day

Next-Generation Sequencing Gene Expression Analysis Using Agilent GeneSpring GX

The Human Toxome Project a test case for pathway identification by multiomics. Thomas Hartung

Ensure your Success with Agilent s Biopharma Workflows

Data Analysis in Metabolomics. Tim Ebbels Imperial College London

The Five Key Elements of a Successful Metabolomics Study

Faster, easier, flexible proteomics solutions

See More, Much More Clearly

Normalization of metabolomics data using multiple internal standards

Thermo Scientific Compound Discoverer Software. Integrated solutions. for small molecule research

Metabolomics: Techniques and Applications ABRF

IROA Metabolic Profiling Kits FAQ

Agilent GeneSpring GX 10: Beyond. Pam Tangvoranuntakul Product Manager, GeneSpring October 1, 2008

Agilent 6000 Series LC/MS Solutions. Confidence in Quantitative and Qualitative Analysis

High Resolution GC-MS Application: Metabolomics Vladimir Tolstikov, PhD

MetaboScape 2.0. Innovation with Integrity. Quickly Discover Metabolite Biomarkers and Use Pathway Mapping to Set them in a Biological Context

Metabolic Changes in Lung Tissue of Tuberculosis-Infected Mice Using GC/Q-TOF with Low Energy EI

Introduction. Benefits of the SWATH Acquisition Workflow for Metabolomics Applications

Integrated Biology. A Pathway-centric Approach to Multiomics Research Powered by GeneSpring Analytics

METABOLOMICS. Biomarker and Omics Solutions FOR METABOLOMICS

Cell Signaling Technology

Agilent Technologies. Carl Myerholtz, Ph.D. Senior Director of QTOF Technology. FAPESP/Agilent Workshop August 30,

SCIEX OS Software One Touch Productivity

IMPLEMENTATION OF PANORAMA INTO DAILY WORKFLOWS AND QUANTITATIVE PROTEIN PK ANALYSIS IN THE PHARMACEUTICAL SETTING

Confident Protein ID using Spectrum Mill Software

Commercial and non-commercial software How do these work together and can benefit form one another?

ProMass HR Applications!

Agilent Metabolomics Laboratory: The breadth of tools you need for successful metabolomics research

Data processing for LCMS-based metabolomics

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow

Pharmaceutical LC/MS Solutions from Agilent Technologies

Strategies for Quantitative Proteomics. Atelier "Protéomique Quantitative" La Grande Motte, France - June 26, 2007

The LC/MS Walk-Up Solution from Agilent

Precise Characterization of Intact Monoclonal Antibodies by the Agilent 6545XT AdvanceBio LC/Q-TOF

Key Words Q Exactive Focus, SIEVE Software, Biomarker, Discovery, Metabolomics

Important Information for MCP Authors

Confident Pesticides Analysis with the Agilent LC/Triple Quadrupole and TOF/QTOF Solutions

Clearly better LC/MS solutions

Metabolomics. Innovation with Integrity. Novel solutions for Metabolomics and high-throughput Phenomics. Metabolomics

AB SCIEX TripleTOF 5600 SYSTEM. High-resolution quant and qual

Agilent Food Profiling Solutions ASSESS FOOD QUALITY AND POINT OF ORIGIN

UNIFI: The user environment Ken Eglinton Nordic User Training, September 2013

Agenda: Metabolomics Society September 18, 2012

Pushing the Leading Edge in Protein Quantitation: Integrated, Precise, and Reproducible Protein Quantitation Workflow Solutions

Highly Confident Peptide Mapping of Protein Digests Using Agilent LC/Q TOFs

IMPORTANCE OF CELL CULTURE MEDIA TO THE BIOPHARMACEUTICAL INDUSTRY

Synthetic Biology: Complexity and Challenges

Improving Productivity with Applied Biosystems GPS Explorer

The Use of High Resolution Accurate Mass GC/Q-TOF and Chemometrics in the Identification of Environmental Pollutants in Wastewater Effluents

Informatics and High Resolution QTof MS. How can we ask better questions and get better answers in DMPK? Mark D. Wrona

Pathway analysis. Martina Kutmon Department of Bioinformatics Maastricht University

MRMPilot Software: Accelerating MRM Assay Development for Targeted Quantitative Proteomics

Increasing MS Sensitivity for Proteomics. ifunnel Technology for the QQQ and Q-TOF

An Integrated Approach to Metabolomics Studies: Discovery to Quantitation on a Single Platform

Quantification of Host Cell Protein Impurities Using the Agilent 1290 Infinity II LC Coupled with the 6495B Triple Quadrupole LC/MS System

Agilent Genomics Software Future Directions

MasterView Software. Turn data into answers reliable data processing made easy ONE TOUCH PRODUCTIVITY

LC/MS Based Quantitation of Intact Proteins for Bioanalytical Applications

QUANLYNX APPLICATION MANAGER

MRC-NIHR National Phenome Centre

Introduction to RNA-Seq in GeneSpring NGS Software

New Approaches to Quantitative Proteomics Analysis

SpectroDive NEXT GENERATION TARGETED PROTEOMICS. Integration of Ready-Made Panels Improved Workflow for Custom Panels

Automated Purification Software for Agilent OpenLAB CDS

Metabolite Identification Workflows: A Flexible Approach to Data Analysis and Reporting

Ingenuity Pathway Analysis (IPA )

Skyline & Panorama: Key Tools for Establishing a Targeted LC/MS Workflow

Application Note LCMS-87 Automated Acquisition and Analysis of Data for Monitoring Protein Conjugation by LC-MS Using BioPharma Compass

Basic principles of NMR-based metabolomics

The Agilent 6545 Q-TOF LC/MS System. The Answers You Need:

The Agilent 6545 Q-TOF LC/MS System. The Answers You Need:

Outline and learning objectives. From Proteomics to Systems Biology. Integration of omics - information

Capabilities & Services

Developing an LC-MS/MS method for hepcidin quantitative analysis

Metabolomics: Sifting Through Complex Samples

Fiehn GC/MS Metabolomics RTL Library IDENTIFY MORE METABOLITES

Robustness of the Agilent Ultivo Triple Quadrupole LC/MS for Routine Analysis in Food Safety

Transcription:

Experimental Design and MS Workflows for Omics Applications David A. Weil, Ph.D. Senior Application Scientist Schaumburg, IL David_Weil@agilent.com 1

Acknowledgements: Support Team AFO Metabolomics Team Mark Sartain, Sumit Shah, Anne Blackwell, Dan Cuthbertson Yuqin Dai Steve Madden Rick Reinsdorf MPP Support Team Santa Clara, CA Software Product Manager National Jewish, Denver, CO Global 2

Outline Basics of Experimental Design and Statistics in Omics Research Agilent s LCMS Omics Data Analysis Workflow Targeted, Untargeted, Pathway Directed Update What s New in MPP and Profinder Sample Correlation, Pathways, Metadata, etc 3

Statistics There are three kinds of lies: lies, damned lies and statistics- Benjamin Disraeli 98% of all statistics are made up Unknowns Statistics is the mathematics of impressions Ref: David Wishart, Informatics and Statistics for Metabolmoics July 8-9, 2013 at the Canadian Bioinformatics Workshops 4

Applications.. From Food, Fuel, Metabolism Page 5

Experimental design is critical for Omics experiments as.. All steps are critical from sample preparation to identification Variance in a step can lead to false positive conclusions Page 6

Why Are Statistics So Important For Omics Your hypothesis will drive your experimental design. Your statistical design is critical to answering the question. Have a statistical plan before starting. Your statistical plan will drive all other aspects of your experimental design. 7

Why Do Statistics in MPP? Statistics in MPP help us: Avoid jumping to conclusions. Avoid finding patterns in random data Understand multiple comparisons Consider alternative explanations. 8

What Will Statistics Do and Not Do? Will not: Give you absolute answers. Statistical conclusions give only probabilities. Give you scientific or clinical conclusions. It is your job to interpret the statistics and draw conclusions in context of the hypothesis. Will: Help you assess variability. Help you extrapolate from a sample to a population. Help you uncover relationships between variables. 9

Core Definitions For Statistics: Mean: The center of a population or probability distribution Arithmetic Mean: The sum of values divided by N µ=ʃx i /N Geometric Mean: All values multiplied together then raised the to 1/Nth power. Equivalent to arithmetic mean of log values, what MPP uses. Confidence Interval: Interval in which the mean is expected to lie with X% (often 95%) confidence. Median: The middle value in the dataset. More robust than mean when outliers are present. The median value is useful to baseline to in MPP. 10

Measures of Variability Standard Deviation (σ) : A measure of data dispersion around the mean. Three measurements with the same mean but different standard deviations Variance (σ 2 ): Standard deviation squared. Often partitioned in various sources. Coefficient of Variation (CV): A ratio measure of the standard deviation in relation to the mean. Useful for comparing scatter between variables. 11

Normalized Abundance Variables Experimental Parameters Examples: - Control vs. treated groups - Lots or batches of raw materials - Clinical phenotypes - Drug dose response and time course - Forced degradation time points Time Course and Dose Response 350 A b u n d a n c e 300 250 200 150 100 50 10 nanomolar 100 nanomolar 1000 nanomolar A B C Phenotype 0 0 hr 4 hr 8hr 12 hr 24 hr Time 12

Variables Non-Experimental Parameters Examples: - Column changes during experiment - Significant changes in sample prep Change in person doing the preparation Storage issues alteration of temperature, freeze/thaw cycle, etc. High complexity data is subject to seemingly minor changes in methods, protocols, sample handling, etc. Anything that could reasonably affect data should be considered carefully. 13

What Should You Consider When Determining Number of Replicates Type of replicate: - Technical Replicates: Instrument and/or Sample Prep ONLY tell you about the variability in the mean measurement of a single sample - Condition Replicates: Biological, Raw Material Lots,etc. Must be independently sampled from population Will tell you about variability in the population Statistical Power - Asks how many replicates you need to see the desired effect? - You must have a hypothesis: You must estimate the effect size or group means and their variability 14

What is Statistical Power? A measure of confidence in statistical results Key Terms to know α : Error in finding something statistically significant when the null hypothesis is true (set typically to desired p-value ie. 0.05) -False Positive rate β: Error in finding something not statistically significant when the null hypothesis is false False Negative rate Power = 1- β If your experiment is underpowered you cannot be confident in the differences you see (or don t see) 15

Sampling and Sample Preparation Sample Preparation is the primary source of non-experimental variance in differential analysis Starting sample amount Inconsistent pipetting Operator/ chemist Sample degradation and freeze thaw cycles Mixing and fractionation Reduce impact of non-experimental variance Careful method validation will help estimate variability Use controls and QCs to account for variability Normalization will only get you so far 16

No universal separation method exits to profile all the classes of metabolites in a single LC-MS run Metabolome Small Molecule Extract Nucleo(s/t)ides Sugars Amino Acids Lipids Polar/Hydrophilic Aqueous Normal Phase HILIC Mixed Mode Reverse Phase (with Ion Pairing) NonPolar/Hydrophobic Reverse Phase GC-MS Page 17

Chromatography Is A Source Of Variability. Variability in retention time and peak shape are sources of error Each phase and each compound has it s own inherent variability How often will you need to change columns during an experiment? Plan for enough columns! MPP can help with alignment and RT correction! RT 5.197 ~12% height D RT 5.390 D RT 0.193 18

How to avoid technical variability Use Randomization within Sample Plates Normalization Internal Standards in Sample Prep and Samples QA/QC Pooled Samples Model Studies to determine measurement variance Robust LCMS system key! 19

Accounting for Variability Normalization in MPP Normalization: External Scalars Osmolality, Protein Content, Cell Count, Etc. Algorithmic - Total Useful Current and Percentile Shift: Assumes that the sum concentration of all analytes is the same in every sample. - Quantile: Assumes that the distribution of intensities in the samples is the same Key Point: Each normalization makes different assumptions about the sample. Violating these assumptions can introduce more error than you hope to correct. Key Point: Normalizations can be very sensitive to zero values Key Point: Normalization is not always necessary 20

Just a brief overview. Lots of books and online learning.. 21

Data Acquisition & Analysis --- start with Mass Hunter Targeted Data Acquisition QQQ Known metabolites only Higher sensitivity than profiling approach Absolute quantitation Typically develop methods for tens/hundreds of metabolites Project results onto biological pathways for interpretation Untargeted Data Acquisition TOF / Q-TOF Acquire data and detect all compounds; relative quantitation. Targeted data mining: Pathway Directed Analyze data with a database of known compounds Project results onto pathways Untargeted data mining: Find all compounds Naïve data mining: Discovery based approach Track metabolites using mass, or mass spectra and retention time 22

LC-QQQ Targeted Metabolomics Workflow 1 Optimize LC/MRMs for metabolite standards using LC/QQQ and Optimizer 2 3 Acquire MRM data using LC/QQQ and Study Manager Quantitation of metabolites using MassHunter Quantitative Analysis exporting project into MPP 4 5 Statistical analysis using Mass Profiler Professional Choose pathway database and species Analyze differential data and visualize the results on pathways 23

Targeted Metabolomics Quant Software Batch processing of samples Easy construction of calibration curves Automate quant analysis and reporting Compound centric review for manual editing Import results into MPP Compound identifier included with results for pathway analysis MassHunter Quantitative Analysis 24

Pathway Targeted Untargeted Untargeted/Pathway Directed Separate & Detect Feature Finding Alignment & Statistics Identify Pathways LC-TOF/QTOF Profinder Mass Profiler Professional ID Browser Pathway Architect LC-TOF/QTOF Profinder Mass Profiler Professional Pathway Architect 25

MassHunter Profinder Software MassHunter Profinder is a productivity tool for processing multiple samples in metabolomics, proteomics, intact protein analyses Fast Compound Finding Untargeted using MFE Targeted using Find by Formula Visualize, review, and edit results by compound across many samples Higher quality results based on cross-sample processing Minimizes false positive and negative results Batch Processing Compound group level information Details for a single compound Stacked/ove Group 1 rlaid EICs Group 2 Group 3 Group 4 MS spectra 26

Pathway Directed Pathways to PCDL Convert pathway metabolite information into Agilent personal compound database Pathway database source - WikiPathways, BioCyc and KEGG Select one to many pathways Removes redundant metabolites Adds compound information Formula, Compound ID(s), Name, Structure Can link to METLIN PCDL to add compound information Retention time or MS/MS spectra 27

Increasing Your Confidence in Compound Identification Accurate Mass (AM) AM + Isotope Pattern (IP) AM + Retention Time (RT) + IP MS/MS Library AM + RT + MS/MS Library Confident compound identification is crucial for pathway visualization! 28

Identification With An MS/MS Library Match METLIN Personal Compound Database & Library Q-TOF based LC-MS/MS library Compounds in database - 64092 Compounds with MS/MS - 8040 Collected in ESI +/- modes Selected monoisotopic ion Collected at three collision energies Spectra reviewed for quality The Only Commercially Available Database and MS/MS Library for Metabolomics 29

Identification Without An MS/MS Library Match Molecular Structure Correlator (MSC) Confirm proposed structures in minutes Uses accurate mass MS/MS data Facilitates compound identification Search compounds in a database, i.e. PCDL or ChemSpider Assigns fragment ions to substructures of a proposed parent structure Provides a score to estimate goodness of match to MS/MS data Molecular Structure Correlator (MSC) 30

Mass Profiler Professional (MPP) 13: Statistical Analysis and Visualization Software Designed for Mass Spectrometry data from multiple platforms Can Import, Store, and Visualize Agilent LC/MS Q(TOF), and QQQ Agilent GC/MS Quad, QQQ, and QTOF Agilent ICP/MS and NMR (Craft) Generic file format import Extensive statistical analyses tools ANOVA, Clustering, PCA, Fold-change, Volcano plots Correlation Analysis, including multi- omic correlation! ID Browser for compound identification Integrated Biology Omics Pathway Architect for biological contextualization NEW! KEGG Pathways New Meta Data

Examples of MPP Data Processing 32

Pathway Architect: Changing Data Into A Pathway Visualization Start here Finish here

Pathway Architect 13: Canonical Pathway Data Mapping and Visualization Central Carbon Metabolism Browse, filter, and search Analyze one or two types of omic data Supports biological pathways from publicly available databases WikiPathways BioCyc KEGG Supported formats BioPAX3 Pathway Commons, Reactome NCI Nature Pathway GPML PathVisio custom drawing Export compound list from pathways Easy Mining of Complex Pathways for Biological Understanding

New in Pathway Architect 13: KEGG Pathways Curated biological information available from KEGG Download from Agilent server Integrated directly in Pathway Architect KEGG license required

Integrating Genomics and Metabolomics Data Amino Acid Metabolism 37

What s New in MPP 13.0 and Profinder 38

MPP 13 Supports Meta Data Analytical Results to Meta Data Metadata can be displayed as Heat maps Colored strips Graphical plots Discrete plots Text Visualize the significant relationships! Maps observable sample information to analytical results Provides flexible visualization of metadata (e.g. time points, growth conditions, patient data etc.) to facilitate interpretation 39

Correlation Analysis MPP 13 adds support for Correlation Analysis. Researchers can view relationships between entities (compounds or proteins) or samples. Clicking on a cell of a heat map, they can quickly view the specific parameters of the correlation 40

MPP Correlation Analysis Sample to Sample ph 9 2 7 Sample correlation Heatmap indicates sample relationships both within a particular ph and between samples of different ph 41

MPP Correlation Analysis Entity to Entity Supported comparisons: Within an experiment Between experiments (genes and proteins, proteins and metabolites etc.) Clicking on any square brings up the regression analysis 42

What s New in Profinder B.06.00 SP1 Support for Intact Proteins Profinder B.06.00 SP1 has added the Large Molecular Feature Extraction (LMFE) algorithm, which enables profiling of intact proteins using multiply charged mass spec data. Now small molecule, peptide, and intact proteins can be processed in the same program! 43

Cloud Deployment GeneSpring MPP was internally and beta tested using a cloud configuration including Amazon Web Services (AWS) Server specifications (e.g. CPU), many of which are configurable on request An upcoming technical overview will discuss how to configure a cloud or virtual machine deployment and advantages like flexibility and collaboration GeneSpring MPP running on Amazon Web Services (AWS) 44

Customer training courses are bundled with new licenses and upgrades: R2260A Mass Profiler Professional (MPP) Workshop R3904A Pathway Architect Workshop The MPP Supplemental DVD contains: Workflow Guides on Metabolomics, Lipidomics, Integrated Biology and Class Prediction ( ) Video Tutorials on using MPP including Advanced Operations and Class Prediction ( 45

Agilent Solutions for Metabolomics A broad family of products (LC/MS, GC/MS and NMR) to meet your needs Agilent TOF technology is most-suited for untargeted metabolomics High mass accuracy and mass stability Profinder Wide dynamic is the most range advanced with good software resolution for and high high quality isotope ratio feature fidelity finding METLIN Highest is sensitivity the only commercially for low level metabolite available MS/MS detection library for metabolites MSC assists ID of MS/MS spectra without library matches Software designed to analyze metabolomics data Pathway compound ID mapping using automated Agilent BridgeDB Ability to visualize metabolomics data in pathways 46

47

48