Diagnostic visualizations for normalization of gene expression microarray data. Visualization of LC-MS/MS data and quality scores


nature methods

Visualization of omics data for systems biology

Nils Gehlenborg, Seán I O'Donoghue, Nitin S Baliga, Alexander Goesmann, Matthew A Hibbs, Hiroaki Kitano, Oliver Kohlbacher, Heiko Neuweger, Reinhard Schneider, Dan Tenenbaum & Anne-Claude Gavin

Supplementary figures and text:

Supplementary Figure 1: Diagnostic visualizations for normalization of gene expression microarray data
Supplementary Figure 2: Visualization of LC-MS maps
Supplementary Figure 3: Visualization of LC-MS/MS data and quality scores
Supplementary Table 1: Visualization tools for omics measurement data

Supplementary Figure 1. Diagnostic visualizations for normalization of gene expression microarray data

(a) MA-plots before (left) and after (right) LOWESS within-array normalization for data from a two-color array. (b) Box-and-whisker plots demonstrating the effects of LOWESS within-array normalization (center) and quantile between-array normalization (right). The original data are shown on the left. The data for Array1 are also shown in the MA-plots.
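The quantities behind these plots can be illustrated with a minimal sketch. It assumes the usual MA-plot convention for two-color arrays (M = log2 R - log2 G, A = the mean of log2 R and log2 G) and a basic rank-mean form of quantile normalization. The function names are hypothetical, ties are broken by position for simplicity, and the LOWESS fit itself is not implemented here.

```python
import math


def ma_values(red, green):
    """Return (M, A) for paired two-color intensities (all values > 0).

    M is the log2 ratio and A the mean log2 intensity of each spot.
    """
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    arrays: list of equal-length lists, one list per array.
    Assumption: ties are broken by position rather than averaged.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(values) for values in arrays]
    # Mean of the k-th smallest value across all arrays.
    rank_means = [
        sum(values[k] for values in sorted_arrays) / len(arrays)
        for k in range(n)
    ]
    normalized = []
    for values in arrays:
        order = sorted(range(n), key=lambda i: values[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After quantile normalization every array contains the same set of values, which is why the boxes in a box-and-whisker plot line up across arrays.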

Supplementary Figure 2. Visualization of LC-MS maps

(a) LC-MS data from a proteomics experiment displayed as a color-coded 2D plot, together with projections of the map onto the retention time and mass-to-charge axes, and (b) as a 3D plot. Clearly recognizable are the sets of pairs showing distinct elution profiles (along the retention time axis) and isotope patterns (along the mass-to-charge ratio axis). Visualizations created with TOPPView (ref. 1).
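As a rough illustration of what projecting a map onto its axes means, the sketch below sums peak intensities per retention time and per m/z bin. This is an assumption made for illustration, not a description of how TOPPView computes its projections; the function name, the peak tuple layout, and the bin width are all hypothetical.

```python
import math
from collections import defaultdict


def project_map(peaks, mz_bin_width=1.0):
    """Project an LC-MS map onto its two axes by summing intensities.

    peaks: iterable of (retention_time, mz, intensity) tuples.
    Returns (rt_projection, mz_projection) as dicts keyed by
    retention time and by the lower edge of each m/z bin.
    """
    rt_projection = defaultdict(float)
    mz_projection = defaultdict(float)
    for rt, mz, intensity in peaks:
        # Summing over m/z at one retention time gives a chromatogram-like trace.
        rt_projection[rt] += intensity
        # Summing over retention time within an m/z bin gives a summed spectrum.
        bin_start = math.floor(mz / mz_bin_width) * mz_bin_width
        mz_projection[bin_start] += intensity
    return dict(rt_projection), dict(mz_projection)
```

For example, `project_map([(10.0, 500.2, 100.0), (10.0, 500.7, 50.0)])` accumulates both peaks into one retention-time entry and one m/z bin.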

Supplementary Figure 3. Visualization of LC-MS/MS data and quality scores

Peak intensities are encoded as grayscale values and plotted in a retention time (t) versus m/z coordinate system. The square markers represent MS/MS spectra used for identification purposes. Blue squares represent MS/MS spectra that were either not identified as peptides or identified as peptides with a p-value lower than 0.5. All other markers represent peptide identifications and their corresponding p-values. Note that the program was set up to ignore all peptide identifications with a p-value below 0.5; this is reflected in the range of the color scale for the p-value. Image produced with Pep3D (ref. 2).
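The encoding described in this caption can be sketched as two small mappings: intensity to a gray level, and identification score to a marker style with the 0.5 cutoff. This is a hypothetical illustration, not Pep3D code; the linear intensity scaling, the 0 to 255 range, and the function names are assumptions.

```python
def gray_level(intensity, max_intensity):
    """Map an intensity linearly to a gray level from 0 to 255.

    Assumption: linear scaling; which end of the range is drawn dark
    is left to the plotting code.
    """
    if max_intensity <= 0:
        return 0
    fraction = min(max(intensity / max_intensity, 0.0), 1.0)
    return round(255 * fraction)


def marker_style(p_value, threshold=0.5):
    """Classify an MS/MS spectrum marker.

    Returns ("blue", None) for spectra with no identification or a
    score below the threshold, otherwise ("scaled", position), where
    position in [0, 1] locates the score on a color scale that spans
    threshold..1.0.
    """
    if p_value is None or p_value < threshold:
        return ("blue", None)
    position = (p_value - threshold) / (1.0 - threshold)
    return ("scaled", position)
```

Because scores below the cutoff are never drawn on the color scale, the scale only needs to cover the range from the threshold to 1.0, which matches the restricted range noted in the caption.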

Supplementary Table 1. Visualization tools for omics measurement data

Name | Cost | OS | Data | Description

Stand-alone
Expression Console (Affymetrix) | Free | Win | A | Low-level analysis of Affymetrix array data; diagnostic plots
Insilicos Viewer (Insilicos) | Free | Win | M | Data viewer; mass spectrum and chromatogram visualizations
GenePix Pro (Molecular Devices) | $ | Win | A | Image acquisition of microarray slides; basic visualizations of raw data
MetaboMiner (ref. 3) | Free | Win, Mac, Linux | N | Tool for metabolite identification; visual inspection of matched peaks
msInspect/Qurate (ref. 4) | Free | Win, Mac, Linux | M | Quantitative mass spectrometry data; chromatogram and LC-MS map
MZmine 2 (ref. 5) | Free | Win, Mac, Linux | M | Full analysis platform; most standard visualizations; PCA scatter plots
Pep3D (ref. 2) | Free | Win, Mac, Linux | M | Mass spectrum plots, LC-MS maps; integration of statistical analyses
Prequips (ref. 6) | Free | Win, Mac, Linux | M | Mass spectrum plots, chromatograms and LC-MS maps
ProteoWizard/SeeMS (ref. 7) | Free | Win | M | Collection of tools; mass spectrum plots and LC-MS maps
TOPPView* (ref. 1) | Free | Win, Mac, Linux | M | Mass spectrum plot, chromatogram, 2D and 3D LC-MS maps

R/Bioconductor packages
affy (ref. 8) | Free | Win, Mac, Linux | A | Exploratory analysis of Affymetrix array data; several diagnostic plots
affycomp (ref. 9) | Free | Win, Mac, Linux | A | Comparison of Affymetrix array data, e.g. with MA-plots
arrayQualityMetrics* (ref. 10) | Free | Win, Mac, Linux | A | MA-plots, intensity density plots and spatial distribution plots; reports
edgeR (ref. 11) | Free | Win, Mac, Linux | S | Estimation and testing for differential expression; diagnostic plots
limma (ref. 12) | Free | Win, Mac, Linux | A | Analysis with linear models; measurement-level diagnostic plots
ShortRead (ref. 13) | Free | Win, Mac, Linux | S | Processing and evaluation of short read sequencing data

Web-based
MeltDB* (ref. 14) | Free | | M | Extensive analysis platform; mass spectrum plots, heatmaps, pathways
MetaboAnalyst (ref. 15) | Free | | M | Extensive analysis platform; PCA scatter plots, heatmaps and more

Some of the tools in this table have capabilities similar to tools that are listed in other tables. To avoid listing tools in more than one table, we assigned tools to tables based on what we understand to be their primary purpose.

Abbreviations: An asterisk (*) means the tool is recommended. Free means the tool is free for academic use. Win refers to Microsoft Windows and Mac refers to Mac OS X; tools running on Linux usually also run on other versions of Unix. A = oligonucleotide microarray data, M = mass spectrometry data, S = deep sequencing data, N = nuclear magnetic resonance data.

References

1. Sturm, M. and Kohlbacher, O. TOPPView: an open-source viewer for mass spectrometry data. J Proteome Res 8 (7) (2009).
2. Li, X. J. et al. A tool to visualize and evaluate data obtained by liquid chromatography-electrospray ionization-mass spectrometry. Anal Chem 76 (13) (2004).
3. Xia, J., Bjorndahl, T. C., Tang, P., and Wishart, D. S. MetaboMiner--semiautomated identification of metabolites from 2D NMR spectra of complex biofluids. BMC Bioinformatics 9, 507 (2008).
4. May, D., Law, W., Fitzgibbon, M., Fang, Q., and McIntosh, M. Software platform for rapidly creating computational tools for mass spectrometry-based proteomics. J Proteome Res 8 (6) (2009).
5. Katajamaa, M., Miettinen, J., and Oresic, M. MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 22 (5) (2006).
6. Gehlenborg, N. et al. Prequips - an extensible software platform for integration, visualization and analysis of LC-MS/MS proteomics data. Bioinformatics 25 (2009).
7. Kessner, D., Chambers, M., Burke, R., Agus, D., and Mallick, P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 24 (21) (2008).
8. Gautier, L., Cope, L., Bolstad, B. M., and Irizarry, R. A. affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20 (3) (2004).
9. Irizarry, R. A., Wu, Z., and Jaffee, H. A. Comparison of Affymetrix GeneChip expression measures. Bioinformatics 22 (7) (2006).
10. Kauffmann, A., Gentleman, R., and Huber, W. arrayQualityMetrics--a Bioconductor package for quality assessment of microarray data. Bioinformatics 25 (3) (2009).
11. Robinson, M. D., McCarthy, D. J., and Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26 (1) (2010).
12. Smyth, G. K., Yang, Y. H., and Speed, T. Statistical issues in cDNA microarray data analysis. Methods Mol Biol 224 (2003).
13. Morgan, M. et al. ShortRead: a Bioconductor package for input, quality assessment and exploration of high-throughput sequence data. Bioinformatics 25 (19) (2009).
14. Neuweger, H. et al. MeltDB: a software platform for the analysis and integration of metabolomics experiment data. Bioinformatics 24 (23) (2008).
15. Xia, J., Psychogios, N., Young, N., and Wishart, D. S. MetaboAnalyst: a web server for metabolomic data analysis and interpretation. Nucleic Acids Res 37 (Web Server issue), W (2009).