Application Note TOF/MS

New Level of Confidence for Protein Identification: Results Dependent Analysis and Peptide Mass Fingerprinting Using the 4700 Proteomics Discovery System

Purpose

The Applied Biosystems 4700 Proteomics Discovery System employs novel Results Dependent Analysis (RDA) features to trigger MS/MS analysis of peptides based on single-stage MS peptide mass fingerprinting (PMF) results. This type of targeted analysis saves time, conserves sample, and increases confidence in identification results. In this application note, protein identifications obtained from MS-acquired data are used to automatically select matching peptides for MS/MS to confirm the top proteins identified or, alternatively, to automatically select non-matching peptides to identify low-abundance constituents. By automating this targeted analysis, RDA software increases both the efficiency of PMF and the confidence in its results.

Figure 1. GPS Explorer software provides features to remotely submit acquisition jobs to the 4700 Proteomics Analyzer. Using RDA software, the user specifies the confidence interval for protein identification and the number of matching or non-matching peptides for which MS/MS spectra are to be acquired.

Overview

Peptide mass fingerprinting identifies proteins by database search, using molecular weight information obtained from peptides derived by proteolytic digestion of a protein. Given the known specificity of the enzyme, many proteins can be identified uniquely by the masses of their proteolytic peptides. This approach is applied primarily to samples derived from in-gel digestion of proteins separated electrophoretically on 1D or 2D gels. However, when multiple proteins are isolated in one sample, PMF may not yield correct or complete results, because peptides from different proteins are present in the same spectrum. Other factors, such as limited mass accuracy or post-translational modifications, can also make PMF results questionable and in need of further validation.
Tandem mass spectrometry (MS/MS), intelligently applied as the result of initial PMF analyses, can be used to confirm PMF results or identify components that were initially missed by PMF. This RDA software functionality is now available with the 4700 Proteomics Discovery System.
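The PMF principle described in this overview can be illustrated with a minimal Python sketch. It is not the MASCOT algorithm, which uses probability-based scoring; it only counts observed peaks that fall within a mass tolerance of theoretical tryptic peptide masses. The function names, the 50 ppm tolerance, the minimum peptide length, and the no-missed-cleavage rule are all assumptions made for illustration.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # singly protonated ion, as typically observed in MALDI


def tryptic_peptides(sequence, min_length=5):
    """Cleave after K or R, but not before P; no missed cleavages (assumption)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_length]


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def rank_by_matches(observed_mz, database, tol_ppm=50.0):
    """Rank database proteins by the number of observed peaks they explain."""
    ranking = []
    for name, sequence in database.items():
        theoretical = [mh_plus(p) for p in tryptic_peptides(sequence)]
        matched = sum(
            1 for mz in observed_mz
            if any(abs(mz - t) / t * 1e6 <= tol_ppm for t in theoretical)
        )
        ranking.append((name, matched))
    ranking.sort(key=lambda item: item[1], reverse=True)
    return ranking


# Toy database of two short, hypothetical sequence fragments.
toy_db = {
    "A": "MKWVTFISLLLLFSSAYSRGVFRRDTHKSEIAHRFK",
    "B": "GLSDGEWQQVLNVWGKVEADIAGHGQEVLIR",
}
# Simulate a spectrum containing only the peptides of protein A.
observed = [mh_plus(p) for p in tryptic_peptides(toy_db["A"])]
ranking = rank_by_matches(observed, toy_db)
```

Because this simple count treats every peak as belonging to one protein, a mixture can produce coincidental matches to an unrelated database entry, which is the failure mode that MS/MS confirmation is meant to catch.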
Key Features

- Automated, results-driven MS/MS analysis resulting from intelligent evaluation of MS-based PMF results.
- Optimization of sample consumption by acquiring MS/MS spectra only from peptides that are likely to provide confirmation of high-confidence PMF results.
- Comprehensive display and reporting of mass spectral data and database search results.
- Use of the industry-standard MASCOT database search engine.
- Multiple-user access to data and results, with the ability to set up and submit analyses to the 4700 Proteomics Analyzer using remote client software.

Figure 2. Peptide mass fingerprinting results show twelve peptides matching peptides from RHSC protein precursor. This protein was not included in the equimolar mixture of tryptic digests prepared for this experiment. Further MS/MS analysis was required to establish that this peptide mass fingerprinting result was erroneous. Matched peptides are shown in red and unmatched peptides in blue.

Experimental Conditions

In by-spot PMF analyses, the user deposits samples (manually or with the aid of robotic equipment) on individual spots on the MALDI plate. Each spot contains peptides obtained from a single protein or a small number of proteins, for example from in-gel digestion of a spot or band excised from a 1D or 2D gel. MS molecular weight data are acquired for each spot. The user sets up mass spectral acquisitions either directly on the 4700 Proteomics Analyzer using 4700 Explorer software, or through GPS Explorer protein application software, which can automate peak list importation. A database search returns protein identifications based on PMF, each with its own degree of confidence.

Following PMF, the user can set up automated Results Dependent Analyses using GPS Explorer software, by instructing the software to generate lists of precursor masses for MS/MS analysis based on the top protein identifications.
Alternatively, GPS Explorer software can generate a mass list for MS/MS that includes only the peptide masses that do not match any identified protein. Both options can be selected simultaneously or in sequence.

The user interface for a Results Dependent Analysis is shown in Figure 1. The user specifies a protein score confidence interval above which protein identifications are considered significant. Although a 95% confidence interval is typically used as the cutoff separating statistically significant results from random hits, the value is user-selectable to allow for different statistical considerations arising from the use of specialized databases.

For the Top Protein Hit Confirmation option, the user selects how many peptide precursor ions, from those matched by PMF to each protein with a score greater than that specified in the Protein Score C.I. % text box, are to be selected and submitted to the 4700 Proteomics Analyzer for MS/MS data acquisition.

For further investigation of peptides that were not matched to proteins in the database search, the user can select for MS/MS analysis those peptide precursor ions that have not been matched by PMF to any protein with a score greater than or equal to that specified. The maximum number of precursors and a minimum signal-to-noise ratio are specified by the user, depending on the characteristics of the particular sample.
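The two selection modes described above can be sketched in Python as follows. This is an illustrative reconstruction from the description in this note, not GPS Explorer code: the `Peak` and `ProteinHit` types, the function names, and the default values are hypothetical, and ranking candidates by signal-to-noise ratio is an assumption standing in for "strongest signal".

```python
from dataclasses import dataclass


@dataclass
class Peak:
    mz: float
    signal_to_noise: float


@dataclass
class ProteinHit:
    name: str
    score_ci_percent: float   # protein score confidence interval, in percent
    matched_peaks: list       # Peak objects matched to this protein by PMF


def select_confirmation_precursors(hits, ci_cutoff=95.0, per_protein=3):
    """Top Protein Hit Confirmation: for each hit scoring above the cutoff,
    pick the strongest PMF-matched peaks as MS/MS precursors."""
    selected = {}
    for hit in hits:
        if hit.score_ci_percent > ci_cutoff:
            strongest = sorted(hit.matched_peaks,
                               key=lambda p: p.signal_to_noise, reverse=True)
            selected[hit.name] = strongest[:per_protein]
    return selected


def select_unmatched_precursors(all_peaks, hits, ci_cutoff=95.0,
                                max_precursors=5, min_sn=10.0):
    """Pick peaks not matched to any hit at or above the cutoff, subject to
    a minimum signal-to-noise ratio and a maximum precursor count."""
    matched_mz = {p.mz
                  for h in hits if h.score_ci_percent >= ci_cutoff
                  for p in h.matched_peaks}
    candidates = [p for p in all_peaks
                  if p.mz not in matched_mz and p.signal_to_noise >= min_sn]
    candidates.sort(key=lambda p: p.signal_to_noise, reverse=True)
    return candidates[:max_precursors]


# Hypothetical peak list and PMF hits, for illustration only.
peaks = [Peak(1000.5, 50), Peak(1200.6, 30), Peak(1300.7, 80),
         Peak(1500.8, 5), Peak(1600.9, 40), Peak(1700.1, 20)]
hits = [ProteinHit("P1", 97.7, peaks[:3]),
        ProteinHit("P2", 60.0, [peaks[4]])]

to_confirm = select_confirmation_precursors(hits, per_protein=2)
to_explore = select_unmatched_precursors(peaks, hits)
```

In this toy input, the peak matched only to the low-confidence hit P2 remains eligible for the unmatched list, because it has not been matched to any protein scoring at or above the cutoff.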
Results and Discussion

The RDA software features outlined earlier are illustrated in the following examples, using known proteins. In the first example, an equimolar mixture of tryptic digests from two proteins, glutathione S-transferase and ovotransferrin, was deposited onto a spot on the MALDI target, and MS data were acquired with the 4700 Proteomics Analyzer. PMF analysis of the MS data, which comprised the peptide signals from both proteins, identified the RHSC precursor protein with very high confidence (C.I. 97.7%), even though this protein had not been introduced into the mixture. By coincidence, peptide masses from both proteins in the mixture matched peptides from the RHSC precursor protein with a statistically significant score. In the analysis of an unknown sample, a protein identified by PMF with this confidence would likely be accepted as the correct match if confirmation by MS/MS were not available.

The GPS Explorer software Results Browser interface for this experiment is shown in Figure 2. Note that more than a dozen peptides from the two different proteins in this sample matched peptides from the RHSC protein precursor. In this case, the Results Dependent Analysis was set up to confirm the top protein hit, exactly as one might do with any unknown sample, by automatically acquiring MS/MS spectra of the three peptides with the strongest signals among those matched to the RHSC protein precursor sequence in the PMF experiment. After the MS/MS data were acquired, database searches were carried out using the MS/MS spectra from each of these peptides. One was matched to glutathione S-transferase and two to ovotransferrin, all with very high confidence. These were, indeed, the two proteins that had been deposited on the spot analyzed. The protein identified erroneously by PMF was removed from consideration, because no MS/MS data that could be matched to its sequence were obtained. The GPS Explorer software Results Browser interface for the Top Protein Hit Confirmation experiment is shown in Figure 3.

Figure 3. Results obtained using Top Protein Hit Confirmation correctly identified ovotransferrin and glutathione S-transferase from MS/MS results. In this example, a maximum of only three peptides were selected, and of these, two matched ovotransferrin and one matched glutathione S-transferase. This targeted selection for MS/MS can greatly reduce experiment time and rapidly provide reliable identifications.

PMF analyses of single-component samples or simple protein mixtures, using high-quality MS data acquired on well-calibrated instruments, usually yield correct protein identifications, but as the example above shows, this is not always the case. The Top Protein Hit Confirmation feature of the Results Dependent Analysis, which much of the time would simply confirm correct PMF results, was used in this case to correct an erroneous PMF result.

The RDA software feature of GPS Explorer software can also be used to identify peptides that cannot be matched to a protein with high confidence. Again, the user may specify the maximum number of precursors and a minimum signal-to-noise ratio, depending on the characteristics of the sample. In the example shown in Figure 4, a hemoglobin sample containing both the alpha and beta hemoglobin chains was digested with trypsin and deposited onto the MALDI target. PMF analysis identified only the beta chain with high confidence, leaving a number of peptides not matched to any protein.
Five precursor ions of peptides that had not been matched with high confidence to any protein in the PMF experiment were automatically submitted to the 4700 Proteomics Analyzer. The MS/MS spectra were then acquired and used for database searching. These searches identified the alpha chain of hemoglobin with very high confidence, as illustrated in Figure 5.

Figure 4. Peptide mass fingerprinting results for a trypsin-digested sample containing both alpha and beta chain hemoglobin. In this result, only the beta chain hemoglobin was identified with high confidence. Matched peptides are shown in red. RDA software acquired MS/MS data automatically on the unmatched peptides and correctly identified alpha chain hemoglobin (see Figure 5).

Figure 5. Subsequent MS/MS analysis of unmatched peptides resulted in high-confidence identification of the alpha chain component of a tryptic digest mixture of alpha and beta chain hemoglobin. MS/MS acquisition was driven by the PMF results: the software automatically selected peptide ions in the spectrum that were not associated with the identified protein.

Conclusions

The Results Dependent Analysis features of the 4700 Proteomics Discovery System allow the user to leverage the high-performance tandem mass spectrometry capabilities of the 4700 Proteomics Analyzer to confirm with high confidence the results of peptide mass fingerprinting experiments, and to identify proteins that may be represented by too few peptides for PMF to yield reliable results. In addition, confirmation by MS/MS allows erroneous PMF results, which can sometimes be obtained with high confidence, to be identified and excluded. All of this is automated, resulting in ease of use and significant economy in sample consumption, since only those ions that are likely to yield useful confirmatory results are selected for MS/MS acquisition.
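The exclusion of an erroneous PMF hit, as in the RHSC example, amounts to comparing the PMF protein list with the proteins identified from MS/MS spectra. The Python sketch below illustrates that comparison under simplifying assumptions: the function name and data layout are hypothetical, the precursor labels are placeholders rather than measured masses, and any PMF hit without at least one supporting MS/MS identification is treated as unconfirmed.

```python
def reconcile_with_msms(pmf_hits, msms_identifications):
    """Compare PMF protein hits with proteins identified by MS/MS.

    pmf_hits: list of protein names reported by PMF.
    msms_identifications: dict mapping a precursor label to the protein
        name identified from that precursor's MS/MS spectrum.
    Returns (confirmed, unconfirmed, newly_identified).
    """
    msms_proteins = set(msms_identifications.values())
    confirmed = [p for p in pmf_hits if p in msms_proteins]
    unconfirmed = [p for p in pmf_hits if p not in msms_proteins]
    newly_identified = sorted(msms_proteins - set(pmf_hits))
    return confirmed, unconfirmed, newly_identified


# First example from this note; precursor labels are placeholders.
pmf_hits = ["RHSC protein precursor"]
msms_identifications = {
    "precursor_1": "ovotransferrin",
    "precursor_2": "ovotransferrin",
    "precursor_3": "glutathione S-transferase",
}
confirmed, unconfirmed, newly_identified = reconcile_with_msms(
    pmf_hits, msms_identifications)
```

Here the RHSC protein precursor ends up in the unconfirmed list, while ovotransferrin and glutathione S-transferase appear as new identifications, mirroring the outcome reported for Figure 3.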
For Research Use Only. Not for use in diagnostic procedures.

AB (Design), Applera, iscience, iscience (Design), Explorer, and RDA are trademarks and Applied Biosystems is a registered trademark of Applera Corporation or its subsidiaries in the US and/or certain other countries. MASCOT is a registered trademark of Matrix Science Ltd.

© 2003 Applied Biosystems. All rights reserved. Information subject to change without notice. Printed in the USA, 11/2003, LD Publication 115AP19-01