An Improved Approach for Glycan Structure Identification from HCD MS/MS Spectra


Weiping Sun, Yi Liu, Gilles Lajoie, Bin Ma and Kaizhong Zhang
Department of Computer Science, Western University
October 3

Overview
1. Introduction
2. Main Method
3. Experiments and Discussion

Glycosylation
- One of the most important post-translational modifications (PTMs)
- 70% of human proteins are glycosylated

Types of Glycosylation
- N-linked glycosylation
  - Attached to N (Asn)
  - Motif; core structure
- O-linked glycosylation
  - Linked to S/T or hydroxylysine residues

MS-Based Glycoproteomic Analysis [figure]

Tandem Mass Spectrometry
Three common fragmentation techniques:
- CID: B- and Y-ions
- HCD: B- and Y-ions, as well as A- and X-ions
- ETD/ECD: C- and Z-ions

Glycan Identification from HCD Spectrum [figure]

Computational Approaches for Spectral Data Interpretation
Database search
- Searches a glycan database to find matching glycan candidates.
- Examples: GlycoSearchMS, GlycoWorkBench, MAGIC, GlycoMaster DB, etc.
vs.
De novo sequencing
- Does not rely on glycan database knowledge; instead, the algorithms construct glycans directly from the MS/MS spectrum.
- Examples:
  - Glycan: STAT, GlyCH, StrOligo, etc.
  - Glycopeptide: GlycoMaster, etc.

Motivations
- De novo sequencing needs high-quality mass spectra.
- Database search can obtain more reliable results.
- Our previous de novo sequencing method can at least provide useful structures.

Glycan Database Search Problem
Glycan: a labelled, rooted, unordered tree with bounded degree.
Glycan database search problem:
Input:
- An MS/MS spectrum M
- A glycan database D
- A predefined mass error tolerance δ
Output: a glycan structure T in D such that
- $\left| \lVert T \rVert + \lVert P \rVert + \lVert \mathrm{H_2O} \rVert + 1 - M_p \right| \le \delta$;
- the matching score between M and T is maximized.
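The mass constraint in the problem statement can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Glycan` record, the function name `mass_candidates`, and the use of the monoisotopic water and proton masses (the slide writes the last term simply as "+ 1") are assumptions made for illustration, as is treating P as the peptide mass and M_p as the singly charged precursor mass.

```python
from dataclasses import dataclass

WATER = 18.010565   # monoisotopic mass of H2O in Da
PROTON = 1.007276   # proton mass in Da (assumed reading of the slide's "+ 1")


@dataclass(frozen=True)
class Glycan:
    """Hypothetical database record: only the fields this sketch needs."""
    name: str
    mass: float  # summed residue mass of the glycan in Da


def mass_candidates(database, peptide_mass, precursor_mass, delta):
    """Keep glycans T with |mass(T) + peptide + H2O + proton - precursor| <= delta."""
    hits = []
    for glycan in database:
        predicted = glycan.mass + peptide_mass + WATER + PROTON
        if abs(predicted - precursor_mass) <= delta:
            hits.append(glycan)
    return hits
```

Only the glycans that pass this filter would go on to be scored against the spectrum, which is the second output condition.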

Main Idea
Use the de novo sequencing result to filter the glycans selected from the database.

Step 1: Peptide mass calculation [figure]

Step 2: Glycan candidate selection and raw score calculation
1. Calculate the glycan mass.
2. Screen the glycan database for possible glycan candidates.
3. Calculate their raw score:
$S_{raw} = \alpha \sum_i f(m_{B_i}, h_{B_i}) + \beta \sum_j f(m_{Y_j}, h_{Y_j}) + \theta \sum_k f(m_{I_k}, h_{I_k})$
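A weighted score of this shape can be sketched in Python as follows. The slide does not define f, the weights, or the peak-matching rule, so everything specific here is an assumption: `f` is taken to be a log-scaled intensity, a theoretical ion is matched to the most intense observed peak within a tolerance `tol`, the I-type ions are read as a third ion class (possibly internal fragments), and the default values of `alpha`, `beta`, and `theta` are placeholders.

```python
import math


def f(mz, intensity):
    """Hypothetical peak contribution: log-scaled intensity (mz is unused here)."""
    return math.log1p(intensity)


def match_peaks(theoretical_mzs, spectrum, tol):
    """For each theoretical m/z, pick the most intense observed peak within tol."""
    matched = []
    for target in theoretical_mzs:
        near = [(mz, h) for mz, h in spectrum if abs(mz - target) <= tol]
        if near:
            matched.append(max(near, key=lambda peak: peak[1]))
    return matched


def raw_score(spectrum, b_ions, y_ions, i_ions,
              alpha=1.0, beta=1.0, theta=0.5, tol=0.02):
    """Weighted sum of f over matched B-, Y-, and I-type peaks."""
    total = 0.0
    for weight, ions in ((alpha, b_ions), (beta, y_ions), (theta, i_ions)):
        total += weight * sum(f(mz, h) for mz, h in match_peaks(ions, spectrum, tol))
    return total
```

Here `spectrum` is a list of `(m/z, intensity)` pairs and each ion list holds the theoretical m/z values for one candidate glycan.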

Step 3: Filtration
A list of de novo sequencing glycans: $L_n = \{R_1, R_2, \ldots, R_m\}$
A list of database glycans: $L_d = \{Q_1, Q_2, \ldots, Q_n\}$
$S_{comp}(Q_i, R_j) = S_{align}(Q_i, R_j) \cdot e^{\mathrm{rank}(R_j)}$
$S(Q_i) = \sum_{k=1}^{K} S_{comp}(Q_i, R_k) \cdot \frac{1}{K} \cdot S_{raw}(Q_i)$
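The filtration step combines a rank-weighted alignment score against the top K de novo glycans with the raw score. A minimal Python sketch is given below under several assumptions: the slide gives the rank weight as an exponential in rank(R_j) but its exact form is not recoverable from the transcript, so the weight is passed in as a function `rank_weight`; the alignment function is also passed in, and `toy_align` is a hypothetical stand-in that compares strings position by position rather than aligning trees.

```python
import math


def toy_align(a, b):
    """Hypothetical stand-in for S_align: fraction of matching positions."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def filter_candidates(db_glycans, denovo_glycans, raw_scores, align, rank_weight):
    """Score each database glycan Q by mean rank-weighted alignment times S_raw(Q).

    denovo_glycans must be ordered best-first, so rank(R_j) = j + 1.
    Returns (glycan, score) pairs sorted from highest to lowest score.
    """
    k = len(denovo_glycans)
    if k == 0:
        raise ValueError("at least one de novo glycan is required")
    final = {}
    for q in db_glycans:
        comp_sum = sum(
            align(q, r) * rank_weight(j + 1)
            for j, r in enumerate(denovo_glycans)
        )
        final[q] = comp_sum / k * raw_scores[q]
    return sorted(final.items(), key=lambda item: item[1], reverse=True)
```

For example, calling it with `rank_weight=lambda r: math.exp(-r)` makes lower-ranked de novo structures count for less; that decaying form is an illustrative choice, not something the slide confirms.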

Dataset
Protein samples:
- Alpha-1-acid glycoprotein (Bovine)
- Ovomucoid (Chicken)
- Ig gamma-3 chain C region (Human)
Thermo Scientific Orbitrap Elite hybrid mass spectrometer
HCD fragmentation technique
GlycoMaster DB was used for comparison.
The dataset contained 46 HCD spectra of glycopeptides.

Experimental Results
Software program: GlycoNovoDB
Table: Performance of De Novo Sequencing Algorithm and GlycoNovoDB Compared with GlycoMaster DB
Columns: Rank | De Novo Sequencing Algorithm: Number, Percentage (%) | GlycoNovoDB: Number, Percentage (%)
Rows: 1; > [threshold missing]; can't find [cell values missing]
Rank refers to the rank of the reference structure (the top glycan structure reported by GlycoMaster DB) in our results for a spectrum.

Experimental Results
GlycoNovoDB can report more confident results than GlycoMaster DB.
- For 6 spectra, GlycoMaster DB reported more than one top-ranked glycan with the same score.
- An example: [figure]

Acknowledgement
Thank you! Questions?
University of Western Ontario
- Prof. Kaizhong Zhang
- Prof. Gilles A. Lajoie
- Dr. Weiping Sun
- Dr. Yi Liu
University of Waterloo
- Prof. Bin Ma
This work was supported in part by the NSERC Discovery Grant and a Discovery Accelerator Supplements Grant.