The Immune Epitope Database and Analysis Resource (IEDB)
1 The Immune Epitope Database and Analysis Resource (IEDB). Jason Greenbaum, La Jolla Institute for Allergy and Immunology. EMBRACE Bioinformatics of Immunology Workshop, January 24, 2007, Lyngby, Denmark
2 Overview The IEDB Concept & Goals Walkthrough Analysis Tools Information Exchange Ontology Development Summary & Future Plans
4 What is an epitope? An epitope is defined as the chemical structure recognized by specific receptors of the immune system (antibodies, MHC molecules, and/or T cell receptors)
5 IEDB Organization Discovery Groups NIAID External Tool Developers Database Analysis Resource
6 Goals for the IEDB: Catalog and organize an ever-growing body of immunological information (B and T cell epitopes from infectious pathogens, experimental and self antigens; priority on Category A-C pathogens and emerging diseases; humans, non-human primates, rodents, and other species for which detailed epitope information is available); Develop new methods to predict and model immune responses; Assist in the development of vaccines and diagnostics; Incorporate input from scientific community (Feedback and Forums)
7 Other sources of epitope information MHCPep SYFPEITHI FIMM HLA Ligand/Motif Database HIV Sequence Database
8 What sets the IEDB apart? Incorporation of positive AND negative data; Finely detailed curation; 10 full-time Ph.D.-level curators
9 What information is stored? Epitope information: sequence, structure, species, source protein, etc.; Reference information: authors, journal, PMID, etc.; Assay data: assay type, cell line(s), species, measurements, etc.
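The three groups of stored information can be pictured as a small record type. The following is only a sketch: the class names, field names, and example values are assumptions made for illustration and are not the IEDB schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Reference:
    # Reference information: authors, journal, PMID
    authors: str
    journal: str
    pmid: Optional[int] = None


@dataclass
class Assay:
    # Assay data: type, species, outcome, measurement, cell lines
    assay_type: str
    species: str
    positive: bool  # negative outcomes are stored as well as positive ones
    measurement: Optional[float] = None
    cell_lines: List[str] = field(default_factory=list)


@dataclass
class EpitopeRecord:
    # Epitope information plus its reference and assays
    sequence: str
    source_protein: str
    source_species: str
    reference: Reference
    assays: List[Assay] = field(default_factory=list)

    def outcomes(self):
        """Return (number of positive assays, number of negative assays)."""
        positives = sum(1 for a in self.assays if a.positive)
        return positives, len(self.assays) - positives


# Illustrative values only; the reference and assay results are placeholders.
record = EpitopeRecord(
    sequence="SIINFEKL",
    source_protein="ovalbumin",
    source_species="Gallus gallus",
    reference=Reference(authors="Placeholder A, Placeholder B",
                        journal="Placeholder Journal"),
    assays=[
        Assay("MHC binding", "Mus musculus", True, measurement=12.5),
        Assay("T cell response", "Mus musculus", False),
    ],
)
print(record.outcomes())
```

Keeping the outcome as an explicit field on each assay is one simple way to hold positive and negative data side by side.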
10 Curation Priorities: Category A-C pathogens & toxins; Emerging and re-emerging infectious diseases; Other infectious diseases; Allergens; Self antigens involved in autoimmunity; Transplant rejection antigens and other alloantigens; Cancer epitopes
11 IEDB philosophy: Inclusive curation. We believe that it is not our job to decide what is good or bad data. Context information. Use of assay standards. Our job is to catalog the information and allow the users (scientific community) to easily access it.
12 Data sources: Literature (targeted PubMed query); Direct submission (large-scale epitope discovery contracts)
13 The curation process: PubMed query -> Abstract scan -> Curators -> Peer review -> Epitope council review -> Finalized curation
14 Selection and Curation of Influenza A Literature References: More than 16 million references are available in PubMed; 2063 references are influenza related (~0.01%); 743 selected after abstract scan (~36%); 429 curated into IEDB (~58%)
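The percentages on this slide are each relative to the previous stage, which a few lines of arithmetic can confirm. This is a minimal check that takes 16 million as the PubMed total, although the slide only gives it as a lower bound; the stage labels are paraphrased.

```python
# Counts from the slide; the first is a lower bound ("more than 16 million").
stages = [
    ("PubMed references", 16_000_000),
    ("influenza related", 2063),
    ("selected after abstract scan", 743),
    ("curated into IEDB", 429),
]


def retention(stages):
    """Percentage of each stage's count relative to the stage before it."""
    result = []
    for (_, previous), (name, count) in zip(stages, stages[1:]):
        result.append((name, 100.0 * count / previous))
    return result


percentages = retention(stages)
for name, pct in percentages:
    print(f"{name}: {pct:.2f}%")
```

Rounded, the three ratios come out at about 0.01%, 36%, and 58%, matching the slide.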
15 It all starts with PubMed
16 Automated Text Classification: ~21,000 abstracts classified by expert into relevant: Yes / No. Can this process be automated? Naive Bayes Classifier: analyze word frequencies in classified abstracts. Cross validated result: 50% of the irrelevant abstracts can be identified with few (<5%) false negative classifications. Word counts in abstracts by curatable class (Yes / No): tag 0 / 30; epitope-tagged 0 / 28; superantigens 3 / 97; adverse 2 / 21; seroconversion 1 / 20; phage-displayed 8 / 1; overlapping epitope-based 8 / 0; ~X-mers~ 12 / 0; ~MHC allele~ 72 / 10
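To make the idea concrete, here is a minimal multinomial Naive Bayes sketch over word frequencies. It is not the classifier used for the IEDB: the Laplace smoothing, the whitespace tokenization, and the four toy "abstracts" are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def train(docs):
    """Fit word counts per label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in docs:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab


def log_scores(model, text):
    """Log prior plus Laplace-smoothed log likelihood for each label."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        counts = word_counts[label]
        denominator = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denominator)
        scores[label] = score
    return scores


def classify(model, text):
    scores = log_scores(model, text)
    return max(scores, key=scores.get)


# Made-up training texts, loosely echoing the words listed on the slide.
TOY_ABSTRACTS = [
    ("mhc allele binding of overlapping peptides", "yes"),
    ("epitope-based vaccine maps t cell epitope", "yes"),
    ("adverse events after vaccination", "no"),
    ("epitope-tagged protein purification tag", "no"),
]

model = train(TOY_ABSTRACTS)
print(classify(model, "peptides binding mhc allele"))
print(classify(model, "adverse events tag"))
```

A filter meant to keep false negatives low would probably not take the plain maximum as done here, but would discard an abstract only when the "no" score wins by a chosen margin; that margin would be tuned by cross-validation.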
17 Current database statistics January 17, ,853 references 54,626 records 23,979 distinct epitopes
18 Overview The IEDB Concept & Goals Walkthrough Analysis Tools Information Exchange Ontology Development Summary & Future Plans
19-38 (Walkthrough slides: images only, no text transcribed)
39 Overview The IEDB Concept & Goals Walkthrough Analysis Tools Information Exchange Ontology Development Summary & Future Plans
40 Analysis tools available: Viewing tools (Epitope viewer); Analytical tools (Population coverage, Conservancy analysis); Predictive tools (T cell epitope predictions: MHC binding, Antigen processing; B cell epitope predictions)
42 Analysis tools screenshot
43 B-cell epitope prediction tools: Five different methods to predict linear Ab epitopes were selected. 1. Chou and Fasman beta turn prediction (Chou PY, Fasman GD. Adv Enzymol Relat Areas Mol Biol. 1978;47) 2. Emini surface accessibility scale (Emini EA, Hughes JV, Perlow DS, Boger J. J Virol Sep;55(3)) 3. Karplus and Schulz flexibility scale (Karplus PA, Schulz GE. Naturwissenschaften 1985;72) 4. Kolaskar and Tongaonkar antigenicity scale (Kolaskar AS, Tongaonkar PC. FEBS Lett Dec 10;276(1-2)) 5. Parker Hydrophilicity Prediction (Parker JM, Guo D, Hodges RS. Biochemistry Sep 23;25(19)) These methods were implemented in the IEDB. Workshop on B cell epitope prediction tools: Greenbaum et al. (2007) J. Mol Recognit.
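Scale-based predictors of this kind generally assign a value to each residue and then smooth along the sequence with a window. The sketch below shows only that generic pattern, a sliding-window average. It does not reproduce any of the five published methods: the scale values, the window size, and the threshold are made up for illustration.

```python
def window_scores(sequence, scale, window=7):
    """Average a per-residue scale over a centered sliding window.

    Returns (1-based center position, mean score) for every position
    that has a full window around it.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    values = [scale[residue] for residue in sequence]
    results = []
    for center in range(half, len(sequence) - half):
        segment = values[center - half:center + half + 1]
        results.append((center + 1, sum(segment) / window))
    return results


# Hypothetical scale covering only the residues used below (not a published scale).
TOY_SCALE = {"A": 0.0, "D": 1.0, "K": 1.0, "L": -1.0}

scores = window_scores("ALLDKDKLLA", TOY_SCALE, window=3)
predicted = [position for position, score in scores if score > 0.5]
print(predicted)
```

With these toy values, only the two central positions of the charged stretch exceed the threshold. A real tool would use a full 20-residue scale and the window size recommended for the chosen method.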
44 1. Specify protein sequence 2. Select a method 3. Click Submit
46 Planned/desired future tools Self-similarity Visualization of epitopes in genome New B cell epitope prediction tools Discotope Bepipred Ellipro SDSC
47 Overview The IEDB Concept & Goals Walkthrough Analysis Tools Information Exchange Ontology Development Summary & Future Plans
48 Data exchange XML Submissions IEDB XML Database export Tools HTTP Querying Linking
49 XML format (figure)
50 Direct submission issues Submitters unfamiliar with XML Completely automated validation impossible with XSD
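The slide does not say which checks fall outside schema validation. As one hypothetical example, a rule that compares values held in different elements, such as requiring the epitope sequence to occur in the submitted source sequence, is the kind of check that is easier to run in code after parsing. The element names and sample document below are invented for illustration and are not the IEDB XML format.

```python
import xml.etree.ElementTree as ET

# Invented sample document; element and attribute names are assumptions.
SAMPLE = """<submission>
  <epitope id="e1">
    <sequence>SIINFEKL</sequence>
    <source_sequence>QLESIINFEKLTEW</source_sequence>
  </epitope>
  <epitope id="e2">
    <sequence>AAAYK</sequence>
    <source_sequence>QLESIINFEKLTEW</source_sequence>
  </epitope>
</submission>"""


def check_submission(xml_text):
    """Return a list of problems found by content checks on a submission."""
    root = ET.fromstring(xml_text)
    problems = []
    for epitope in root.findall("epitope"):
        epitope_id = epitope.get("id")
        sequence = (epitope.findtext("sequence") or "").strip()
        source = (epitope.findtext("source_sequence") or "").strip()
        if not sequence:
            problems.append(f"{epitope_id}: missing epitope sequence")
        elif sequence not in source:
            problems.append(
                f"{epitope_id}: epitope sequence not found in source sequence"
            )
    return problems


problems = check_submission(SAMPLE)
for problem in problems:
    print(problem)
```

Checks like this can run after schema validation and report readable messages, which may also help submitters who are unfamiliar with XML.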
51 Linking to the IEDB Two methods Linking to query result NCBI linking based on IEDB-supplied XML
54 Linking to other databases: Links provided to source protein records in GenBank/SwissProt and to the PubMed ID at NCBI
55 Overview The IEDB Concept & Goals Walkthrough Analysis Tools Information Exchange Ontology Development Summary & Future Plans
56 Ontology Development: "define:ontology": 28 results on Google. "An ontology is a controlled vocabulary that describes objects and the relations between them in a formal way, and has a grammar for using the vocabulary terms to express something meaningful within a specified domain of interest." Ontology of Immune Epitopes
57 Ontology goals: Enumerate and unambiguously define all terms in database; Determine relationships among entities; Apply ontology to current database and to new records as they are entered
58 Ontology projects Ontology for Biomedical Investigations (OBI) Gene Ontology (GO) MGED Ontology (MO)
59 Why is it important? Minimizing redundant work; Error checking; Enforcing data constraints; Information exchange
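A toy example shows how a controlled vocabulary with explicit relations supports error checking. The term names are borrowed from the assay types named elsewhere in the talk, but the single "assay" parent and the is_a structure are assumptions for illustration, not the actual IEDB ontology.

```python
# Hypothetical is_a hierarchy: each term maps to its parent (None marks the root).
IS_A = {
    "assay": None,
    "MHC binding": "assay",
    "MHC ligand elution": "assay",
    "T cell response": "assay",
    "B cell response": "assay",
}


def ancestors(term):
    """Follow is_a links from a term up to the root."""
    if term not in IS_A:
        raise KeyError(f"unknown term: {term}")
    chain = []
    parent = IS_A[term]
    while parent is not None:
        chain.append(parent)
        parent = IS_A[parent]
    return chain


def invalid_records(records):
    """Return records whose assay type is not in the controlled vocabulary."""
    return [r for r in records if r["assay_type"] not in IS_A]


records = [
    {"id": 1, "assay_type": "T cell response"},
    {"id": 2, "assay_type": "Tcell resp."},  # free-text variant, should be flagged
]
rejected = invalid_records(records)
print(ancestors("MHC binding"))
print([r["id"] for r in rejected])
```

Rejecting free-text variants at entry time is one way such a vocabulary enforces data constraints and keeps exchanged records comparable.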
60 Assay Components Are Shared (figure: assay types MHC Binding, MHC Ligand Elution, T Cell Response, and B Cell Response; component labels T, B, APC)
61 IEDB Data Structure Sathiamurthy et al, Immunome Research, 2005
62 Towards a Formal Ontology: Goal: Connecting information from resources. Upper level ontology: Basic Formal Ontology (BFO). Shared concepts from other biomedical ontologies (OBI, GO, NCBI, FMA, ...). We will host next OBI workshop in San Diego.
63 Overview The IEDB Concept & Goals Walkthrough Analysis Tools Information Exchange Ontology Development Summary & Future Plans
64 Summary: IEDB seeks to organize and consolidate all existing and future epitope information; Expert curation; Rich set of integrated analysis tools available/under development; Interacting with IEDB possible through several channels; Formal ontology development under way to ensure data consistency
65 Future Plans: Continue with curations; Add capability to the Analysis Resource; Work with Discovery Groups in submitting data; Develop interface for external tools; Complete ontology development; Expand the IEDB's utility and exposure
66 Acknowledgments Scott Stewart Scott Way Tom Carolan Louis Bulger Hussein Emami John Quaresma Rob Thurmond Ryan Shyffer Jane Herron Cindy Oliver Phil Bourne Julia Ponomarenko Ole Lund Søren Buus
67 Further reading & surfing Peters et al PLoS Biology Sette et al Immunity IEDB: