CBRS Chlamydiae community re-annotation
- Prudence Craig
1 CBRS Chlamydiae community re-annotation
2 Session schedule: Intro and automatic annotation (T. Weinmaier); Nomenclature; Manual refinement; Submission / Publication
4 Concept for Chlamydiae community re-annotation. Thomas Weinmaier. Chlamydial Basic Research Society meeting, San Antonio, Texas
9 Prototypic workflow for genome projects: Sequencing, Assembly, Primary annotation, Functional annotation (MxiH)
Annotation drawbacks:
- No standard procedure
- Not updated later
10 Comparative genomics
C. pneumoniae clinical isolates: AR39, CWL029, J138 and TW183
Genome sizes: ~1.23 Mb
6000 nucleotides different (99.5% identical)
[Tables: annotated genes in GenBank (Isolate, # Genes) and genes without ortholog (pairwise, AR39 / CWL029 / J138 / TW183). The values were lost in transcription; only "72" in the TW183 row survives.]
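The identity figure on this slide follows from the two numbers it gives. A quick check, assuming the rounded genome size of 1.23 Mb can stand in for the aligned length (the slide does not say how the comparison was made):

```python
genome_size_nt = 1_230_000   # ~1.23 Mb, rounded value from the slide
differing_nt = 6_000         # nucleotide differences quoted on the slide

# Share of positions that are identical, as a percentage.
identity_percent = 100 * (1 - differing_nt / genome_size_nt)
print(f"{identity_percent:.1f}% identical")
```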
12 BLAST search at NCBI
Query: chlamydial protease-like activity factor (CPAF) [Waddlia chondrophila WSU ]
Search: BLASTP against NCBI RefSeq database
Description (hits):
- chlamydial protease-like activity factor (CPAF) [Waddlia chondrophila WSU ]
- putative chlamydial protease-like activity factor [Parachlamydia acanthamoebae str. Hall's coccus]
- protease-like activity factor [Protochlamydia amoebophila UWE25]
- hypothetical protein CAB712 [Chlamydophila abortus S26/3]
- hypothetical protein CAB1_0732 [Chlamydophila abortus LLG]
16 83 chlamydial genomes
Problem:
- Different annotation strategies
- No update after submission
Re-annotation goal:
- Consistency
- Up-to-date annotations
Solution:
- Consistent automatic re-annotation
- Manual refinement
20 Proposed re-annotation strategy
[Diagram: DNA sequences, Automatic annotation, Manually refined annotation, Resubmission, Publication ("Reannotation of all publicly available chlamydial genomes", CBRS reannotation project consortium); contributors: Chlamydiae researchers]
Steps:
- DNA sequences from GenBank
- Automatic re-annotation
- Manual refinement
- Resubmission to GenBank
- Publication
22 Automatic annotation software: ConsPred
- Intrinsic information (gene prediction tools)
- Extrinsic information (BLAST search)
- Resolving overlaps (tRNAs, rRNAs, ncRNAs)
- Consensus prediction: best prediction by integrating multiple lines of evidence; spurious predictions are discarded
- Functional annotation
- Result: ConsPred annotation
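The idea of a consensus prediction can be illustrated with a simple majority vote. This is only a sketch and not ConsPred's actual algorithm, which the slides do not describe in detail. It assumes that predictions of the same gene share strand and stop codon, and that a gene called by fewer than `min_support` tools counts as spurious. The tool names, the `Prediction` type and the coordinates are hypothetical.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Prediction(NamedTuple):
    tool: str      # hypothetical predictor name
    start: int     # genome coordinate of the start codon
    stop: int      # genome coordinate of the stop codon
    strand: str    # "+" or "-"


def consensus_genes(predictions, min_support=2):
    """Majority-vote consensus over gene predictions from several tools."""
    # Assumption: tools that predict the same gene agree on strand and stop
    # codon, even when they disagree on the start codon.
    by_stop = defaultdict(list)
    for p in predictions:
        by_stop[(p.strand, p.stop)].append(p)

    consensus = []
    for (strand, stop), group in sorted(by_stop.items()):
        tools = {p.tool for p in group}
        if len(tools) < min_support:
            continue  # supported by too few tools: treated as spurious
        start_votes = Counter(p.start for p in group)
        best = max(start_votes.values())
        tied = [s for s, n in start_votes.items() if n == best]
        # Tie-break: prefer the start that gives the longest reading frame.
        start = min(tied) if strand == "+" else max(tied)
        consensus.append((start, stop, strand, len(tools)))
    return consensus


# Made-up example: three tools agree on one gene, one call is unsupported.
predictions = [
    Prediction("toolA", 100, 400, "+"),
    Prediction("toolB", 130, 400, "+"),
    Prediction("toolC", 100, 400, "+"),
    Prediction("toolB", 1500, 1800, "+"),
]
print(consensus_genes(predictions))  # [(100, 400, '+', 3)]
```

Extrinsic evidence such as BLAST hits could enter the same vote as one more "tool", which is one way to read "integrating multiple lines of evidence".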
23 Comparative genomics after re-annotation
C. pneumoniae clinical isolates: AR39, CWL029, J138 and TW183
Genome sizes: ~1.23 Mb
6000 nucleotides different (99.5% identical)
[Tables: annotated genes from ConsPred (Isolate, # Genes) and genes without ortholog (pairwise, AR39 / CWL029 / J138 / TW183). The values were lost in transcription.]
24 Gene counts: ConsPred vs. GenBank
[Figure: counts of genes, tRNAs, rRNAs and ncRNAs under both annotations, per genome group (Cmu/ps, Cca/pe, Cfe, Cpn, Cps, Env). The values were not preserved in transcription.]
27 Proposed re-annotation strategy
[Diagram: DNA sequences, Automatic annotation, Manually refined annotation, Resubmission, Publication ("Reannotation of all publicly available Chlamydiae", CBRS reannotation project consortium); contributors: Chlamydiae researchers]
Steps:
- DNA sequences from GenBank
- Automatic re-annotation
- Discussion with community (rules, nomenclature, evidence, ...)
- Resubmission to GenBank
- Publication
28 Nomenclature
- Existing references are retained
- Same locus_tag:
  - Genes with unchanged coordinates
  - Genes with changed start coordinate (new protein ID in DB)
- In-between locus_tag: newly annotated genes (e.g. CT_444.1)
- Removed genes lose their locus_tag
- Evidence (PMID) of functional annotation is added as a note
- Gene names?
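The locus_tag rules above could be applied programmatically along the following lines. This is a minimal sketch under stated assumptions: a re-annotated gene is matched to an old one when strand and stop coordinate agree (so a changed start keeps the tag), and in-between tags are numbered `.1`, `.2`, ... after the preceding retained tag. The function name and all coordinates are hypothetical; only the `CT_444.1` pattern comes from the slide.

```python
def assign_locus_tags(old_genes, new_genes):
    """Assign locus_tags to re-annotated genes.

    old_genes: list of (locus_tag, start, stop, strand)
    new_genes: list of (start, stop, strand), sorted by genome position
    Returns a list of (locus_tag, start, stop, strand).
    """
    # Assumption: same strand and stop coordinate means "same gene".
    old_by_stop = {(strand, stop): tag for tag, _start, stop, strand in old_genes}

    tagged = []
    last_tag = None   # most recent retained locus_tag
    suffix = 0        # counter for in-between tags after last_tag
    for start, stop, strand in new_genes:
        tag = old_by_stop.get((strand, stop))
        if tag is not None:
            last_tag, suffix = tag, 0          # unchanged or new start: keep tag
        elif last_tag is not None:
            suffix += 1
            tag = f"{last_tag}.{suffix}"       # newly annotated gene, e.g. CT_444.1
        # A new gene before the first old gene stays untagged (None) here;
        # the slides do not say how that case is handled.
        tagged.append((tag, start, stop, strand))
    return tagged


# Made-up coordinates; CT_446 is absent from the new annotation and so
# loses its locus_tag.
old = [("CT_444", 1000, 1900, "+"), ("CT_445", 2500, 3400, "+"),
       ("CT_446", 3600, 4000, "+")]
new = [(1030, 1900, "+"), (2000, 2300, "+"), (2500, 3400, "+")]
for row in assign_locus_tags(old, new):
    print(row)
```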
29 References - now
LOCUS       AE bp DNA circular BCT 05-MAR-2010
DEFINITION  Chlamydophila pneumoniae CWL029, complete genome.
ACCESSION   AE AE AE
VERSION     AE GI:
DBLINK      BioProject: PRJNA248
SOURCE      Chlamydophila pneumoniae CWL029
  ORGANISM  Chlamydophila pneumoniae CWL029
            Bacteria; Chlamydiae; Chlamydiales; Chlamydiaceae;
            Chlamydia/Chlamydophila group; Chlamydia.
REFERENCE   1 (bases 1 to )
  AUTHORS   Kalman,S., Mitchell,W., Marathe,R., Lammel,C., Fan,J., Hyman,R.W., Olinger,L., Grimwood,J., Davis,R.W. and Stephens,R.S.
  TITLE     Comparative genomes of Chlamydia pneumoniae and C. trachomatis
  JOURNAL   Nat. Genet. 21 (4), (1999)
  PUBMED
REFERENCE   2 (bases 1 to )
  AUTHORS   Kalman,S., Mitchell,W., Marathe,R., Lammel,C., Fan,J., Olinger,L., Grimwood,J., Davis,R.W. and Stephens,R.S.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-DEC-1998) Program in Infectious Diseases, University of California, 235 Earl Warren Hall, Berkeley, CA 94720, USA
Slide callouts: Genome paper (REFERENCE 1), Direct submission (REFERENCE 2)
30 References after re-annotation
LOCUS       AE bp DNA circular BCT 05-MAR-2010
DEFINITION  Chlamydophila pneumoniae CWL029, complete genome.
ACCESSION   AE AE AE
VERSION     AE GI:
DBLINK      BioProject: PRJNA248
SOURCE      Chlamydophila pneumoniae CWL029
  ORGANISM  Chlamydophila pneumoniae CWL029
            Bacteria; Chlamydiae; Chlamydiales; Chlamydiaceae;
            Chlamydia/Chlamydophila group; Chlamydia.
REFERENCE   1 (bases 1 to )
  AUTHORS   Kalman,S., Mitchell,W., Marathe,R., Lammel,C., Fan,J., Hyman,R.W., Olinger,L., Grimwood,J., Davis,R.W. and Stephens,R.S.
  TITLE     Comparative genomes of Chlamydia pneumoniae and C. trachomatis
  JOURNAL   Nat. Genet. 21 (4), (1999)
  PUBMED
REFERENCE   2 (bases 1 to )
  AUTHORS   The Chlamydia re-annotation project consortium
  TITLE     Reannotation of all publicly available chlamydial genomes
  JOURNAL   XXX (2013)
  PUBMED    XXX
REFERENCE   3 (bases 1 to )
  AUTHORS   Kalman,S., Mitchell,W., Marathe,R., Lammel,C., Fan,J., Olinger,L., Grimwood,J., Davis,R.W. and Stephens,R.S.
  TITLE     Direct Submission
  JOURNAL   Submitted (01-DEC-1998) Program in Infectious Diseases, University of California, 235 Earl Warren Hall, Berkeley, CA 94720, USA
Slide callouts: Genome paper (REFERENCE 1), Re-annotation paper (REFERENCE 2), Direct submission (REFERENCE 3)
34 Manual refinement: your contributions
Columns: Locus_tag; Product; Gene name; EC number; Pubmed ID; Taxonomic transfer level
(EC numbers and PubMed IDs were not preserved in transcription.)
- CPn_0008; HB2 protein; gene name hb; transfer level Chlamydophila pneumoniae
- CPn_0056; Phosphoglucomutase/phosphomannomutase; transfer level Chlamydophila pneumoniae
- CPn_; Tryptophanyl tRNA synthetase; gene name trps_; transfer level Chlamydiaceae
- CPn1016; Chlamydial protease-like activity factor; gene name cpaf; transfer level Chlamydiales
- Deleted: CPn1119; Hypothetical protein; transfer level Chlamydophila pneumoniae CWL029
36 GenBank entry - now
     gene            /locus_tag="CPn_0008"
     CDS             /locus_tag="CPn_0008"
                     /codon_start=1
                     /transl_table=11
                     /product="hypothetical protein"
                     /protein_id="AAD "
                     /db_xref="GI: "
                     /translation="MISGLLFLLVRREVPTVRSEEIPRGVSVTPSEEPALEKAQKEPE
                     TKKILDRLPKELDQLDTYIQEVFACLERLKDPKYEDRGLLTEAKEKLRVFDVVEKDMM
                     SEFLDIQRVLNEEAYYVEHCQDPLENIAYEIFSSQELRDYYCAGVCGYLPSGDARADR
                     LKRSVKEVMDRFMRVTWKSWEASVMLDHSYGVARELFKKAVGVLEESVYKILFKSYRD
                     AFYECEKAKIQRDGRFKWL"
Slide callout: Automatic annotation
37 GenBank entry after re-annotation
     gene            /locus_tag="CPn_0008"
                     /gene="hb2"
     CDS             /locus_tag="CPn_0008"
                     /gene="hb2"
                     /codon_start=1
                     /transl_table=11
                     /product="HB2 protein"
                     /protein_id="AAD "
                     /db_xref="GI: "
                     /note="product, gene name derived from literature (PMID )"
                     /translation="MISGLLFLLVRREVPTVRSEEIPRGVSVTPSEEPALEKAQKEPE
                     TKKILDRLPKELDQLDTYIQEVFACLERLKDPKYEDRGLLTEAKEKLRVFDVVEKDMM
                     SEFLDIQRVLNEEAYYVEHCQDPLENIAYEIFSSQELRDYYCAGVCGYLPSGDARADR
                     LKRSVKEVMDRFMRVTWKSWEASVMLDHSYGVARELFKKAVGVLEESVYKILFKSYRD
                     AFYECEKAKIQRDGRFKWL"
Slide callouts: Manual annotation (x3), Indication of source
38 GenBank entry - now
     gene            /gene="mrsA"
                     /locus_tag="CPn_0056"
     CDS             /gene="mrsA"
                     /locus_tag="CPn_0056"
                     /codon_start=1
                     /transl_table=11
                     /product="phosphomannomutase"
                     /protein_id="AAD "
                     /db_xref="GI: "
                     /translation="MKEVEQRIRSLYDAVTAENICRWLSNDCTQQDAKTILGWLDTDP
                     AQLEDLFGATLTFGTGGLRSLMGIGTNRINLFTIRRTTQGLVQVLRAHLPHPGDPMRV
                     [sequence truncated on the slide]
Slide callouts: Automatic annotation (x3)
39 GenBank entry after re-annotation
     gene            /gene="mrsA"
                     /locus_tag="CPn_0056"
     CDS             /gene="mrsA"
                     /locus_tag="CPn_0056"
                     /codon_start=1
                     /transl_table=11
                     /product="phosphoglucomutase/phosphomannomutase"
                     /EC_number=" "
                     /protein_id="AAD "
                     /db_xref="GI: "
                     /note="product, EC_number derived from literature (PMID )"
                     /translation="MKEVEQRIRSLYDAVTAENICRWLSNDCTQQDAKTILGWLDTDP
                     AQLEDLFGATLTFGTGGLRSLMGIGTNRINLFTIRRTTQGLVQVLRAHLPHPGDPMRV
                     [sequence truncated on the slide]
Slide callouts: Automatic annotation (x2), Manual annotation (x2), Indication of source
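The before/after entries suggest how a manual contribution might be merged into the automatic annotation: changed qualifiers are overwritten and a /note records the source. The sketch below works on plain dictionaries rather than a real GenBank parser. The function name is hypothetical, and the PMID is a placeholder because the real identifier did not survive in the transcript.

```python
def apply_refinement(qualifiers, refinement, pmid):
    """Return a copy of CDS qualifiers with manual refinements and a source note."""
    updated = dict(qualifiers)   # leave the automatic annotation untouched
    changed = []
    for key in ("product", "gene", "EC_number"):
        value = refinement.get(key)
        if value and updated.get(key) != value:
            updated[key] = value
            # The slides write "gene name" in the note, not "gene".
            changed.append("gene name" if key == "gene" else key)
    if changed:
        # Note wording follows the slide: "<fields> derived from literature (PMID ...)".
        updated["note"] = f"{', '.join(changed)} derived from literature (PMID {pmid})"
    return updated


automatic = {"locus_tag": "CPn_0008", "product": "hypothetical protein"}
manual = {"product": "HB2 protein", "gene": "hb2"}
refined = apply_refinement(automatic, manual, pmid="0000000")  # placeholder PMID
print(refined["note"])
# product, gene name derived from literature (PMID 0000000)
```

Keeping the automatic record unchanged and returning a copy makes it easy to compare old and new annotation afterwards, which the planned manuscript calls for.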
43 Submission
Why?
- Provides up-to-date knowledge for BLAST searches
- Facilitates future (multi-)genome projects
How?
- Chlamydia re-annotation project generates new GenBank files
- Owners of GenBank entries re-submit
44 Publication
Authors: all contributors to manual re-annotation
Research article: "Re-annotation of all publicly available chlamydial genomes", CBRS reannotation project consortium
Manuscript outline:
- Goal: improving existing genome annotations
- Method: community-based re-annotation
- Results: comparison between old and new annotation; analysis of the chlamydial pan-genome
Further ideas / contributions?
46 Thank you for your attention!