NIAID-funded Bacterial Bioinformatics Resource Center (BRC) designed to support infectious disease research

July 2018

- NIAID-funded Bacterial Bioinformatics Resource Center (BRC) designed to support infectious disease research
  - Special emphasis on 22 pathogenic genera: Bacillus, Bartonella, Borrelia, Brucella, Burkholderia, Campylobacter, Chlamydophila, Clostridium, Coxiella, Ehrlichia, Escherichia, Francisella, Helicobacter, Listeria, Mycobacterium, Rickettsia, Salmonella, Shigella, Staphylococcus, Streptococcus, Vibrio, and Yersinia
- Merger of PATRIC, RAST, SEED and other resources built by teams at ANL, UC, FIG, and Virginia Tech
  - Usage: >30,000 users, >4,000 citations

- >130,000 public microbial genomes
  - More added every month
  - 10 host genomes and their annotations
- Uniform genome annotations across all genomes
  - Genes, RNAs, proteins, protein functions, GO, EC, protein families
  - AMR genes, virulence factors, drug targets, essential genes
  - Biochemical pathways and metabolic models
  - Annotations of all public genomes updated every 3-4 months
- Curated genome metadata and AMR phenotypes
  - Disease, isolation, phenotype, clinical and environmental
  - AMR phenotype data: >15,000 genomes and >100 antibiotics
- Transcriptomics data (>800 datasets)
- Protein-protein and host-pathogen interactions
- Proteomics, metabolomics and Tn-seq data

[Figure: chart with a y-axis running from 0 to about 140,000 and a quarterly x-axis from Dec-09 to Sep-17; caption not recovered.]

- Using curated AMR phenotype data in PATRIC as training sets, build machine learning classifiers
- Predict the antimicrobial resistance (AMR) phenotypes for new genomes
- Predict the genomic regions associated with AMR
- Use these predictions to identify new AMR genes and enhance our understanding of AMR mechanisms
- To date, 40 AMR phenotype prediction classifiers have been deployed.
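The workflow above (train on genomes with known AMR phenotypes, then predict the phenotype of a new genome) can be pictured with a toy example. This is only a sketch under stated assumptions: the slides do not say which features or learning algorithm are used, so the k-mer frequency features and the nearest-centroid rule here are stand-ins chosen for brevity, and the sequences and labels are invented for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Return normalized k-mer frequencies for a DNA string."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def centroid(profiles):
    """Average several k-mer profiles into a single centroid profile."""
    summed = Counter()
    for profile in profiles:
        summed.update(profile)  # adds the frequencies key by key
    return {kmer: value / len(profiles) for kmer, value in summed.items()}


def distance(a, b):
    """Euclidean distance between two sparse k-mer profiles."""
    keys = set(a) | set(b)
    return sqrt(sum((a.get(key, 0.0) - b.get(key, 0.0)) ** 2 for key in keys))


def train(labeled_sequences, k=3):
    """Build one centroid per phenotype label (assumed, simplified model)."""
    by_label = {}
    for seq, label in labeled_sequences:
        by_label.setdefault(label, []).append(kmer_profile(seq, k))
    return {label: centroid(profiles) for label, profiles in by_label.items()}


def predict(model, seq, k=3):
    """Assign the label whose centroid is closest to the sequence profile."""
    profile = kmer_profile(seq, k)
    return min(model, key=lambda label: distance(model[label], profile))


# Invented toy training set: (sequence, phenotype label). Not real data.
TRAINING = [
    ("ATGCGCGCGCGCTTGCGCGC", "resistant"),
    ("GCGCGCATGCGCGCGCGCAA", "resistant"),
    ("ATATATATTTATATAAATAT", "susceptible"),
    ("TTATATATATAAATATATAT", "susceptible"),
]

model = train(TRAINING)
print(predict(model, "GCGCGCGCGCGC"))
print(predict(model, "ATATATATATAT"))
```

In a real setting the training pairs would come from the curated phenotype data, and the k-mers that best separate the classes would be one possible route to the "genomic regions associated with AMR" mentioned above.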

- Genome Assembly
  - Many assemblers (short, long reads), compare assembly output
- Genome Annotation
  - High-speed genome annotation using RASTtk and controlled vocabulary from the SEED project
  - Specialized annotation modules (New)
    - Prediction of AMR phenotype and AMR genes
    - Prediction of gene essentiality
- Similar Genome Finder (New)
  - Find genomes that are most similar to a genome of interest
- Proteome Comparison
  - Compare up to 8 genomes to a reference using bi-directional BLAST hits
- Variation Analysis (New)
  - Identify SNPs, SNVs, and insertions / deletions
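The "bi-directional BLAST hits" idea behind Proteome Comparison can be sketched as follows: a reference protein and a comparison protein are paired only when each is the other's top-scoring hit. This is a minimal illustration, not the service's actual implementation; the input format (tuples of query ID, subject ID, bit score), the function names, and the protein IDs are all hypothetical.

```python
def best_hits(hits):
    """Map each query ID to its highest-scoring subject ID.

    hits: iterable of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(ref_vs_cmp, cmp_vs_ref):
    """Return sorted (reference_id, comparison_id) pairs that are mutual best hits."""
    forward = best_hits(ref_vs_cmp)
    reverse = best_hits(cmp_vs_ref)
    return sorted(
        (ref_id, cmp_id)
        for ref_id, cmp_id in forward.items()
        if reverse.get(cmp_id) == ref_id
    )


# Hypothetical hit tables for one reference genome and one comparison genome.
REF_VS_CMP = [
    ("refA", "cmp1", 200.0),
    ("refA", "cmp2", 90.0),
    ("refB", "cmp2", 150.0),
    ("refC", "cmp1", 120.0),
]
CMP_VS_REF = [
    ("cmp1", "refA", 198.0),
    ("cmp1", "refC", 118.0),
    ("cmp2", "refB", 149.0),
    ("cmp3", "refC", 60.0),
]

pairs = bidirectional_best_hits(REF_VS_CMP, CMP_VS_REF)
print(pairs)
```

Here `refC` finds `cmp1` as its best hit, but `cmp1` prefers `refA`, so `refC` is left unpaired. Comparing several genomes to one reference would repeat this step once per comparison genome.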

- Comparative Analysis and Visualizations
  - Protein family and metabolic pathway comparisons
  - Gene list, gene set, projections, heatmaps
  - Transcriptome analysis, up/down fold changes
  - Metadata, disease, and PPI data
- Comprehensive Searching
  - AMR genes (ARDB, CARD, PATRIC AMR db), genome features, external ID mapping, similarity, gene pages, gene collections, correlated genes, genome finder, transcriptome, EC, GO, etc.
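The "up/down fold changes" item in the transcriptome analysis can be illustrated with a log2 fold-change calculation. This is a sketch under assumptions: the pseudocount, the threshold of one log2 unit, the gene names, and the expression values are all invented for the example and are not taken from the slides.

```python
from math import log2


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of treated to control expression, with a pseudocount to avoid division by zero."""
    return log2((treated + pseudocount) / (control + pseudocount))


def classify_genes(expression, threshold=1.0):
    """Label each gene 'up', 'down', or 'unchanged' by its log2 fold change.

    expression: dict mapping gene name to (control, treated) values.
    """
    labels = {}
    for gene, (control, treated) in expression.items():
        change = log2_fold_change(treated, control)
        if change >= threshold:
            labels[gene] = "up"
        elif change <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Invented expression values: gene -> (control, treated).
EXPRESSION = {
    "geneA": (10.0, 43.0),
    "geneB": (31.0, 7.0),
    "geneC": (20.0, 22.0),
}

labels = classify_genes(EXPRESSION)
print(labels)
```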

Comparison of thousands of protein families across hundreds of genomes

Comparison of metabolic pathways across hundreds of genomes
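The cross-genome comparisons described above are typically viewed as a matrix of features (protein families or pathways) against genomes. The sketch below builds such a presence/absence matrix and splits families into those shared by every genome and those found in only some. The genome names, family IDs, and function names are hypothetical, and this is not a description of how the resource itself computes its heatmaps.

```python
def presence_matrix(genome_families):
    """Build a families-by-genomes 0/1 matrix.

    genome_families: dict mapping genome name to a set of family IDs.
    Returns (families, genomes, matrix) with rows ordered by family ID.
    """
    genomes = sorted(genome_families)
    families = sorted(set().union(*genome_families.values()))
    matrix = [
        [1 if family in genome_families[genome] else 0 for genome in genomes]
        for family in families
    ]
    return families, genomes, matrix


def core_and_accessory(genome_families):
    """Split families into those present in all genomes and those present in only some."""
    families, _, matrix = presence_matrix(genome_families)
    core = [family for family, row in zip(families, matrix) if all(row)]
    accessory = [family for family, row in zip(families, matrix) if not all(row)]
    return core, accessory


# Hypothetical input: three genomes and the protein families found in each.
GENOME_FAMILIES = {
    "genome_1": {"FAM_0001", "FAM_0002", "FAM_0003"},
    "genome_2": {"FAM_0001", "FAM_0003"},
    "genome_3": {"FAM_0001", "FAM_0003", "FAM_0004"},
}

families, genomes, matrix = presence_matrix(GENOME_FAMILIES)
core, accessory = core_and_accessory(GENOME_FAMILIES)
for family, row in zip(families, matrix):
    print(family, row)
```

Each printed row corresponds to one line of a heatmap; with thousands of families and hundreds of genomes the same structure applies, only larger.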

- Seeking NIAID funding extension
- A possible pre-Evergreen 2019 workshop
- Pushing for community annotation
  - Undergraduate students (I have about 20 in training)
- PLEASE MAKE YOUR VOICE HEARD: What's on your wish list? What do we need to improve phage therapy resources, in particular?

Robert A. Edwards, PhD
Katelyn McNair
- RASTtk and PhiRAST development: Ross Overbeek, Robert Olson, Jim Davis, Gordon Pusch, Terry Disz, Bruce Parrello
- Phage annotators (Phantomers): Bhakti Dwivedi, Mya Breitbart, et al.
- FIG and all SEED annotators: VeronikaV, SvetaG, OlgaV/Z, et al.
& NSF $$

- SEED, RAST, myRAST, PhiRAST:
  - RAST: Aziz et al., BMC Genomics 2008
  - SEED servers: Aziz RK, et al. (2012) PLoS ONE 7(10): e
  - Nucleic Acids Res Jan;42(Database issue):d
- PATRIC: Antonopoulos et al., Brief. Bioinf Jul 31; Online early
