Functional profiling with HUMAnN2


1 Functional profiling with HUMAnN2. Eric Franzosa, Jason Lloyd-Price, Curtis Huttenhower, Galeb Abu-Ali, Ali Rahnavard. Harvard T.H. Chan School of Public Health, Department of Biostatistics. STAMPS

2 The two big questions of microbial community profiling: Who is there? (taxonomic profiling) What are they doing? (functional profiling) Like many great bioinformatics problems, answering these questions begins with sequence search!

3 HUMAnN2 for taxon-specific metagenome and metatranscriptome functional profiling. Eric Franzosa, Lauren McIver

4 HUMAnN2: stratified output
Gene families (UniRef gene cluster: gene name; total gene abundance in RPK, Σ ~HUMAnN1; followed by per-species & unclassified stratifications):
UniRef90_R6K3Z5: IMP dehydrogenase
UniRef90_R6K3Z5: IMP dehydrogenase Bacteroides_caccae
UniRef90_R6K3Z5: IMP dehydrogenase Bacteroides_dorei
UniRef90_R6K3Z5: IMP dehydrogenase Bacteroides_ovatus
UniRef90_R6K3Z5: IMP dehydrogenase Bacteroides_stercoris
UniRef90_R6K3Z5: IMP dehydrogenase Bacteroides_vulgatus
UniRef90_R6K3Z5: IMP dehydrogenase unclassified
Pathways (MetaCyc pathway; pathway abundance & coverage):
PWY-7221: GTP biosynthesis
PWY-7221: GTP biosynthesis Bacteroides_caccae
PWY-7221: GTP biosynthesis Bacteroides_dorei
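The stratified layout above (one total row per feature, followed by per-species and unclassified rows) can be split apart with a few lines of Python. This is a minimal sketch, not HUMAnN2's own tooling: the abundance values are invented for illustration, the "|" separator between feature and species is an assumption about the on-disk file, and `split_stratified` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Toy stratified gene-family table. Abundance values are invented, and the
# "|" separator between feature and species is an assumption.
TOY_TABLE = (
    "# Gene Family\tsample_RPK\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase\t30.0\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae\t10.0\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei\t15.0\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|unclassified\t5.0\n"
)


def split_stratified(text):
    """Return (totals, strata): community totals and per-taxon contributions."""
    totals = {}
    strata = defaultdict(dict)
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header
        feature, value = row[0], float(row[1])
        if "|" in feature:
            name, taxon = feature.split("|", 1)
            strata[name][taxon] = value  # stratified row
        else:
            totals[feature] = value  # community-total row
    return totals, dict(strata)


totals, strata = split_stratified(TOY_TABLE)
for name, total in totals.items():
    stratified_sum = sum(strata.get(name, {}).values())
    print(f"{name}: total={total}, sum of strata={stratified_sum}")
```

In this toy table the stratified rows sum to the total row, which matches the slide's description of the total as the sum (Σ) over the stratifications.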

5 HUMAnN2 real-world performance. Applied HUMAnN2's tiered search to profile >2K human metagenomes (HMP1-II, six major body sites). The pangenome search tier is 1-2 orders of magnitude faster than the comprehensive translated search. [Figure legend: DIAMOND w/ comprehensive protein db; bowtie2 w/ sample-specific pangenome db]

6 HUMAnN2 real-world performance. ~60% of reads align before translated search; ~15% more reads align during translated search (total ~75%). Applied HUMAnN2's tiered search to profile >2K human metagenomes (HMP1-II, six major body sites). The pangenome search tier is 1-2 orders of magnitude faster than the comprehensive translated search. [Figure legend: DIAMOND w/ comprehensive protein db; bowtie2 w/ sample-specific pangenome db]

7 And it works on non-human meta'omes, too. Luke Thompson

8 Quantifying the diversity of species contributing a function within and across subjects. [Figure axes: between-subject diversity (low to high); within-subject diversity (low to high)] A pathway's contributional alpha-diversity is calculated from the distribution of taxa providing it (DNA or RNA) within a community; contributional beta-diversity is the corresponding comparison between communities.
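The slide does not say which diversity indices are used, so the following is only a sketch under stated assumptions: Gini-Simpson stands in for contributional alpha-diversity and Bray-Curtis dissimilarity (on relative contributions) stands in for contributional beta-diversity. The per-species values and function names are hypothetical.

```python
def normalize(contrib):
    """Convert per-taxon pathway abundances to relative contributions."""
    total = sum(contrib.values())
    if total <= 0:
        raise ValueError("pathway has no contributions in this sample")
    return {taxon: v / total for taxon, v in contrib.items() if v > 0}


def gini_simpson(contrib):
    """Alpha-diversity stand-in: 1 - sum(p_i^2) over contributing taxa."""
    p = normalize(contrib)
    return 1.0 - sum(x * x for x in p.values())


def bray_curtis(a, b):
    """Beta-diversity stand-in: Bray-Curtis on relative contributions."""
    pa, pb = normalize(a), normalize(b)
    shared = sum(min(pa.get(t, 0.0), pb.get(t, 0.0)) for t in set(pa) | set(pb))
    return 1.0 - shared  # both profiles sum to 1, so BC = 1 - sum of minima


# Invented contributions of species to one pathway in two subjects.
subject_a = {"Bacteroides_dorei": 8.0, "Bacteroides_ovatus": 2.0}
subject_b = {"Bacteroides_dorei": 5.0, "Bacteroides_vulgatus": 5.0}

alpha_a = gini_simpson(subject_a)
alpha_b = gini_simpson(subject_b)
beta_ab = bray_curtis(subject_a, subject_b)
print(alpha_a, alpha_b, beta_ab)
```

Averaging the alpha values over subjects and the beta values over subject pairs would place a pathway on the two axes of the figure; whether the original analysis aggregates this way is not stated in the slides.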

9 Quantifying the diversity of species contributing a function within and across subjects. [Figure axes: between-subject diversity (low to high); within-subject diversity (low to high)]

10 Quantifying the diversity of species contributing a function within and across subjects. [Figure axes: between-subject diversity (low to high); within-subject diversity (low to high). Quadrant labels: simple, consistent; complex, consistent; simple, variable; complex, variable]

11 HUMAnN2 reveals unusual relative expression in paired metatranscriptomes & metagenomes. In collaboration with the STARR Consortium & HPFS cohort. Sucrose degradation follows a complex attribution pattern across ~200 human gut metagenomes, but its expression can be dominated by a single species in paired gut metatranscriptomes!
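One simple way to see this kind of pattern is to compare each species' share of a pathway at the DNA level with its share at the RNA level in a paired sample. The sketch below uses invented abundances, placeholder species names, and hypothetical helper functions; it illustrates the comparison and is not the analysis behind the slide.

```python
def shares(abund):
    """Fraction of a pathway's abundance attributed to each species."""
    total = sum(abund.values())
    if total <= 0:
        raise ValueError("no abundance to attribute")
    return {species: v / total for species, v in abund.items()}


def dominant(abund):
    """Return the species with the largest share, and that share."""
    s = shares(abund)
    top = max(s, key=s.get)
    return top, s[top]


# Invented paired measurements for one pathway in one sample.
dna = {"species_A": 30.0, "species_B": 40.0, "species_C": 30.0}
rna = {"species_A": 90.0, "species_B": 5.0, "species_C": 5.0}

dna_top = dominant(dna)
rna_top = dominant(rna)
print("DNA:", dna_top)
print("RNA:", rna_top)
```

Here the DNA-level attribution is spread across three species, while a single species accounts for most of the RNA-level signal.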

12 The HMP2 IBD Multi'omics Data resource. Funded by National Institutes of Health, Dept. of Health and Human Services. With Ramnik Xavier

13 The IBD Multi'omics DataBase. Cesar Arze

14 The IBD metatranscriptome in the HMP2 IBDMDB. 117 subjects: 59 Crohn's disease, 34 ulcerative colitis, 24 non-IBD controls. Gender: 57 female, 59 male, 1 unknown. Cohorts: 32 MGH adult new onset, 30 Cedars-Sinai adult established, 31 Cincinnati peds new onset, 11 Emory peds new onset, 13 MGH peds new onset. [Figure: samples by collection week. Legend: Diagnosis (non-IBD, UC, CD); Data type (MG only, MG & MT)] Melanie Schirmer

15 Different microbes can transcribe shared pathways. [Figure: HISDEG-PWY: L-histidine degradation I, RNA and DNA panels, stratified by species. Legend: Alistipes_onderdonkii, Alistipes_putredinis, Alistipes_shahii, Bacteroides_cellulosilyticus, Bacteroides_dorei, Bacteroides_eggerthii, Bacteroides_faecis, Bacteroides_fragilis, Bacteroides_massiliensis, Bacteroides_ovatus, Bacteroides_sp_4_3_47FAA, Bacteroides_stercoris, Bacteroides_thetaiotaomicron, Bacteroides_uniformis, Bacteroides_vulgatus, Bacteroides_xylanisolvens, Clostridium_symbiosum, Parabacteroides_distasonis, Parabacteroides_merdae, unclassified, Other]

16 Different microbes can transcribe shared pathways. [Figure: same HISDEG-PWY: L-histidine degradation I RNA and DNA panels and species legend as slide 15] A. putredinis has been implicated in IBD. Major contributor to transcription in subsets of IBD patients.

17 Pathways can be contributed by different microbes over time. PWY-7094: fatty acid salvage. [Figure: time courses for individual patients, RNA and DNA panels. CD Patient 1 (axis labels G89309, G89361, G89364, G89332, G89296): Faecalibacterium_prausnitzii, unclassified. CD Patient 2 (axis labels G89354, G89297, G89287, G89316, G89330): Eubacterium_hallii, Faecalibacterium_prausnitzii, unclassified]

18 HUMAnN2 tutorial
