Introduction to taxonomic analysis of metagenomic amplicon and shotgun data with QIIME. Peter Sterk EBI Metagenomics Course 2014
1 Introduction to taxonomic analysis of metagenomic amplicon and shotgun data with QIIME Peter Sterk EBI Metagenomics Course
2 Taxonomic analysis using next-generation sequencing
Objective: we want to obtain samples from a particular environment to find out what lives in it.
Know your sample. What kind of samples do we have? Soil, water, host-associated (e.g. gut), etc. What do we expect to find in those samples? Prokaryotes, eukaryotic microorganisms (e.g. protists, fungi), viruses?
Decide what you want to find out, e.g. bacterial/archaeal populations, or all microbes (including eukaryotic ones). Design your experiment around that.
3 Some terminology
Amplicon: a DNA fragment that is amplified with PCR, e.g. one or more 16S rRNA variable regions, or other marker genes. Most researchers make use of standard PCR primers.
Clustering: grouping sequences in bins (or clusters) based on a percent similarity threshold.
Operational Taxonomic Unit (OTU): the working stand-in for a species in microbiology. Microbes are typically classified into the same or different OTUs using rDNA and a percent similarity threshold. Note that an OTU is distinct from a species. For bacteria/archaea, OTUs are typically clusters of reads that are at least 97% identical.
Barcode: a short DNA sequence that is added to each read during amplification and that is specific for a given sample. This allows samples to be mixed (multiplexed) to reduce sequencing cost. During analysis, sequences need to be demultiplexed, i.e. separated by sample.
4 Common approaches: amplicon-based
Sequencing of (regions of) target genes (amplicons) obtained by PCR using gene-specific primers. For bacteria/archaea, the target is usually a 16S rRNA gene fragment containing one or more variable regions; for fungi, the internal transcribed spacer (ITS); for eukaryotes, 18S rRNA gene fragments.
Analysis usually requires a reference database that is searched to find the closest match to an OTU, from which a taxonomic lineage is inferred. Some examples: Greengenes (16S), Ribosomal Database Project (16S), Silva (16S + 18S), Unite (ITS).
The approach is less suitable for certain groups of organisms such as protists: these are extremely diverse and only a few have sequence information available. The same goes for viruses.
We will mainly focus on 16S analysis during the hands-on as this is most common, but you must decide whether this is suitable for your work. We will also spend a little time on taxonomic analysis of Illumina shotgun data.
5 Hands-on QIIME tutorial
QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms. It is widely used and supported.
We will use the latest version of QIIME (Quantitative Insights Into Microbial Ecology; qiime.org; version 1.8; pronounced "chime") to analyze 26 soil samples from a diesel-contaminated railway site (Sutton et al. 2013). You will have an electronic copy of the paper with your training materials. We have randomly picked 5000 reads from the original Roche 454 dataset to speed up the analysis. We also provide a pre-computed analysis of the full dataset.
QIIME is used in the EBI metagenomics pipeline with whole genome shotgun data. EBI metagenomics currently does not analyze amplicon data as standard. However, with the help of this tutorial you could soon be analyzing your own amplicon data sets.
We will spend some time on the analysis of an Illumina shotgun dataset, a metagenome of a microbial consortium obtained from the Tuna oil field in the Gippsland Basin, Australia (Dongmei et al. and Sutcliffe et al. 2013).
6 OTU picking strategies in QIIME
De novo (I will explain this strategy in more detail):
- Use for amplicons that overlap
- Use if you do not have a reference sequence collection
- Clusters all reads without using a reference
- Not very suitable for very large data sets (cannot be run in parallel)
Closed-reference:
- Use if amplicons (or shotgun reads) do not overlap
- And you have a reference sequence collection
- Note: reads that do not hit a reference sequence are discarded
Open-reference:
- Use for amplicons that overlap
- Reads are clustered against a reference sequence collection
- Reads that do not match are clustered de novo
7 Common approaches: metagenomic analysis
Identification of reads with 16S sequence (e.g. using rRNASelector) followed by closed-reference OTU picking in QIIME. We will analyze an artificially small Illumina dataset during the hands-on.
Blast-based analysis, e.g. blasting reads against the NCBI non-redundant nucleotide or protein databases and inferring taxonomic lineage from the best hit. The tool MEGAN requires Blast output. A major drawback is that without preprocessing of NGS datasets and access to a major computational resource, this is not an option for most.
The MetaPhlAn approach relies on unique clade-specific marker genes identified from 3,000 reference genomes. It is fast, but limited to certain types of study (mainly human microbiome).
8 De novo OTU picking in detail
We will now go through the de novo OTU picking steps in more detail and focus on the diesel-contaminated railway line study. We will perform the actual analysis during the hands-on session today. We will largely follow the QIIME 454 overview tutorial.
Aim of our study: understand the interrelationship among microbial community composition, pollution level, and soil geochemical and physical properties.
Sequencing technology/chemistry: Roche 454 FLX Titanium.
Amplicon: V3 + V4 region of the 16S rRNA gene.
9 Overview of the diesel-contaminated railway site
Samples were taken from 9 locations at different depths:
Sample  Soil type  Status
A1      Fill       Polluted
A2      Fill       Polluted
B1      Fill       Clean
B2      Clay       Polluted
B3      Peat       Polluted
B4      Peat       Polluted
C1      Fill       Clean
C2      Peat       Clean
C3      Peat       Polluted
D1      Fill       Clean
D2      Clay       Clean
D3      Clay       Polluted
D4      Peat       Polluted
D5      Sand       Polluted
E1      Fill       Clean
E2      Fill       Polluted
F1      Sand       Clean
F2      Sand       Polluted
G1      Fill       Clean
G2      Fill       Clean
G3      Fill       Clean
H1      Peat       Clean
H2      Peat       Clean
H3      Sand       Clean
I1      Sand       Clean
I2      Sand       Clean
10 The targeted 16S rRNA gene region
The targeted region is a 466 bp fragment containing the 16S rRNA V3 and V4 hypervariable regions.
Each sample has a sequencing primer adapter and a 10-nucleotide barcode to allow multiplexing (sequencing all samples on the same plate, mainly to reduce sequencing cost).
The sequence file is in Roche 454 SFF format.
11 The analysis in detail (1)
File preparation: the standard 454 data format is SFF. We need to extract the FASTA sequences and the quality scores into two separate files. We will use the tool sffinfo from Roche.
FASTA file:
>GW6RNWL02GKV5K length=463 xy=2581_0822 region=2 run=r_2011_02_04_06_15_22_
ACATACGCGTCCTATGGGATGCAGCAGGCGCGAAAACTTTACAATGCCGGCAACGGCGAT
>GW6RNWL02HFI7P length=418 xy=2930_0883 region=2 run=r_2011_02_04_06_15_22_
ACATACGCGTCCTATGGGATGCAGCAGGCGCGAAAACTTTACAATGCTGGCAACAGCGAT...
AAGGGAACCTCGAGTGCCAGGTTACAAATCTGGCTGTCGAGATGCCTAAAAAGCATTTCA...
Quality file (headers shown):
>GW6RNWL02GKV5K length=463 xy=2581_0822 region=2 run=r_2011_02_04_06_15_22_
>GW6RNWL02HFI7P length=418 xy=2930_0883 region=2 run=r_2011_02_04_06_15_22_
12 The analysis in detail (2)
Assign reads to samples using barcode information and perform some quality control.
We need to provide a tab-delimited mapping file that provides, at a minimum, the name of each sample, the barcode that identifies the sample, the linker/primer sequence used to amplify the DNA, and a description of the sample:
#SampleID  BarcodeSequence  LinkerPrimerSequence  Description
A1         ACATACGCGT       CCTAYGGGRBGCASCAG     A1_Fill_Polluted
A2         ACGCGAGTAT       CCTAYGGGRBGCASCAG     A2_Fill_Polluted
B1         ACTACTATGT       CCTAYGGGRBGCASCAG     B1_Fill_Clean
etc.
For example, sequence reads that have the sequence ACATACGCGT near the start will be assigned to sample A1. The procedure we use renames the headers in the FASTA and quality files accordingly. It also removes the barcode and primer sequences from the reads, as these interfere with the OTU picking.
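The barcode assignment described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not QIIME's implementation: the function names are hypothetical, the barcode is required to sit exactly at the start of the read (the slide says "near the start"), and no mismatches or quality filtering are considered. The degenerate primer bases are expanded with the standard IUPAC nucleotide codes.

```python
# Standard IUPAC nucleotide codes: each code maps to the bases it can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, fragment):
    """True if the start of `fragment` is compatible with the degenerate primer."""
    if len(fragment) < len(primer):
        return False
    return all(base in IUPAC[code] for code, base in zip(primer, fragment))


def demultiplex(read, mapping):
    """Return (sample_id, read without barcode and primer), or (None, None).

    `mapping` is a hypothetical in-memory form of the mapping file:
    {sample_id: (barcode, linker_primer)}.
    """
    for sample_id, (barcode, primer) in mapping.items():
        if read.startswith(barcode):
            rest = read[len(barcode):]
            if primer_matches(primer, rest):
                return sample_id, rest[len(primer):]
    return None, None


# Rows taken from the mapping file on the slide.
mapping = {
    "A1": ("ACATACGCGT", "CCTAYGGGRBGCASCAG"),
    "A2": ("ACGCGAGTAT", "CCTAYGGGRBGCASCAG"),
    "B1": ("ACTACTATGT", "CCTAYGGGRBGCASCAG"),
}

# Start of the first read shown in the FASTA example of slide 11.
read = "ACATACGCGTCCTATGGGATGCAGCAGGCGCGAAAACTTTACAATGCCGGCAACGGCGAT"
sample, trimmed = demultiplex(read, mapping)
print(sample, trimmed)
```

The example read carries the A1 barcode followed by a sequence compatible with the degenerate primer, so it is assigned to A1 and returned with both removed.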
13 Optional: denoising 454 data (flowgram clustering)
A small number of reads from Roche 454 pyrosequencing runs contain characteristic errors where longer homopolymer runs are present. These reads give rise to erroneous OTUs. A procedure called denoising, or flowgram clustering, removes problematic reads and increases the accuracy of the taxonomic analysis.
Denoising is computationally expensive, so we will skip it in the hands-on. If you work with 454 amplicon data and your file uses the older regular flow pattern, consider denoising, and read the warning about the new random flow patterns. Remember that denoising does not make sense with shotgun data.
14 The analysis in detail (3)
Pick Operational Taxonomic Units. These are collections of sequences that are highly similar (here 97% or more). Taxonomic assignments are done on these OTUs. We will perform de novo OTU picking.
The QIIME workflow will produce a number of output files:
- A list of OTUs with taxonomic assignments following the hierarchy kingdom, phylum, class, order, family, genus, species. Most OTUs cannot be classified down to species level. E.g.: denovo745 k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__Rhizobiales; f__Rhizobiaceae; g__Agrobacterium; s__
- A representation of a taxonomic tree in Newick format. The tree can be visualized in applications like FigTree.
- A file in biom (Biological Observation Matrix) format representing OTU tables. We will import this file into MEGAN 5 to visualize our results.
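To make the lineage format concrete, here is a small Python sketch that splits such a string into ranks and reports the deepest rank that was actually assigned. It assumes Greengenes-style rank prefixes (`k__`, `p__`, ... with double underscores, and an empty name when a rank is unassigned); the function names are hypothetical and not part of QIIME.

```python
# Rank prefixes in hierarchy order (dicts keep insertion order in Python 3.7+).
RANKS = {
    "k": "kingdom", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}


def parse_lineage(lineage):
    """Turn 'k__Bacteria; p__...' into {rank: name or None}."""
    parsed = {}
    for field in lineage.split(";"):
        prefix, _, name = field.strip().partition("__")
        if prefix in RANKS:
            parsed[RANKS[prefix]] = name or None
    return parsed


def deepest_rank(parsed):
    """Name of the most specific rank that has an assignment."""
    deepest = None
    for rank in RANKS.values():
        if parsed.get(rank):
            deepest = rank
    return deepest


lineage = ("k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; "
           "o__Rhizobiales; f__Rhizobiaceae; g__Agrobacterium; s__")
parsed = parse_lineage(lineage)
print(deepest_rank(parsed), parsed["genus"])
```

For the denovo745 example, the species field is empty, so the OTU is classified only to genus level.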
15 De novo OTU picking in detail (1)
Generate OTUs by clustering reads based on similarity (default is 97%): sort reads according to size (long -> short), then cluster.
[Figure: reads clustered into OTU1 to OTU5]
16 De novo OTU picking in detail (2)
Pick a representative sequence for each OTU, then assign taxonomy to each OTU using a reference database.
[Figure: OTU1 to OTU5, each assigned a lineage (lineage 1 to lineage 5) from the reference database]
17 De novo OTU picking in detail (3)
- Align OTU sequences (if you want to do further phylogenetic analysis)
- Optional: remove chimaeras from your alignment
- Filter the alignment
- Create a tree file in Newick format
- Create an OTU table in biom format
We can now visualize the results and do further analysis, such as alpha-diversity analysis (diversity within a sample) and beta-diversity analysis (diversity across samples). We will first have a quick look at MEGAN 5, a tool we will use to visualize our results.
18 A quick look at MEGAN 5
MEGAN stands for MEtaGenome ANalyzer and was written to help understand the composition and operation of complex microbial consortia. It is free for academic users and available for download.
In order to use MEGAN for both functional analysis and taxonomic analysis, a Blast step needs to be performed whereby a metagenomic dataset is Blast-ed against e.g. one of NCBI's non-redundant nucleotide or protein databases. This step is extremely computationally expensive and not an option for many users.
Recently support for the BIOM format was added, which allows us to visualize and analyze taxonomic analysis results from QIIME. Select import BIOM from the File menu.
19 Taxonomic tree display in MEGAN 5
20 Rarefaction curves in MEGAN 5
21 Taxonomic composition of samples in MEGAN 5
22 Selecting 16S rDNA sequence with rRNASelector from shotgun data and closed-reference OTU picking with QIIME
Amplicon studies offer insight into the taxonomic diversity of samples, but they cannot be used to study function (or coding potential). For that we need shotgun data.
In an ideal world, to get the most out of your physical samples you'd prepare multiple libraries (amplicon, metagenomic, transcriptomic). In practice most people don't.
It is possible to get taxonomic information out of shotgun data. We'll discuss how we have approached this at the EBI.
rRNASelector (1): select reads containing rDNA.
rRNASelector (2): remove the non-rDNA sequence from those reads.
23 Closed-reference OTU picking
The set of clipped rDNA reads obtained with rRNASelector is clustered against a reference database.
[Figure: clipped reads are clustered with uclust against a 16S rDNA reference set; reads without a match (X) are dropped]
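A minimal Python sketch of the closed-reference idea follows: each read is compared with every reference sequence, assigned to the best-matching reference if the match reaches the threshold, and discarded otherwise. This is an illustration only. The reference identifiers and sequences are made up, the `identity` function is a crude position-by-position comparison rather than the alignment-based search a real tool performs, and the function names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions over the shorter sequence (no alignment)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / n


def pick_otus_closed_reference(reads, reference, threshold=0.97):
    """Assign each read to its best reference hit; reads without a hit are discarded."""
    otus = {}
    discarded = []
    for read in reads:
        best_id, best_score = None, 0.0
        for ref_id, ref_seq in reference.items():
            score = identity(read, ref_seq)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_id is not None and best_score >= threshold:
            otus.setdefault(best_id, []).append(read)
        else:
            discarded.append(read)
    return otus, discarded


# Made-up reference set and reads, for illustration only.
reference = {"ref_A": "ACGTACGTACGT", "ref_B": "GGGGCCCCGGGG"}
reads = ["ACGTACGTAC", "GGGGCCCCGG", "TTTTTTTTTT"]
otus, discarded = pick_otus_closed_reference(reads, reference)
print(otus, discarded)
```

Each read is handled independently of all the others, so the work can be split across processors, unlike the de novo approach where every read is compared with the clusters built so far.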
24 Further phylogenetic analyses: taxa summary
We can visualize the taxonomic composition of our samples. We will reproduce this figure during the hands-on session. We are looking at the composition at phylum level. A legend is also produced (not shown).
25 Further phylogenetic analyses: alpha diversity and rarefaction curves
Alpha diversity looks at the species diversity within samples.
If you produced more sequence from your sample, you would expect the number of species to increase up to a point beyond which producing more sequence does not significantly increase the number of observed species. You can perform rarefaction analysis on your sample to find out whether you have sequenced at sufficient depth.
Rarefaction analysis involves repeated in silico subsampling of your data at increasing depths. For example, if your sample consists of 1000 sequences, you could randomly sample 100 reads (with e.g. 10 repetitions), then 200, 300, etc. You can then plot the subsample size against the number of observed species. If the curves flatten, you have sequenced at sufficient depth.
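The subsampling procedure can be sketched as follows in Python. The sample is represented as a list with one OTU label per read, which is an assumption made for simplicity; the function name and the toy abundances are hypothetical, and observed OTUs stand in for observed species.

```python
import random


def rarefaction_curve(otu_labels, depths, repetitions=10, seed=0):
    """Mean number of distinct OTUs seen when subsampling at each depth.

    `otu_labels` holds one OTU label per read. Subsampling is without
    replacement and repeated `repetitions` times per depth.
    """
    rng = random.Random(seed)
    curve = {}
    for depth in depths:
        if depth > len(otu_labels):
            continue  # cannot draw more reads than the sample contains
        observed = [len(set(rng.sample(otu_labels, depth)))
                    for _ in range(repetitions)]
        curve[depth] = sum(observed) / repetitions
    return curve


# Toy sample of 1000 reads with one dominant, one common and two rare OTUs.
sample = ["OTU1"] * 600 + ["OTU2"] * 300 + ["OTU3"] * 90 + ["OTU4"] * 10
curve = rarefaction_curve(sample, depths=range(100, 1001, 100))
for depth, mean_observed in curve.items():
    print(depth, mean_observed)
```

Plotting depth against the mean number of observed OTUs gives the rarefaction curve. In this toy sample the rarest OTU is often missed at low depths, and the curve levels off at four OTUs as the depth approaches the full sample.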
26 Divergence measurements between organisms
Divergence-based diversity measures estimate the degree to which pairs of organisms differ:
- Sequence distance: a measure of sequence identity.
- Phylogenetic distance: the sum of the branch lengths that separate two organisms in a phylogenetic tree (see fig. A).
- Topological distance: as phylogenetic distance, but with all branch lengths set to the same value (usually 1).
- Taxonomic distance: the taxonomic level separating two organisms (e.g. same species = 1, same genus = 2, same family = 3, etc.).
Usually, where sequence data is available (e.g. 16S rRNA), sequence or phylogenetic distance measurements are most powerful. If phylogenetic trees with meaningful branch lengths are not available, but taxonomic relationships are well defined, topological or taxonomic distance measures can be used (most commonly for macroorganisms).
[Figure caption: PD for grey is the sum of the grey branches]
27 Measures of alpha diversity
A community that contains taxa that are more divergent from each other is more diverse. There are many ways to measure alpha diversity. A few examples:
- Phylogenetic Diversity (PD): measures the total sum of the branch lengths in a phylogenetic tree that lead to each community member. A qualitative measure of divergence.
- Theta: measures the average divergence between two randomly chosen sequences (individuals). Quantitative, as it accounts for both evenness and divergence between taxa (low evenness: numerical dominance of a few species).
- Chao 1: species-based qualitative measure.
- Shannon: species-based quantitative measure.
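Two of the species-based measures are easy to compute from a vector of OTU counts, as this Python sketch shows. It uses the standard textbook definitions: the Shannon index H = -sum(p_i * log p_i), here with log base 2 (the base is a convention and other tools may use a different one), and the bias-corrected form of Chao 1, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs seen exactly once and exactly twice. The function names and example counts are hypothetical.

```python
import math


def shannon(counts):
    """Shannon index (log base 2) from a list of per-OTU read counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao 1 richness estimate from per-OTU read counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


even = [10, 10, 10, 10]     # four OTUs, equally abundant
uneven = [37, 1, 1, 1]      # four OTUs, one dominant
print(shannon(even), shannon(uneven))
print(chao1([10, 5, 1, 1, 1, 2]))
```

Both toy communities contain four OTUs, but the Shannon index is lower for the uneven one because it takes relative abundance into account. Chao 1 uses the rare OTUs (singletons and doubletons) to estimate how many OTUs were missed, so its estimate can exceed the observed count.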
28 Further phylogenetic analyses: beta diversity
Beta diversity analysis compares diversity between the samples in your study. We calculate the distance between a pair of samples, and we do this for all pairs of samples. We obtain a distance matrix that we can visualize in a number of ways, e.g. as a tree, a network or a principal coordinates (PCoA) plot.
During the hands-on we will generate PCoA plots to visualize the distances between our samples in 3-dimensional space. We'll have a separate tutorial on visualization with Emperor.
As our samples show variation in sequencing depth, we will use the number of reads from the smallest sample as our sequencing depth and rarefy all other samples to this depth.
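The two steps on this slide, rarefying every sample to the depth of the smallest one and then computing all pairwise distances, can be sketched in Python. Note the substitution: the distance used here is Bray-Curtis dissimilarity, chosen because it needs no phylogenetic tree, and not the UniFrac measures discussed in this course. The OTU tables, sample contents and function names are made up for illustration.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample an {otu: count} table to exactly `depth` reads, without replacement."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rarefied = {}
    for otu in rng.sample(pool, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {otu: count} tables (0 = identical)."""
    shared = sum(min(a.get(o, 0), b.get(o, 0)) for o in set(a) | set(b))
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


def distance_matrix(samples, seed=0):
    """Rarefy all samples to the smallest sample's depth, then compare all pairs."""
    rng = random.Random(seed)
    depth = min(sum(counts.values()) for counts in samples.values())
    even = {name: rarefy(counts, depth, rng) for name, counts in samples.items()}
    names = sorted(even)
    matrix = [[bray_curtis(even[x], even[y]) for y in names] for x in names]
    return names, matrix


# Made-up OTU counts for three samples with different sequencing depths.
samples = {
    "A1": {"OTU1": 50, "OTU2": 50},
    "B1": {"OTU1": 30, "OTU3": 30},
    "C1": {"OTU4": 200},
}
names, matrix = distance_matrix(samples)
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

The resulting square, symmetric matrix is the kind of input that a PCoA, a tree or a network visualization starts from.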
29 Measures of community distance: UniFrac
There are many ways to measure beta diversity (for a summary see e.g. Lozupone and Knight, 2009).
Divergence-based measures: communities are considered more related if the taxa they contain are more closely related.
- UniFrac (qualitative): measures the phylogenetic distance between sets of taxa in a tree.
- Weighted UniFrac (quantitative): a variation of UniFrac that accounts for changes in the relative abundance of lineages between communities.
Quantitative measures depend on accurate information on the relative abundance of sequences (which could be biased by lab procedures).
UniFrac allows you to:
- Determine if the environments in the input phylogenetic tree have significantly different microbial communities.
- Determine if community differences are concentrated within particular lineages of the phylogenetic tree.
- Cluster environments to determine whether there are environmental factors (such as temperature or salinity) that group communities together.
- Determine whether the environments were sampled sufficiently to support cluster nodes.
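The qualitative (unweighted) UniFrac distance is commonly defined as the fraction of branch length in the tree that leads to taxa from only one of the two communities. The Python sketch below computes it under a simplifying assumption: the tree is given as a flat list of branches, each with its length and the set of tips below it, instead of being parsed from a Newick file. The toy tree, tip names and function name are hypothetical.

```python
def unweighted_unifrac(branches, community_a, community_b):
    """Unique branch length divided by total branch length covered by either community.

    `branches` is a list of (length, set_of_descendant_tips) pairs.
    `community_a` and `community_b` are sets of tip names present in each sample.
    """
    unique = 0.0
    total = 0.0
    for length, tips in branches:
        in_a = bool(tips & community_a)
        in_b = bool(tips & community_b)
        if in_a or in_b:
            total += length
            if in_a != in_b:  # branch leads to only one of the two communities
                unique += length
    return unique / total if total else 0.0


# Toy tree ((t1:1,t2:1):2,(t3:1,t4:1):2) written out branch by branch.
branches = [
    (1.0, {"t1"}), (1.0, {"t2"}), (2.0, {"t1", "t2"}),
    (1.0, {"t3"}), (1.0, {"t4"}), (2.0, {"t3", "t4"}),
]

print(unweighted_unifrac(branches, {"t1", "t2"}, {"t3", "t4"}))  # no shared branches
print(unweighted_unifrac(branches, {"t1", "t3"}, {"t2", "t3"}))  # partly shared
print(unweighted_unifrac(branches, {"t1", "t3"}, {"t1", "t3"}))  # identical
```

In the second comparison, the two tip branches leading to t1 and t2 (total length 2) are unique to one community each, while the branches shared through the common ancestor of t1 and t2 and through t3 add up to length 5, giving a distance of 2/7. Only presence or absence of taxa matters here; the weighted variant would additionally weight each branch by abundance differences.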
30 QIIME analysis of Illumina amplicon data
- Data preparation differs from 454 analysis.
- Closed-reference OTU picking can be parallelized and is therefore preferred.
- For demultiplexing you need a mapping file (as discussed for 454), the fastq file containing the barcode sequences and the fastq file containing the reads.
- It is also possible to demultiplex samples if your data is from multiple lanes. For details see the corresponding QIIME tutorial.
Note: for a full HiSeq2000 run, this process can take up to 500 CPU hours!
31 Finally
This concludes the introduction to taxonomic analysis with QIIME. If taxonomic analysis is important to your work, then do spend time going through the different QIIME tutorials. Thank you.
More informationOptimizing taxonomic classification of marker gene amplicon sequences
1 2 3 4 5 Optimizing taxonomic classification of marker gene amplicon sequences Nicholas A. Bokulich 1# *, Benjamin D. Kaehler 2# *, Jai Ram Rideout 1, Matthew Dillon 1, Evan Bolyen 1, Rob Knight 3, Gavin
More informationNext-generation sequencing and quality control: An introduction 2016
Next-generation sequencing and quality control: An introduction 2016 s.schmeier@massey.ac.nz http://sschmeier.com/bioinf-workshop/ Overview Typical workflow of a genomics experiment Genome versus transcriptome
More informationMeasuring the human gut microbiome: new tools and non alcoholic fatty liver disease
Western University Scholarship@Western Electronic Thesis and Dissertation Repository July 2016 Measuring the human gut microbiome: new tools and non alcoholic fatty liver disease Ruth G. Wong The University
More informationIntroduction to Microbial Sequencing
Introduction to Microbial Sequencing Matthew L. Settles Genome Center Bioinformatics Core University of California, Davis settles@ucdavis.edu; bioinformatics.core@ucdavis.edu General rules for preparing
More informationFungal ITS Bioinformatics Efforts in Alaska
Fungal ITS Bioinformatics Efforts in Alaska D. Lee Taylor ltaylor@iab.alaska.edu Institute of Arctic Biology University of Alaska Fairbanks Shawn Houston Minnesota Supercomputing Institute University of
More informationExercices: Metagenomics. Find Rapidly OTU with Galaxy Solution
Exercices: Metagenomics Find Rapidly OTU with Galaxy Solution F R É D É R I C E S C U D I É * a n d L U C A S A U E R *, M A R I A B E R N A R D, L A U R E N T C A U Q U I L, K AT I A V I D A L, S A R
More informationBioinformatic tools for metagenomic data analysis
Bioinformatic tools for metagenomic data analysis MEGAN - blast-based tool for exploring taxonomic content MG-RAST (SEED, FIG) - rapid annotation of metagenomic data, phylogenetic classification and metabolic
More informationMethods for the phylogenetic inference from whole genome sequences and their use in Prokaryote taxonomy. M. Göker, A.F. Auch, H. P.
Methods for the phylogenetic inference from whole genome sequences and their use in Prokaryote taxonomy M. Göker, A.F. Auch, H. P. Klenk Contents 1. Short introduction into the role of DNA DNA hybridization
More informationIntegrating Evolutionary, Ecological and Statistical Approaches to Metagenomics. A proposal to the Gordon and Betty Moore Foundation
Integrating Evolutionary, Ecological and Statistical Approaches to Metagenomics A proposal to the Gordon and Betty Moore Foundation Jonathan A. Eisen University of California, Davis U. C. Davis Genome
More informationEvaluation of the liver abscess microbiome and liver abscess prevalence in cattle reared for production of natural branded beef
Evaluation of the liver abscess microbiome and liver abscess prevalence in cattle reared for production of natural branded beef K.L. Huebner, J.N. Martin C.J. Weissend, K.L. Holzer, M. Weinroth, Z. Abdo,
More informationA FRAMEWORK FOR ANALYSIS OF METAGENOMIC SEQUENCING DATA
A FRAMEWORK FOR ANALYSIS OF METAGENOMIC SEQUENCING DATA A. MURAT EREN Department of Computer Science, University of New Orleans, 2000 Lakeshore Drive, New Orleans, LA 70148, USA Email: aeren@uno.edu MICHAEL
More informationNext Gen Sequencing. Expansion of sequencing technology. Contents
Next Gen Sequencing Contents 1 Expansion of sequencing technology 2 The Next Generation of Sequencing: High-Throughput Technologies 3 High Throughput Sequencing Applied to Genome Sequencing (TEDed CC BY-NC-ND
More informationStrain/species identification in metagenomes using genome-specific markers. Tu, He and Zhou Nucleic Acids Research
Strain/species identification in metagenomes using genome-specific markers. Tu, He and Zhou. 2014 Nucleic Acids Research Journal Club Triinu Kõressaar 25.04.2014 Introduction (1/2) Shotgun metagenome sequencing
More informationCBC Data Therapy. Metatranscriptomics Discussion
CBC Data Therapy Metatranscriptomics Discussion Metatranscriptomics Extract RNA, subtract rrna Sequence cdna QC Gene expression, function Institute for Systems Genomics: Computational Biology Core bioinformatics.uconn.edu
More informationGetting of the representative sequences from the clusters (consensus/most abundant) *(MAFFT) Identification of OTUs *(BLAST)
Illumina pair-end data (R1 & R2 FASTQ) FASTA FASTQ TEXT joining of pair-end data *(fastq-join) v2.0 Quality filtering/sequence trimming/removing of ambiguous bases Grouping sequences by BARCODE motives
More informationAdvisors: Prof. Louis T. Oliphant Computer Science Department, Hiram College.
Author: Sulochana Bramhacharya Affiliation: Hiram College, Hiram OH. Address: P.O.B 1257 Hiram, OH 44234 Email: bramhacharyas1@my.hiram.edu ACM number: 8983027 Category: Undergraduate research Advisors:
More informationDistribution-Based Clustering: Using Ecology To Refine the Operational Taxonomic Unit
Distribution-Based Clustering: Using Ecology To Refine the Operational Taxonomic Unit The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters.
More informationSUPPLEMENTARY INFORMATION
SUPPLEMENTARY INFORMATION doi:10.1038/nature12212 Supplementary Discussion Contamination Assessment We evaluated the amount of human contamination in our viral DNA preparations by identifying sequences
More informationJianguo (Jeff) Xia, Assistant Professor McGill University, Quebec Canada June 26, 2017
Jianguo (Jeff) Xia, Assistant Professor McGill University, Quebec Canada jeff.xia@mcgill.ca www.xialab.ca June 26, 2017 Metabolomics http://metaboanalyst.ca Systems transcriptomics http://networkanalyst.ca
More informationMicroSEQ TM ID Rapid Microbial Identification System:
MicroSEQ TM ID Rapid Microbial Identification System: the complete solution for reliable genotypic microbial identification 1 The world leader in serving science Rapid molecular methods for pharmaceutical
More informationMicrobial sequencing solutions
Microbial sequencing solutions Scalable, simple, fast TARGETED GENOME Sequencing f every lab, every budget, every application Ion Trent semiconduct sequencing Ion Trent technology has pioneered an entirely
More informationProtist diversity along a salinity gradient in a coastal lagoon
The following supplement accompanies the article Protist diversity along a salinity gradient in a coastal lagoon Sergio Balzano*, Elsa Abs, Sophie C. Leterme *Corresponding author: sergio.balzano@nioz.nl
More information16s Metagenomic Analysis Tutorial Max Planck Society
We have made it easy for you to find a PDF Ebooks without any digging. And by having access to our ebooks online or by storing it on your computer, you have convenient answers with 16s metagenomic analysis
More informationAnalyzing the Leaf Microbiome. Jason Wallace Cornell University
Analyzing the Leaf Microbiome Z across 270 Diverse Maize Lines Jason Wallace Cornell University Harnessing the crop microbiome Need to unravel the complex interactions among: Plant genotype Brian Marshall
More informationSupplementary Information for
Supplementary Information for Microbial community dynamics and stability during an ammonia- induced shift to syntrophic acetate oxidation Jeffrey J. Werner 1,2, Marcelo L. Garcia 3, Sarah D. Perkins 3,
More informationMicroSEQ Rapid Microbial Identifi cation System
APPLICATION NOTE MicroSEQ Rapid Microbial Identifi cation System MicroSEQ Rapid Microbial Identification System Giving you complete control over microbial identifi cation using the gold-standard genotypic
More informationGenome Sequence Assembly
Genome Sequence Assembly Learning Goals: Introduce the field of bioinformatics Familiarize the student with performing sequence alignments Understand the assembly process in genome sequencing Introduction:
More informationAssessing barley malt associated microbial diversity using next generation sequencing
Assessing barley malt associated microbial diversity using next generation sequencing Mandeep Kaur 1, Evan Evans 1, Doug Stewart 2, Agnieszka Janusz 2, Barbara Holland 1 and John Bowman 1 1 University
More informationIntroductie en Toepassingen van Next-Generation Sequencing in de Klinische Virologie. Sander van Boheemen Medical Microbiology
Introductie en Toepassingen van Next-Generation Sequencing in de Klinische Virologie Sander van Boheemen Medical Microbiology Next-generation sequencing Next-generation sequencing (NGS), also known as
More informationLecture 7. Next-generation sequencing technologies
Lecture 7 Next-generation sequencing technologies Next-generation sequencing technologies General principles of short-read NGS Construct a library of fragments Generate clonal template populations Massively
More informationIntroduction to Microbiome Omics Technologies
BICF Education Monthly Topics in Bioinformatics and Genomics https://portal.biohpc.swmed.edu/content/training/ BICF Astrocyte Workflows in Sequence Variation, RNASeq, ChipSeq, CRISPR BICF Data Resources
More informationMICROBIOMICS Current and future tools of the trade
MICROBIOMICS Current and future tools of the trade Ingeborg Klymiuk Core Facility Molecular Biology ZMF - CENTER FOR MEDICAL RESEARCH Medical University Graz MICROBIOMICS DEFINITION OF OMIC TECHNOLOGIES
More informationSO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS
SO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS SETTLES@UCDAVIS.EDU Bioinformatics Core Genome Center UC Davis BIOINFORMATICS.UCDAVIS.EDU DISCLAIMER This talk/workshop
More informationDNA. bioinformatics. genomics. personalized. variation NGS. trio. custom. assembly gene. tumor-normal. de novo. structural variation indel.
DNA Sequencing T TM variation DNA amplicon mendelian trio genomics NGS bioinformatics tumor-normal custom SNP resequencing target validation de novo prediction personalized comparative genomics exome private
More informationAnalysis of milk microbial profiles using 16s rrna gene sequencing in milk somatic cells and fat
Analysis of milk microbial profiles using 16s rrna gene sequencing in milk somatic cells and fat Juan F. Medrano Anna Cuzco* Alma Islas-Trejo Armand Sanchez* Olga Francino* Dept. of Animal Science University
More informationUSEARCH software and documentation Copyright Robert C. Edgar All rights reserved.
USEARCH software and documentation Copyright 2010-11 Robert C. Edgar All rights reserved http://drive5.com/usearch robert@drive5.com Version 5.0 August 22nd, 2011 Contents Introduction... 3 UCHIME implementations...
More informationSupplementary Online Content
Supplementary Online Content Pannaraj PS, Li F, Cerini C, et al. Association between breast milk bacterial communities and establishment and development of the infant gut microbiome. JAMA Pediatr. Published
More informationGenetic Sequencing Methodologies to Assess Human Contributions of Fecal Coliforms to a Freshwater Receiving Stream Introduction Sample Collection
Genetic Sequencing Methodologies to Assess Human Contributions of Fecal Coliforms to a Freshwater Receiving Stream D. A Graves1, D. E. Chestnut1, E. B. Rabon1, W. J. Jones2, J. G. Moore3, C. Johnston3
More informationIntroduction to metagenome assembly. Bas E. Dutilh Metagenomic Methods for Microbial Ecologists, NIOO September 18 th 2014
Introduction to metagenome assembly Bas E. Dutilh Metagenomic Methods for Microbial Ecologists, NIOO September 18 th 2014 Sequencing specs* Method Read length Accuracy Million reads Time Cost per M 454
More informationExperimental Design. Dr. Matthew L. Settles. Genome Center University of California, Davis
Experimental Design Dr. Matthew L. Settles Genome Center University of California, Davis settles@ucdavis.edu What is Differential Expression Differential expression analysis means taking normalized sequencing
More informationWelcome to the NGS webinar series
Welcome to the NGS webinar series Webinar 1 NGS: Introduction to technology, and applications NGS Technology Webinar 2 Targeted NGS for Cancer Research NGS in cancer Webinar 3 NGS: Data analysis for genetic
More informationLecture 01: Overview of Metagenomics
Lecture 01: Overview of Metagenomics 1 Culture Independent Techniques: Metagenomics Universal Gene census Shotgun Metagenome Sequencing Transcriptomics (shotgun mrna) Proteomics (protein fragments) Metabolomics
More information