
1 choose MBL-REGISTER user: dm00834 password: dm

2 stamps.mbl.edu this uses the username and password on your STAMPS name badge


6 Strategies for Analysis of Microbial Population Structures
- Tag/Marker Gene Metagenomics (sequencing amplified markers, amplicons, e.g. SSU rRNA): Who's in the community?
- Shotgun Metagenomics (sequencing total community DNA): What is the functional potential of the community? (Partial) assembly of community genomes
- Metatranscriptomics (sequencing tag or shotgun cDNA): What fraction of the community is active? What is the community doing?
- Metaproteomics (sequencing community peptides)
- Meta-metabolomics (detecting and quantifying metabolites used by the community)
- Total Community Sampling vs Targeted or Enriched Sampling

7 Tag/Marker Gene Sequencing Metagenomics
0. Design your experiment to meet your goals!
- Sampling depth (more reads/sample) vs. sampling breadth (more samples)
- Targeted approach vs trying to sample everything
- Importance of rare taxa?
- Importance of biological replicates
- Collect appropriate contextual metadata
- Standardize experimental approach before you begin

8 Tag/Marker Gene Sequencing Metagenomics
1. Design primers to a region of interest
- Usually SSU rRNA (16S)
- Conserved primers flanking variable region(s)
- Length dependent on sequencing platform
2. Extract DNA (RNA) (2a. Create cDNA from RNA)
3. Amplify DNA
All of these steps introduce biases!
4. Sequence
Also biased, but probably less so than the earlier steps!

9 Tag/Marker Gene Sequencing Metagenomics
5. Clean up the data
- Remove low quality reads
- Remove other error-prone reads
- Remove chimeric reads
- Remove non-target reads
6. Organize the reads into units
- Assign reads to a taxonomic identifier
- Assign reads to a phylogenetic clade
- Group reads into (or assign reads to) Operational Taxonomic Units (OTUs), by sequence similarity or by sequence information
(Robert Edgar, Today; Meren, Wednesday; Susan Holmes, Wed; Rob Knight, Thursday; Tracy Teal, Friday)

10 Tag/Marker Gene Sequencing Metagenomics
7. Analyze your groups of sequences
- Assumption: each read represents an individual
- Assumption: steps 1-6 are effectively unbiased, or equally biased across samples
- Quandary: what to do about low-abundance groups (singletons)?
- Quandary: how to compare samples of unequal size?
Within a sample or combined samples
- Richness, number of observed and estimated groups
- Evenness, relative abundance of the groups
Between samples or groups of samples
- Beta diversity
- Relating diversity to metadata
(Amy Willis, Tues; Robert Edgar, Tues; Meren, Wednesday; Susan Holmes, Wed; Rob Knight, Thurs; Tracy Teal, Friday)
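The within-sample and between-sample quantities listed on this slide can be sketched in a few lines. This is only an illustration: the slide does not name specific metrics, so Pielou's evenness and Bray-Curtis dissimilarity are assumed choices here, and the two count vectors are made-up toy data.

```python
import math

def richness(counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)

def pielou_evenness(counts):
    """Shannon entropy divided by its maximum, ln(richness); 1.0 means perfectly even."""
    total = sum(counts)
    s = richness(counts)
    if s < 2:
        return 0.0
    shannon = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return shannon / math.log(s)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors indexed by the same OTUs."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))

# Hypothetical tag counts per OTU for two samples (same OTU order in both).
sample_a = [50, 30, 15, 4, 1]
sample_b = [10, 10, 0, 0, 0]

print("richness:", richness(sample_a), richness(sample_b))
print("evenness:", round(pielou_evenness(sample_a), 3), round(pielou_evenness(sample_b), 3))
print("Bray-Curtis:", round(bray_curtis(sample_a, sample_b), 3))
```

Note that the two toy samples have different total tag counts (100 vs 20), which feeds directly into the Bray-Curtis value on raw counts; that is one concrete face of the "samples of unequal size" quandary above.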

11 An Abridged History of DNA Sequencing
- 1971 Wu & Taylor sequence overhang of phage λ (12 bp)
- 1974 Sogin, Woese and Pace sequence a 5S rRNA (116 nt)
- 1977 Sanger et al. sequence phiX174 (5,375 bp)
- 1982 Sanger et al. sequence phage λ (48,501 bp)
- 1985 Pace, Olsen, Sogin, and others sequence 16S rRNA genes
- 1996 ABI Prism 310 Genetic Analyzer
- 1998 Phred automated base calling becomes standard
- 1998 ABI 3700; high-throughput capillary sequencing becomes widely available
- GS Solexa Genome Analyzer
- 2011 Illumina HiSeq
- + PacBio, Nanopore, 10X, etc...

12 Cost of DNA Sequencing
- $10,000/bp (first sequencing)
- 1977: $10/bp (first Sanger sequencing)
- 1996: $1/bp (ABI 310)
- 1998: $0.1/bp (ABI 3700)
- 2001: $0.001/bp (ABI3730)
Figure: cost per Mb, Sep-01 through Mar-15.
Wetterstrand KA. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP).

13 Q: To what degree does your research drive next-gen sequencing technology?

14 The Pattern of Perception (The Collins Curve)
Figure labels: initial idea, disillusionment, true acceptance; axis: performance; curves: technology 1, technology 2, technology 3.

15 Sanger Capillary ABI Sequencing
- 96 reactions, 48 clones, 30.5 MB
- 317K M10MaGB08.b.ab1
- 317K M10MaGB08.g.ab1
Figure: ABI 3700 chromatogram

16 Phred Quality Scores
Q = -10 \log_{10}(P)
Q    P (probability that the base call is wrong)
10   1 in 10
20   1 in 100
30   1 in 1,000
40   1 in 10,000
In the era of high-throughput capillary sequencing, the gold standard against which next-gen methods are measured, the goal was to produce reads with an average Phred score of 20.
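The relationship Q = -10 log10(P) can be checked with a short sketch. The function names below are illustrative, not taken from any particular tool.

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

# Reproduce the table on the slide: one expected error per 10, 100, 1,000, 10,000 bases.
for q in (10, 20, 30, 40):
    bases_per_error = round(1 / phred_to_error_prob(q))
    print(f"Q{q}: 1 error in {bases_per_error:,} bases")
```

An average score of 20, the capillary-era target mentioned above, therefore corresponds to roughly one miscalled base per hundred.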

17 Illumina HiSeq/MiSeq/NextSeq
275M KEG1_ACTTGA_L004_R1_001.fastq.gz
288M KEG1_ACTTGA_L004_R1_002.fastq.gz
277M KEG1_ACTTGA_L004_R1_003.fastq.gz
286M KEG1_ACTTGA_L004_R1_004.fastq.gz
279M KEG1_ACTTGA_L004_R1_005.fastq.gz
286M KEG1_ACTTGA_L004_R1_006.fastq.gz
279M KEG1_ACTTGA_L004_R1_007.fastq.gz
283M KEG1_ACTTGA_L004_R1_008.fastq.gz
279M KEG1_ACTTGA_L004_R1_009.fastq.gz
281M KEG1_ACTTGA_L004_R2_001.fastq.gz
293M KEG1_ACTTGA_L004_R2_002.fastq.gz
283M KEG1_ACTTGA_L004_R2_003.fastq.gz
292M KEG1_ACTTGA_L004_R2_004.fastq.gz
285M KEG1_ACTTGA_L004_R2_005.fastq.gz
291M KEG1_ACTTGA_L004_R2_006.fastq.gz
285M KEG1_ACTTGA_L004_R2_007.fastq.gz
287M KEG1_ACTTGA_L004_R2_008.fastq.gz
286M KEG1_ACTTGA_L004_R2_009.fastq.gz
75,000,000 reads
375,000,000 reads per lane
5.1GB of data... from one lane

18 fastq
2:N:0:ACTTGA
2:N:0:ACTTGA
ATTCATGTCGCTGATGAAAGGCGCTGGAGCCAACATTCTCAGAGGCATTGCTGGCGCTGGTGTCCTATCAGGCTTCGACAAGCT
+
2:N:0:ACTTGA
TTGTAAACTTTGAAATTAAAACTCAAAATGGGTAGCCTTTATCAGAGTTCTATTCTCAATTCTATTTTAAAATGGACCAAGCTC
+
2:N:0:ACTTGA
GAGCCCAATTCCGCAGAAACTTGCCATCAAAGGCCATCGGAAGAAGAGTGCGAGTTATGTTGTATCCTTCAAGTATTTAAAAAG
+
2:N:0:ACTTGA
GTTTGATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTAAAATTCCAAATTTACCG
+
2:N:0:ACTTGA
ACTAATAGAATACTTTAACTGCCTCTACGTGAGAGTGGAAAGCAATAGGAAGGCGGGGTGATCACTCTAAATGTACAGAATTAT
+
2:N:0:ACTTGA

19 fastq format
Header fields: machine, run, flowcell, lane, tile, pair (1, 2, 0), x:y coordinates, failed filter?, multiplex
2:N:0:ACTTGA
CAACTGCAGACGTTTGGTCCCAAAAGATAAAGCGATCAAAAAATTCATTATCAGAAATATTGTCGAGGCTGCAGCTGTCAGAGA
Phred quality score from 0 to 93, encoded using ASCII 33 to 126: ! # $ ... A B C D ... } ~
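As a rough sketch of how these fields and the ASCII-33 quality encoding are read in practice, the snippet below splits a header and decodes a quality string. It assumes the Illumina CASAVA 1.8-style layout (`instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`); the first half of the example header and the quality string are hypothetical, since only the `2:N:0:ACTTGA` part survives in the transcript.

```python
def parse_header(line):
    """Split an Illumina-style FASTQ header line into named fields."""
    if not line.startswith("@"):
        raise ValueError("FASTQ header lines start with '@'")
    left, right = line[1:].strip().split(" ")
    instrument, run, flowcell, lane, tile, x, y = left.split(":")
    read, filtered, control, index = right.split(":")
    return {
        "instrument": instrument, "run": run, "flowcell": flowcell,
        "lane": int(lane), "tile": int(tile), "x": int(x), "y": int(y),
        "read": int(read),                  # member of the pair
        "failed_filter": filtered == "Y",   # Y = read failed the filter
        "control": control,
        "index": index,                     # multiplex barcode
    }

def decode_quality(qual, offset=33):
    """Convert a quality string to Phred scores using the given ASCII offset."""
    return [ord(ch) - offset for ch in qual]

# Hypothetical header: only the part after the space appears on the slide.
header = "@MACHINE01:42:FLOWCELLX:4:1101:1234:5678 2:N:0:ACTTGA"
fields = parse_header(header)
print(fields["lane"], fields["read"], fields["failed_filter"], fields["index"])
print(decode_quality("!5I~"))  # '!' is the lowest score, '~' the highest
```

With an offset of 33, the printable characters `!` through `~` cover scores 0 through 93, which is where the range on the slide comes from.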

20 Tag Sequencing Metagenomics
7. Analyze your groups of sequences
- Assumption: each read represents an individual
- Assumption: steps 1-6 are effectively unbiased, or equally biased across samples
- Quandary: what to do about low-abundance groups (singletons)?
- Quandary: how to compare samples of unequal size?
Within a sample or combined samples
- Richness, number of observed and estimated groups
- Evenness, relative abundance of the groups
Between samples or groups of samples
- Beta diversity
- Relating diversity to metadata

21 Operational Taxonomic Units (OTUs)
Figure: 26 reads, 14 sequences, 5 OTUs. Labels: one read, one sequence (singleton OTU); five reads, one sequence; ten reads, one sequence.

22 Rank Abundance Curve
- One OTU has 647,903 tags
- 80,664 OTUs have one tag each
Figure: tags/OTU plotted against OTUs ranked by abundance; annotations mark observed evenness and observed richness.

23 Species Abundance Curve
Figure axes: tags/OTU, number of OTUs

24 Species Abundance Curve
- How many OTUs have zero tags each (missing)?
- 80,664 OTUs have one tag each (singletons)
- 18,817 OTUs have two tags each (doubletons)
- 1 OTU has 1,993 tags
- One OTU has 647,903 tags
Total obs OTUs + missing OTUs = Total alpha diversity: 160, ,609 = 637,763
Figure axes: tags/OTU, number of OTUs
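One common way to approach the "how many OTUs are missing?" question is the Chao1 estimator, which uses only the singleton and doubleton counts. The sketch below is an illustration of that technique, not a reconstruction of the slide: the totals shown on the slide may well come from a different (for example parametric) estimator, so the numbers are not expected to match.

```python
from collections import Counter

def abundance_frequencies(otu_counts):
    """Species abundance distribution: tags-per-OTU -> number of OTUs with that many tags."""
    return Counter(c for c in otu_counts if c > 0)

def chao1_missing(singletons, doubletons):
    """Chao1 estimate of the number of unobserved OTUs, F1^2 / (2 * F2)."""
    if doubletons == 0:
        # Bias-corrected form, used when there are no doubletons.
        return singletons * (singletons - 1) / 2
    return singletons ** 2 / (2 * doubletons)

# Toy sample: tag counts per OTU (hypothetical).
freqs = abundance_frequencies([120, 40, 5, 2, 2, 1, 1, 1, 1])
print("singletons:", freqs[1], "doubletons:", freqs[2])
print("estimated missing OTUs (toy):", chao1_missing(freqs[1], freqs[2]))

# Singleton and doubleton counts quoted on the slide.
print("estimated missing OTUs (slide counts):", round(chao1_missing(80_664, 18_817)))
```

The estimate depends entirely on the two rarest abundance classes, which is exactly where sequencing error and clustering artifacts concentrate.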

25 Rarefaction Curve: What is it?
Figure axes: OTUs observed, tags sampled

26 Rarefaction Curve: what is it?
Species Abundance Curve: original sample has 78,650 tags in 3,320 OTUs
Figure axes: number of OTUs, tags/OTU

27 Rarefaction Curve: what is it?
- The original sample has 78,650 tags in 3,320 OTUs (species abundance curve)
- Create a pseudo-replicate sample with n tags by removing 78,650 - n tags from the SA curve
- Count how many OTUs have at least one tag
- Example: remove 60,650 tags; 18,000 tags remain; 1,604 OTUs still represented
- Repeat for n = 100, 200, or other interval
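The sub-sampling procedure described above can be sketched as follows. This is a minimal illustration on made-up counts: it draws n tags without replacement (equivalent to removing total - n tags) and counts the OTUs that keep at least one tag. Function names and the fixed random seed are choices made for this example.

```python
import random

def rarefy(otu_counts, n, rng):
    """Draw n tags without replacement and return how many OTUs are still represented."""
    # One list entry per tag, labelled by the OTU it belongs to.
    tags = [otu for otu, count in enumerate(otu_counts) for _ in range(count)]
    kept = rng.sample(tags, n)
    return len(set(kept))

def rarefaction_curve(otu_counts, step, seed=0):
    """(tags sampled, OTUs observed) pairs for n = step, 2*step, ... up to the sample size."""
    rng = random.Random(seed)
    total = sum(otu_counts)
    return [(n, rarefy(otu_counts, n, rng)) for n in range(step, total + 1, step)]

# Hypothetical sample: a few abundant OTUs and a tail of rare ones.
counts = [500, 200, 100, 50, 20, 10, 5, 2, 2, 1, 1, 1, 1, 1]
for n, observed in rarefaction_curve(counts, step=100):
    print(n, observed)
```

A single draw per depth gives a noisy curve; averaging several draws at each n would smooth it. Abundant OTUs appear in essentially every pseudo-replicate, while the rare ones drop in and out.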

28 Rarefaction Curve
- Many low-abundance OTUs always missing in pseudo-replicates
- High-abundance OTUs always present in pseudo-replicates
Figure axes: OTUs observed, tags sampled

29 Rarefaction Curve
Same tag data, but: sub-samples of tags clustered into OTUs, rarefaction run on these new OTUs.
Figure axes: OTUs observed, tags sampled

30 Rarefaction Curve
Same tag data, but: sub-samples of tags clustered into OTUs, rarefaction run on these new OTUs.
This should not happen! The solution is not to sub-sample to some common tag count to avoid the problem!
Figure axes: OTUs observed, tags sampled

31 Rarefaction Curve
This sub-sampling phenomenon indicated something was not happening the way it was supposed to happen. There was a problem with the procedure. An improved clustering method generates OTUs that do not display the phenomenon.
Understand what the methods do so that you can recognize when there is a problem!
Figure axes: OTUs observed, tags sampled

32 Tag Sequencing Metagenomics
7. Analyze your groups of sequences
- Assumption: each read represents an individual
- Assumption: steps 1-6 are effectively unbiased, or equally biased across samples
- Quandary: what to do about low-abundance groups (singletons)?
- Quandary: how to compare samples of unequal size?
Within a sample or combined samples
- Richness, number of observed and estimated groups
- Evenness, relative abundance of the groups
Between samples or groups of samples
- Beta diversity
- Relating diversity to metadata

33 International Census of Marine Life million bacterial sequence tags from 549 surveys

34 Species Abundance of 3% OTUs from the ICoMM Survey
- Bacteria (otus.icomm.sabund): % singletons!
- Global survey of crustaceans across 93 coral reef sites... 38% singletons!
Figure axes: number of OTUs, tags/OTU

35 Figure: table of OTUs (rows) against total, dataset 1, dataset 2, dataset 3, dataset 4, dataset 5 (columns)
- Singleton in dataset 1 but not in total
- Singleton in dataset 3 not present in any other dataset: unique singleton

36 Richness Distributions of 100 Most Abundant OTUs
Figure: OTUs, Bacteria (otusizes.icomm)

37 The most abundant OTUs are rare somewhere
- If an OTU appears with abundance greater than 100 in one dataset, there is a ~90% chance it appears as a singleton in another
Most singletons in a dataset are found elsewhere
- 79% of the singleton OTUs in a dataset are not unique

38 What is a singleton?
~1 x 10^9 bacterial cells/liter epipelagic seawater
Based on average ICoMM dataset abundances:
- 1-3 OTUs represent >1% each, or more than 100,000,000 cells/l
- ~10 OTUs represent 0.1% 1%, or 10, ,000,000 cells/l
- ~100 OTUs represent %, or 1,000 10,000,000 cells/l
- ~1,000 OTUs represent less than 0.001%, that's still ~100,000 cells/l

39 Our Sampling is Very Incomplete, even with 20,000 Tags / Liter
Figure: cells/liter against rank abundance, with the 10^4 tags/dataset level marked

40 What is a Singleton?
If sampling 20,000 tags from 1 L at 10^9 cells/l, a singleton likely represents an OTU with an abundance of 50, ,000 cells/l
P_p(n \mid N) = \frac{N!}{n!\,(N-n)!}\, p^{n} (1-p)^{N-n}
with p = cells/ml, n = 1, N = 20,000
Figure axis: Cells/OTU/Liter
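The binomial expression on this slide can be evaluated directly. The sketch below assumes that p is an OTU's share of all cells (its cells per liter divided by the 10^9 total stated on the slide) and that each of the N = 20,000 tags is an independent draw; the function name and the abundances tried are illustrative.

```python
import math

def prob_observed(n, cells_per_liter, total_cells=1e9, tags=20_000):
    """Binomial probability of seeing an OTU exactly n times among `tags` tags."""
    p = cells_per_liter / total_cells  # assumed: the OTU's fraction of the community
    return math.comb(tags, n) * p ** n * (1 - p) ** (tags - n)

# How likely is a singleton (n = 1) or a doubleton (n = 2) at different true abundances?
for cells in (5_000, 50_000, 100_000, 500_000):
    p1 = prob_observed(1, cells)
    p2 = prob_observed(2, cells)
    print(f"{cells:>9,} cells/l  P(singleton)={p1:.3f}  P(doubleton)={p2:.3f}")
```

Under these assumptions the singleton probability peaks near 50,000 cells/l, where one tag is expected, and an OTU at about 100,000 cells/l is roughly as likely to show up once as twice, which is why a singleton and a doubleton cannot be confidently ranked by abundance.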

41 Is there a real difference in abundance between an OTU observed as a singleton and one observed as a doubleton?
Figure: singleton and doubleton curves; axis: Cells/OTU/Liter

42 Is there a real difference in abundance between an OTU observed as a singleton and one observed as a doubleton?
Figure axis: Cells/OTU/Liter

43 What's Represented in 20,000 tags?
(assuming sampling 20,000 tags from 1 liter at 10^9 cells/liter)
Figure labels: singletons, doubletons, Missing!; axis: Cells/OTU/ml

44
- Understand the methods you are using; recognize when the answers don't make sense
- Appreciate the biology of your system; recognize when your results are in conflict, and be suspicious of your results when this happens
- Document everything you do

Sequencing Errors, Diversity Estimates, and the Rare Biosphere

Sequencing Errors, Diversity Estimates, and the Rare Biosphere Sequencing Errors, Diversity Estimates, and the Rare Biosphere or Living in the shadow of Errares Susan Huse Marine Biological Laboratory June 13, 2012 Consistent Community Profile across samples and environments

More information

CBC Data Therapy. Metagenomics Discussion

CBC Data Therapy. Metagenomics Discussion CBC Data Therapy Metagenomics Discussion General Workflow Microbial sample Generate Metaomic data Process data (QC, etc.) Analysis Marker Genes Extract DNA Amplify with targeted primers Filter errors,

More information

Experimental Design Microbial Sequencing

Experimental Design Microbial Sequencing Experimental Design Microbial Sequencing Matthew L. Settles Genome Center Bioinformatics Core University of California, Davis settles@ucdavis.edu; bioinformatics.core@ucdavis.edu General rules for preparing

More information

Introduction to Bioinformatics analysis of Metabarcoding data

Introduction to Bioinformatics analysis of Metabarcoding data Introduction to Bioinformatics analysis of Metabarcoding data Theoretical part Alvaro Sebastián Yagüe Experimental design Sampling Sample processing Sequencing Sequence processing Experimental design Sampling

More information

Introduction to Microbial Sequencing

Introduction to Microbial Sequencing Introduction to Microbial Sequencing Matthew L. Settles Genome Center Bioinformatics Core University of California, Davis settles@ucdavis.edu; bioinformatics.core@ucdavis.edu General rules for preparing

More information

Next-generation sequencing and quality control: An introduction 2016

Next-generation sequencing and quality control: An introduction 2016 Next-generation sequencing and quality control: An introduction 2016 s.schmeier@massey.ac.nz http://sschmeier.com/bioinf-workshop/ Overview Typical workflow of a genomics experiment Genome versus transcriptome

More information

What is metagenomics?

What is metagenomics? Metagenomics What is metagenomics? Term first used in 1998 by Jo Handelsman "the application of modern genomics techniques to the study of communities of microbial organisms directly in their natural environments,

More information

Metagenomics Computational Genomics

Metagenomics Computational Genomics Metagenomics 02-710 Computational Genomics Metagenomics Investigation of the microbes that inhabit oceans, soils, and the human body, etc. with sequencing technologies Cooperative interactions between

More information

TECHNIQUES FOR STUDYING METAGENOME DATASETS METAGENOMES TO SYSTEMS.

TECHNIQUES FOR STUDYING METAGENOME DATASETS METAGENOMES TO SYSTEMS. TECHNIQUES FOR STUDYING METAGENOME DATASETS METAGENOMES TO SYSTEMS. Ian Jeffery I.Jeffery@ucc.ie What is metagenomics Metagenomics is the study of genetic material recovered directly from environmental

More information

Chapter 7. Motif finding (week 11) Chapter 8. Sequence binning (week 11)

Chapter 7. Motif finding (week 11) Chapter 8. Sequence binning (week 11) Course organization Introduction ( Week 1) Part I: Algorithms for Sequence Analysis (Week 1-11) Chapter 1-3, Models and theories» Probability theory and Statistics (Week 2)» Algorithm complexity analysis

More information

Microbiomes and metabolomes

Microbiomes and metabolomes Microbiomes and metabolomes Michael Inouye Baker Heart and Diabetes Institute Univ of Melbourne / Monash Univ Summer Institute in Statistical Genetics 2017 Integrative Genomics Module Seattle @minouye271

More information

CBC Data Therapy. Metatranscriptomics Discussion

CBC Data Therapy. Metatranscriptomics Discussion CBC Data Therapy Metatranscriptomics Discussion Metatranscriptomics Extract RNA, subtract rrna Sequence cdna QC Gene expression, function Institute for Systems Genomics: Computational Biology Core bioinformatics.uconn.edu

More information

Quality Filtering of Illumina Sequences. Susan Huse Brown University August 6, 2015

Quality Filtering of Illumina Sequences. Susan Huse Brown University August 6, 2015 Quality Filtering of Illumina Sequences Susan Huse Brown University August 6, 2015 Illumina FASTQ Files File naming: NA10831_ATCACG_L002_R1_001.fastq.gz FA1_S1_L001_R1_001.fastq.gz Sample_Barcode/Index_Lane_Read#_Set#.fastq.gz

More information

Introduction to taxonomic analysis of metagenomic amplicon and shotgun data with QIIME. Peter Sterk EBI Metagenomics Course 2014

Introduction to taxonomic analysis of metagenomic amplicon and shotgun data with QIIME. Peter Sterk EBI Metagenomics Course 2014 Introduction to taxonomic analysis of metagenomic amplicon and shotgun data with QIIME Peter Sterk EBI Metagenomics Course 2014 1 Taxonomic analysis using next-generation sequencing Objective we want to

More information

How much sequencing do I need? Emily Crisovan Genomics Core September 26, 2018

How much sequencing do I need? Emily Crisovan Genomics Core September 26, 2018 How much sequencing do I need? Emily Crisovan Genomics Core September 26, 2018 How much sequencing? Three questions: 1. How much sequence is required for good experimental design? 2. What type of sequencing

More information

Day 3. Examine gels from PCR. Learn about more molecular methods in microbial ecology

Day 3. Examine gels from PCR. Learn about more molecular methods in microbial ecology Day 3 Examine gels from PCR Learn about more molecular methods in microbial ecology Genes We Targeted 1: dsrab 1800bp 2: mcra 750bp 3: Bacteria 1450bp 4: Archaea 950bp 5: Archaea + 950bp 6: Negative control

More information

Next- gen sequencing. STAMPS 2015 Hilary G. Morrison Joe Vineis, Nora Downey, Be>e Hecox- Lea, Kim Finnegan

Next- gen sequencing. STAMPS 2015 Hilary G. Morrison Joe Vineis, Nora Downey, Be>e Hecox- Lea, Kim Finnegan Next- gen sequencing STAMPS 2015 Hilary G. Morrison Joe Vineis, Nora Downey, Be>e Hecox- Lea, Kim Finnegan QuesIons What is the difference between standard and next- gen sequencing? How is next- gen sequencing

More information

dbcamplicons pipeline Amplicons

dbcamplicons pipeline Amplicons dbcamplicons pipeline Amplicons Matthew L. Settles Genome Center Bioinformatics Core University of California, Davis settles@ucdavis.edu; bioinformatics.core@ucdavis.edu Microbial community analysis Goal:

More information

Next generation sequencing techniques" Toma Tebaldi Centre for Integrative Biology University of Trento

Next generation sequencing techniques Toma Tebaldi Centre for Integrative Biology University of Trento Next generation sequencing techniques" Toma Tebaldi Centre for Integrative Biology University of Trento Mattarello September 28, 2009 Sequencing Fundamental task in modern biology read the information

More information

Read Quality Assessment & Improvement. UCD Genome Center Bioinformatics Core Tuesday 14 June 2016

Read Quality Assessment & Improvement. UCD Genome Center Bioinformatics Core Tuesday 14 June 2016 Read Quality Assessment & Improvement UCD Genome Center Bioinformatics Core Tuesday 14 June 2016 QA&I should be interactive Error modes Each technology has unique error modes, depending on the physico-chemical

More information

Nature Biotechnology: doi: /nbt Supplementary Figure 1. MBQC base beta diversity, major protocol variables, and taxonomic profiles.

Nature Biotechnology: doi: /nbt Supplementary Figure 1. MBQC base beta diversity, major protocol variables, and taxonomic profiles. Supplementary Figure 1 MBQC base beta diversity, major protocol variables, and taxonomic profiles. A) Multidimensional scaling of MBQC sample Bray-Curtis dissimilarities (see Fig. 1). Labels indicate centroids

More information

NEXT GENERATION SEQUENCING. Farhat Habib

NEXT GENERATION SEQUENCING. Farhat Habib NEXT GENERATION SEQUENCING HISTORY HISTORY Sanger Dominant for last ~30 years 1000bp longest read Based on primers so not good for repetitive or SNPs sites HISTORY Sanger Dominant for last ~30 years 1000bp

More information

Next-generation sequencing Technology Overview

Next-generation sequencing Technology Overview Next-generation sequencing Technology Overview UQ Winter School 2018 Christopher Noune, PhD AGRF Melbourne christopher.noune@agrf.org.au What is NGS? Ion Torrent PGM (Thermo-Fisher) MiSeq (Illumina) High-Throughput

More information

Applications of Next Generation Sequencing in Metagenomics Studies

Applications of Next Generation Sequencing in Metagenomics Studies Applications of Next Generation Sequencing in Metagenomics Studies Francesca Rizzo, PhD Genomix4life Laboratory of Molecular Medicine and Genomics Department of Medicine and Surgery University of Salerno

More information

Carl Woese. Used 16S rrna to develop a method to Identify any bacterium, and discovered a novel domain of life

Carl Woese. Used 16S rrna to develop a method to Identify any bacterium, and discovered a novel domain of life METAGENOMICS Carl Woese Used 16S rrna to develop a method to Identify any bacterium, and discovered a novel domain of life His amazing discovery, coupled with his solitary behaviour, made many contemporary

More information

Matthew Tinning Australian Genome Research Facility. July 2012

Matthew Tinning Australian Genome Research Facility. July 2012 Next-Generation Sequencing: an overview of technologies and applications Matthew Tinning Australian Genome Research Facility July 2012 History of Sequencing Where have we been? 1869 Discovery of DNA 1909

More information

Carl Woese. Used 16S rrna to developed a method to Identify any bacterium, and discovered a novel domain of life

Carl Woese. Used 16S rrna to developed a method to Identify any bacterium, and discovered a novel domain of life METAGENOMICS Carl Woese Used 16S rrna to developed a method to Identify any bacterium, and discovered a novel domain of life His amazing discovery, coupled with his solitary behaviour, made many contemporary

More information

dbcamplicons pipeline Amplicons

dbcamplicons pipeline Amplicons dbcamplicons pipeline Amplicons Matthew L. Settles Genome Center Bioinformatics Core University of California, Davis settles@ucdavis.edu; bioinformatics.core@ucdavis.edu Microbial community analysis Goal:

More information

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio

More information

How much sequencing do I need? Emily Crisovan Genomics Core

How much sequencing do I need? Emily Crisovan Genomics Core How much sequencing do I need? Emily Crisovan Genomics Core How much sequencing? Three questions: 1. How much sequence is required for good experimental design? 2. What type of sequencing run is best?

More information

Parts of a standard FastQC report

Parts of a standard FastQC report FastQC FastQC, written by Simon Andrews of Babraham Bioinformatics, is a very popular tool used to provide an overview of basic quality control metrics for raw next generation sequencing data. There are

More information

Microbiomics I August 24th, Introduction. Robert Kraaij, PhD Erasmus MC, Internal Medicine

Microbiomics I August 24th, Introduction. Robert Kraaij, PhD Erasmus MC, Internal Medicine Microbiomics I August 24th, 2017 Introduction Robert Kraaij, PhD Erasmus MC, Internal Medicine r.kraaij@erasmusmc.nl Welcome to Microbiomics I Infection & Immunity MSc students Only first day no practicals

More information

Aaron Liston, Oregon State University Botany 2012 Intro to Next Generation Sequencing Workshop

Aaron Liston, Oregon State University Botany 2012 Intro to Next Generation Sequencing Workshop Output (bp) Aaron Liston, Oregon State University Growth in Next-Gen Sequencing Capacity 3.5E+11 2002 2004 2006 2008 2010 3.0E+11 2.5E+11 2.0E+11 1.5E+11 1.0E+11 Adapted from Mardis, 2011, Nature 5.0E+10

More information

Genome Sequencing. I: Methods. MMG 835, SPRING 2016 Eukaryotic Molecular Genetics. George I. Mias

Genome Sequencing. I: Methods. MMG 835, SPRING 2016 Eukaryotic Molecular Genetics. George I. Mias Genome Sequencing I: Methods MMG 835, SPRING 2016 Eukaryotic Molecular Genetics George I. Mias Department of Biochemistry and Molecular Biology gmias@msu.edu Sequencing Methods Cost of Sequencing Wetterstrand

More information

Using New ThiNGS on Small Things. Shane Byrne

Using New ThiNGS on Small Things. Shane Byrne Using New ThiNGS on Small Things Shane Byrne Next Generation Sequencing New Things Small Things NGS Next Generation Sequencing = 2 nd generation of sequencing 454 GS FLX, SOLiD, GAIIx, HiSeq, MiSeq, Ion

More information

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio

More information

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio

More information

Sequencing Theory. Brett E. Pickett, Ph.D. J. Craig Venter Institute

Sequencing Theory. Brett E. Pickett, Ph.D. J. Craig Venter Institute Sequencing Theory Brett E. Pickett, Ph.D. J. Craig Venter Institute Applications of Genomics and Bioinformatics to Infectious Diseases GABRIEL Network Agenda Sequencing Instruments Sanger Illumina Ion

More information

Introduction to the MiSeq

Introduction to the MiSeq Introduction to the MiSeq 2011 Illumina, Inc. All rights reserved. Illumina, illuminadx, BeadArray, BeadXpress, cbot, CSPro, DASL, Eco, Genetic Energy, GAIIx, Genome Analyzer, GenomeStudio, GoldenGate,

More information

Bioinformatics for Microbial Biology

Bioinformatics for Microbial Biology Bioinformatics for Microbial Biology Chaochun Wei ( 韦朝春 ) ccwei@sjtu.edu.cn http://cbb.sjtu.edu.cn/~ccwei Fall 2013 1 Outline Part I: Visualization tools for microbial genomes Tools: Gbrowser Part II:

More information

Molecular methods to characterize the microbiota in the mouse tissues

Molecular methods to characterize the microbiota in the mouse tissues Molecular methods to characterize the microbiota in the mouse tissues Olivier Bouchez, GeT-PlaGe, INRA Toulouse @GeT_Genotoul Who are we? Genomic and transcriptomic core facility spreads on 5 sites GeT

More information

Contact us for more information and a quotation

Contact us for more information and a quotation GenePool Information Sheet #1 Installed Sequencing Technologies in the GenePool The GenePool offers sequencing service on three platforms: Sanger (dideoxy) sequencing on ABI 3730 instruments Illumina SOLEXA

More information

CISC 889 Bioinformatics (Spring 2004) Lecture 3

CISC 889 Bioinformatics (Spring 2004) Lecture 3 CISC 889 Bioinformatics (Spring 004) Lecture Genome Sequencing Li Liao Computer and Information Sciences University of Delaware Administrative Have you visited The NCBI website? Have you read Hunter s

More information

Lecture 01: Overview of Metagenomics

Lecture 01: Overview of Metagenomics Lecture 01: Overview of Metagenomics 1 Culture Independent Techniques: Metagenomics Universal Gene census Shotgun Metagenome Sequencing Transcriptomics (shotgun mrna) Proteomics (protein fragments) Metabolomics

More information

De Novo Assembly of High-throughput Short Read Sequences

De Novo Assembly of High-throughput Short Read Sequences De Novo Assembly of High-throughput Short Read Sequences Chuming Chen Center for Bioinformatics and Computational Biology (CBCB) University of Delaware NECC Third Skate Genome Annotation Workshop May 23,

More information

Next Generation Sequencing Lecture Saarbrücken, 19. March Sequencing Platforms

Next Generation Sequencing Lecture Saarbrücken, 19. March Sequencing Platforms Next Generation Sequencing Lecture Saarbrücken, 19. March 2012 Sequencing Platforms Contents Introduction Sequencing Workflow Platforms Roche 454 ABI SOLiD Illumina Genome Anlayzer / HiSeq Problems Quality

More information

Day 3. Examine gels from PCR. Learn about more molecular methods in microbial ecology

Day 3. Examine gels from PCR. Learn about more molecular methods in microbial ecology Day 3 Examine gels from PCR Learn about more molecular methods in microbial ecology 1: dsrab 1800bp 2: mcra 750bp 3: Bacteria 1450bp 4: Archaea 950bp 5: Archaea + 950bp 6: Negative control Genes We Targeted

More information

Infectious Disease Omics

Infectious Disease Omics Infectious Disease Omics Metagenomics Ernest Diez Benavente LSHTM ernest.diezbenavente@lshtm.ac.uk Course outline What is metagenomics? In situ, culture-free genomic characterization of the taxonomic and

More information

Sequencing techniques

Sequencing techniques Sequencing techniques Workshop on Whole Genome Sequencing and Analysis, 2-4 Oct. 2017 Learning objective: After this lecture, you should be able to account for different techniques for whole genome sequencing

More information

Introduction to OTU Clustering. Susan Huse August 4, 2016

Introduction to OTU Clustering. Susan Huse August 4, 2016 Introduction to OTU Clustering Susan Huse August 4, 2016 What is an OTU? Operational Taxonomic Units a.k.a. phylotypes a.k.a. clusters aggregations of reads based only on sequence similarity, independent

More information

Genomic resources. for non-model systems

Genomic resources. for non-model systems Genomic resources for non-model systems 1 Genomic resources Whole genome sequencing reference genome sequence comparisons across species identify signatures of natural selection population-level resequencing

More information

Third Generation Sequencing

Third Generation Sequencing Third Generation Sequencing By Mohammad Hasan Samiee Aref Medical Genetics Laboratory of Dr. Zeinali History of DNA sequencing 1953 : Discovery of DNA structure by Watson and Crick 1973 : First sequence

More information

Genomic Technologies. Michael Schatz. Feb 1, 2018 Lecture 2: Applied Comparative Genomics

Genomic Technologies. Michael Schatz. Feb 1, 2018 Lecture 2: Applied Comparative Genomics Genomic Technologies Michael Schatz Feb 1, 2018 Lecture 2: Applied Comparative Genomics Welcome! The primary goal of the course is for students to be grounded in theory and leave the course empowered to

More information

Quality Control of Next Generation Sequence Data

Quality Control of Next Generation Sequence Data Quality Control of Next Generation Sequence Data January 17, 2018 Kane Tse, Assistant Bioinformatics Coordinator Canada s Michael Smith Genome Sciences Centre BC Cancer Agency Canada s Michael Smith Genome

More information

mothur Workshop for Amplicon Analysis Michigan State University, 2013

mothur Workshop for Amplicon Analysis Michigan State University, 2013 mothur Workshop for Amplicon Analysis Michigan State University, 2013 Tracy Teal MMG / ICER tkteal@msu.edu Kevin Theis Zoology / BEACON theiskev@msu.edu mothur Mission to develop a single piece of open-source,

More information

Experimental Design. Dr. Matthew L. Settles. Genome Center University of California, Davis

Experimental Design. Dr. Matthew L. Settles. Genome Center University of California, Davis Experimental Design Dr. Matthew L. Settles Genome Center University of California, Davis settles@ucdavis.edu What is Differential Expression Differential expression analysis means taking normalized sequencing

More information

Joint RuminOmics/Rumen Microbial Genomics Network Workshop

Joint RuminOmics/Rumen Microbial Genomics Network Workshop Joint RuminOmics/Rumen Microbial Genomics Network Workshop Microbiome analysis - Amplicon sequencing Dr. Sinéad Waters Animal and Bioscience Research Department, Teagasc Grange, Ireland Prof. Leluo Guan

More information

NGS part 2: applications. Tobias Österlund

NGS part 2: applications. Tobias Österlund NGS part 2: applications Tobias Österlund tobiaso@chalmers.se NGS part of the course Week 4 Friday 13/2 15.15-17.00 NGS lecture 1: Introduction to NGS, alignment, assembly Week 6 Thursday 26/2 08.00-09.45

More information

Data Basics. Josef K Vogt Slides by: Simon Rasmussen Next Generation Sequencing Analysis

2017 Generalized NGS analysis Sample prep & Sequencing Data size Main data reductive steps SNPs, genes, regions Application Assembly: Compare Raw Pre-

Microbiome: Metagenomics 4/4/2018

Metagenomics is an extension of many things you have already learned! Genomics used to be computationally difficult, and now that's metagenomics! Still developing tools/algorithms

Human genome sequence

NGS: the basics Human genome sequence June 26th 2000: official announcement of the completion of the draft of the human genome sequence (truly finished in 2004) Francis Collins Craig Venter HGP: 3 billion

Introductory Next Gen Workshop

http://www.illumina.ucr.edu/ http://www.genomics.ucr.edu/ Workshop Objectives Workshop aimed at those who are new to Illumina sequencing and will provide: - a basic overview

Bio(tech) Interlude. 3 Nobel Prizes: PCR: Kary Mullis, 1993 Electrophoresis: A.W.K. Tiselius, 1948 DNA Sequencing: Frederick Sanger, 1980

PCR (thermal-cycling diagram; step labels: 1: 25°C, 2: 95°C, 3: 60°C, 6: 72°C)

Welcome to the NGS webinar series

Webinar 1 NGS: Introduction to technology, and applications NGS Technology Webinar 2 Targeted NGS for Cancer Research NGS in cancer Webinar 3 NGS: Data analysis for genetic

GENES & GENOME DATABASES

BME 110/BIOL 181 Computational Biology Tools Prof. Todd Lowe April 5, 2012 ADMIN Discuss Fun Quiz Readings: Dummies Chapters 1, 2 (pp. 29-56), Ch 3; NYTimes piece on Jim Kent Assigned

COMPARING MICROBIAL COMMUNITY RESULTS FROM DIFFERENT SEQUENCING TECHNOLOGIES

Tyler Bradley * Jacob R. Price * Christopher M. Sales * * Department of Civil, Architectural, and Environmental Engineering,

Quality control for Sequencing Experiments

v2018-04 Simon Andrews simon.andrews@babraham.ac.uk Support service for bioinformatics Academic Babraham Institute Commercial Consultancy Support BI Sequencing

Analysing genomes and transcriptomes using Illumina sequencing

Dr. Heinz Himmelbauer Centre for Genomic Regulation (CRG) Ultrasequencing Unit Barcelona The Sequencing Revolution High-Throughput Sequencing 2000

Introduction to Bioinformatics and Gene Expression Technologies

Utah State University Fall 2017 Statistical Bioinformatics (Biomedical Big Data) Notes 1 Vocabulary Gene: hereditary DNA sequence at a

Introduction to Bioinformatics and Gene Expression Technologies

Vocabulary Utah State University Fall 2017 Statistical Bioinformatics (Biomedical Big Data) Notes 1 Gene: Genetics: Genome: Genomics: hereditary

Massive Analysis of cDNA Ends for simultaneous Genotyping and Transcription Profiling in High Throughput

Next Generation (Sequencing) Tools for Nucleotide-Based Information Björn Rotter, PhD GenXPro GmbH,

Advanced Technology in Phytoplasma Research

Sequencing and Phylogenetics Wednesday July 8 Pauline Wang pauline.wang@utoronto.ca Lethal Yellowing Disease Phytoplasma Healthy palm Lethal yellowing of palm

NUCLEIC ACIDS. DNA (Deoxyribonucleic Acid) and RNA (Ribonucleic Acid): information storage molecules made up of nucleotides.

Base (abbreviation): Adenine (A), Guanine (G), Cytosine (C), Uracil (U), Thymine (T); table rows: DNA, RNA

Next Gen Sequencing. Expansion of sequencing technology. Contents

Contents 1 Expansion of sequencing technology 2 The Next Generation of Sequencing: High-Throughput Technologies 3 High Throughput Sequencing Applied to Genome Sequencing (TEDed CC BY-NC-ND

HMP Data Set Documentation

Introduction This document provides detail about files available via the DACC website. The goal of the HMP consortium is to make the metagenomics sequence data generated by the

Next Generation Sequencing. Tobias Österlund

tobiaso@chalmers.se NGS part of the course Week 4 Friday 13/2 15.15-17.00 NGS lecture 1: Introduction to NGS, alignment, assembly Week 6 Thursday 26/2 08.00-09.45

Overview of Next Generation Sequencing technologies. Céline Keime

keime@igbmc.fr Next Generation Sequencing: Second generation sequencing; General principle; Sequencing by synthesis - Illumina; Sequencing

Next-Generation Sequencing Technologies

Nicholas E. Navin, Ph.D. MD Anderson Cancer Center Dept. Genetics Dept. Bioinformatics Introduction to Bioinformatics GS011062

NextGen Sequencing and Target Enrichment

Laurent FARINELLI 7 September 2010 Agilent 3rd Analytic Forum Basel, Switzerland Outline The illumina HiSEQ 2000 system Applications Target enrichment Outlook 7

DNA concentration and purity were initially measured by NanoDrop 2000 and verified on Qubit 2.0 Fluorometer.

DNA Preparation and QC Extraction DNA was extracted from whole blood or flash-frozen post-mortem tissue using a DNA mini kit (QIAmp #51104 and QIAmp #51404, respectively) following the manufacturer's recommendations.

High Throughput Sequencing Technologies. J Fass UCD Genome Center Bioinformatics Core Tuesday December 16, 2014

Sequencing Explosion www.genome.gov/sequencingcosts http://t.co/ka5cvghdqo Sequencing Explosion

Next Generation Sequences & Chloroplast Assembly. 8 June, 2012 Jongsun Park

Table of Contents 1 History of Sequencing Technologies 2 Genome Assembly Processes With NGS Sequences 3 How to Assembly Chloroplast

Day 3. Examine gels from PCR. Learn about more molecular methods in microbial ecology. Tour the Bay Paul Center Keck Sequencing Facility

1: dsrAB 1800 bp 2: mcrA 750 bp 3: Bacteria 1450 bp 4: Archaea 950 bp 5:

High Throughput Sequencing Technologies. J Fass UCD Genome Center Bioinformatics Core Monday September 15, 2014

Sequencing Explosion www.genome.gov/sequencingcosts http://t.co/ka5cvghdqo Sequencing Explosion

Robert Edgar. Independent scientist

robert@drive5.com www.drive5.com Reads FASTQ format Millions of reads Many Gb USEARCH commands "UPARSE pipeline" OTU sequences FASTA format >Otu1 GATTAGCTCATTCGTA >Otu2

Introduction to NGS. Josef K Vogt Slides by: Simon Rasmussen Next Generation Sequencing Analysis

2017 Life science data deluge Massive unstructured data from several areas DNA, patient journals, proteomics, imaging,... Impacts Industry, Environment,

RNA-Seq data analysis course September 7-9, 2015

Peter-Bram 't Hoen (LUMC) Jan Oosting (LUMC) Celia van Gelder, Jacintha Valk (BioSB) Anita Remmelzwaal (LUMC) Expression profiling DNA mRNA protein Comprehensive

NGS technologies: a user's guide. Karim Gharbi & Mark Blaxter

genepool-manager@ed.ac.uk Natural history of sequencing Brief history of sequencing 100s bp throughput 100 Gb 1977 1986 1995 1999 2005 2007

Introduction to metagenome assembly. Bas E. Dutilh Metagenomic Methods for Microbial Ecologists, NIOO September 18th 2014

Sequencing specs* Columns: Method, Read length, Accuracy, Million reads, Time, Cost per M. 454

Outline General NGS background and terms 11/14/2016 CONFLICT OF INTEREST. HLA region targeted enrichment. NGS library preparation methodologies

Eric T. Weimer, PhD, D(ABMLI) Assistant Professor, Pathology & Laboratory Medicine, UNC School of Medicine Director, Molecular Immunology Associate Director, Clinical Flow Cytometry, HLA, and Immunology

SUPPLEMENTARY INFORMATION

doi:10.1038/nature09944 Supplementary Figure 1. Establishing DNA sequence similarity thresholds for phylum and genus levels Sequence similarity distributions of pairwise alignments of 40 universal single

Microbially Mediated Plant Salt Tolerance and Microbiome based Solutions for Saline Agriculture

Contents Introduction Abiotic Tolerance Approaches Reasons for failure Roots, microorganisms and soil-interaction

Bioinformatics for Genomics

It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material. When I was young my Father

Galaxy Workshop

1-8-13 Intros: Tom Bair thomas-bair@uiowa.edu Ann Black-Ziegelbein annblack@eng.uiowa.edu Srinivas Maddhi srinivas-maddhi@uiowa.edu What is galaxy good for Access to resources Documentation

GSA

GSA Submission *Alias Submission name of the GSA. This field is used when the

Next generation sequencing in diagnostic laboratories: opportunities and challenges

Vitali Sintchenko Marie Bashir Institute for Emerging Infectious Diseases & Biosecurity Declaration No conflict of interest

Supplementary Figure 1 Schematic view of phasing approach. A sequence-based schematic view of the serial compartmentalization approach.

First, barcoded primer sequences are attached to the bead surface

Understanding the science and technology of whole genome sequencing

Dag Undlien Department of Medical Genetics Oslo University Hospital University of Oslo and The Norwegian Sequencing Centre d.e.undlien@medisin.uio.no

Genome Resequencing. Rearrangements. SNPs, Indels CNVs. De novo genome Sequencing. Metagenomics. Exome Sequencing. RNA-seq Gene Expression

Genome Resequencing De novo genome Sequencing SNPs, Indels CNVs Rearrangements Metagenomics RNA-seq Gene Expression Splice Isoform Abundance High Throughput Short Read Sequencing: Illumina Exome Sequencing

Sequencing techniques and applications

I519 Introduction to Bioinformatics Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Contents Sequencing techniques Sanger sequencing Next generation