Introduction to NGS Analysis Tools
National Center for Emerging and Zoonotic Infectious Diseases
Introduction to NGS Analysis Tools
Heather Carleton, PhD, MPH
Team Lead, Enteric Diseases Bioinformatics, Enteric Diseases Laboratory Branch, DFWED, NCEZID, CDC
Next Generation Sequencing: From concept to reality at public health laboratories
June 6th, 2016

Objectives
- Provide a basic overview of terminology surrounding next generation sequencing data
- Discuss analysis terminology
- Highlight NGS analysis tools: freely available command-line tools, online/cloud-based tools, commercially available analysis tools
- Discuss advantages/disadvantages of the tools
Why do you need analysis tools? To translate WGS data
- Consolidation of multiple workflows in the laboratory: identification, serotyping, virulence profiling, antimicrobial resistance characterization, subtyping

Analysis Tools (diagram labels)
- Sequence QC
- Assembly (de novo); whole genome MLST analysis; functional analysis (ANI, serotype, antimicrobial resistance profile, annotation)
- Read mapping; reference-based assembly; hqSNP analysis
- kmer (raw read/assembly); SNP analysis; wgMLST analysis
What is an analysis/bioinformatics pipeline?
- Example: QC, de novo assembly, wgMLST, phylogenetic tree
- "Pipeline" refers to the series of tools used to go from raw sequence data to an answer

Types of analysis pipelines (by bioinformatics experience)
- Freely available command-line
- Online cloud-based / fee for service
- Commercial software
How to pick an analysis pipeline(s)
- Pick the tool that fits your users: if you do not have bioinformaticians in your lab, then using command-line tools will be a challenge
- Make sure the tool delivers the output you need: if you need a phylogenetic tree, then it needs to do read mapping, SNP detection, and phylogenetic inference
- The tool must provide quality checks of the raw sequence data and of the analysis steps so you can evaluate whether it succeeded

Analysis Tools (diagram labels): Sequence QC; assembly (de novo); whole genome MLST analysis; functional analysis (ANI, serotype, antimicrobial resistance profile, annotation)
Basic QC analysis
- Tools used to analyze the basic quality of a sequencing run, or of the reads generated per isolate in a sequencing run
- FastQC (also available in BaseSpace); Torrent Server; Geneious; Qiagen/CLC workbench; BioNumerics v7

Sequence QC: Q score
- Quality scores: the likelihood that the base call is correct
- Phred: the part of the FASTQ file generated by the sequencer that scores base call quality
- %Q30: the percentage of base calls that have a 1 in 1,000 chance or less of being incorrect (Q20: 1 incorrect in 100 base calls)
- Indicates whether a base call is trustworthy and can be used in an hqSNP analysis
- Figure label: 95% Q30
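To make the Q-score definitions concrete, here is a minimal Python sketch. It assumes the common Phred+33 ASCII encoding of FASTQ quality strings; the function names and the example quality string are illustrative and are not taken from any of the tools listed above.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to the probability the base call is wrong."""
    return 10 ** (-q / 10)


def percent_q30(quality_string: str, offset: int = 33, threshold: int = 30) -> float:
    """Percentage of base calls in one read with a quality score >= threshold.

    Assumes Phred+33 encoding (offset=33); this is an assumption about the
    input file, not something stated in the slides.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    if not scores:
        return 0.0
    passing = sum(1 for q in scores if q >= threshold)
    return 100.0 * passing / len(scores)


if __name__ == "__main__":
    print(phred_to_error_probability(30))  # Q30: 1 error in 1,000 calls
    print(phred_to_error_probability(20))  # Q20: 1 error in 100 calls
    # Hypothetical quality string: 'I' encodes Q40 and '#' encodes Q2.
    print(percent_q30("IIIIII##"))
```

In practice the same percentage would be accumulated over every read in a run rather than a single quality string.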
Sequence QC: Read trimming
- Assess quality over the entire read by looking at quality score by base position and % GC by base position
- Most NGS machines include read trimming in the machine workflow to remove indices and adaptors

Sequence Quality: Insert size
- Insert size refers to the length of the piece of DNA you are sequencing
- Generally you want the insert size to be larger than the total read length of the sequencing chemistry (e.g., for 2x250 / 500-cycle sequencing you want an insert size larger than 500 bp)
- Figure labels: bad insert size and good insert size, 2x150 sequencing
Sequence QC: Coverage
- NGS generates 100,000 or more reads per genome sequenced
- Any single location on the genome can be covered by zero to hundreds of sequence reads
- Figure panels: coverage at 40x; coverage at 5x

Sequence Analysis: De novo assembly
- Assemble raw sequence data from ~100k reads into contigs
- Assemblers use different algorithms and are built to work with a specific NGS machine
- Tools: SPAdes, Velvet, Newbler; BaseSpace/SPAdes plug-in; Torrent Server; Geneious; Qiagen/CLC workbench; BioNumerics v7
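As a rough illustration of what "40x" versus "5x" means, the sketch below estimates average coverage from read count, read length, and genome size, and counts the depth at one position from read placements. The read count, read length, genome size, and intervals are made-up numbers chosen for the example, not values from the slides.

```python
from typing import List, Tuple


def estimate_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average coverage = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def depth_at(position: int, read_intervals: List[Tuple[int, int]]) -> int:
    """Number of reads whose half-open [start, end) interval covers the position."""
    return sum(1 for start, end in read_intervals if start <= position < end)


if __name__ == "__main__":
    # Hypothetical run: 800,000 reads of 250 bp on a 5 Mbp genome.
    print(estimate_coverage(800_000, 250, 5_000_000))
    # Hypothetical read placements on the reference.
    reads = [(0, 15), (5, 20), (12, 30)]
    print(depth_at(10, reads), depth_at(40, reads))
```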
Sequence Analysis: De novo assembly
- Combine overlapping reads into a single contig

Sequence Analysis: De novo assembly quality
- Assembly metrics can indicate sequence quality
- Number of contigs the raw reads assemble into. Good: E. coli < 200, Salmonella < 100, Listeria < 30
- N50 statistic: calculated by summing the lengths of the largest contigs, in descending order, until you reach 50% of the total combined contig length; the N50 is the length of the last contig added. Good: > 200,000 bp
- Example: a 3 million bp genome (determined by the sum of contig lengths) whose largest contigs are 750,000 bp, 500,000 bp, and 350,000 bp. The 50% cutoff of 1.5 million bp is reached within the third contig, so the N50 is 350,000 bp
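The N50 calculation described above can be sketched in a few lines of Python. The three largest contigs match the slide example; the smaller contigs are hypothetical filler added only so that the lengths sum to 3 million bp.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Length of the contig at which the running total, taken from the
    largest contig downward, first reaches half of the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running total always reaches the full sum")


if __name__ == "__main__":
    contigs = [750_000, 500_000, 350_000,                 # from the slide example
               300_000, 300_000, 250_000, 200_000,        # hypothetical filler
               150_000, 100_000, 100_000]
    print(sum(contigs), n50(contigs))
```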
Sequence Analysis: Multilocus sequence typing
- A locus can be a gene or part of a gene; any change (single nucleotide polymorphism, insertion, deletion, small inversion) is a new allele number
- Loci can cover the whole genome of an isolate, the core (in common) genes of a species, or the housekeeping genes of a genus (traditional MLST)
- Figure labels: cgMLST; hqSNP

Sequence Analysis: MLST
- Compares the number of (character) differences between isolates
- Requires an already developed scheme for the analyzed organism
- Tools: NCBI Pathogen pipeline (in development); BIGSdb; Ridom SeqSphere; BioNumerics v7
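Comparing isolates by MLST comes down to counting loci whose allele numbers differ. The sketch below shows one simple way to do that; the locus names, allele numbers, and the choice to skip loci that are missing in either isolate are illustrative assumptions, not the behavior of any specific tool named above.

```python
from typing import Dict, Optional, Tuple

Profile = Dict[str, Optional[int]]  # locus name -> allele number (None if not called)


def allele_differences(profile_a: Profile, profile_b: Profile) -> Tuple[int, int]:
    """Return (number of differing loci, number of loci compared).

    Loci absent or uncalled in either profile are skipped (an assumption
    made for this sketch).
    """
    shared = profile_a.keys() & profile_b.keys()
    compared = [
        locus for locus in shared
        if profile_a[locus] is not None and profile_b[locus] is not None
    ]
    differing = sum(1 for locus in compared if profile_a[locus] != profile_b[locus])
    return differing, len(compared)


if __name__ == "__main__":
    # Hypothetical allele profiles for two isolates.
    isolate_1 = {"locus1": 4, "locus2": 12, "locus3": 7, "locus4": None}
    isolate_2 = {"locus1": 4, "locus2": 15, "locus3": 7, "locus4": 2}
    print(allele_differences(isolate_1, isolate_2))
```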
Sequence analysis: Functional annotation
- Predict isolate characteristics from WGS data (genus/species, serotype, antimicrobial resistance, virulence, etc.)
- NCBI Pathogen pipeline (antimicrobial resistance)
- Center for Genomic Epidemiology (CGE) (virulence, STEC/Salmonella serotype, antimicrobial resistance)
- BioNumerics v7 (genus/species (ANIm), virulence, STEC/Salmonella serotype, antimicrobial resistance)

Identifying genus and species from WGS data
- Databases (MLST, ribosomal MLST, 16S) can identify to the genus level and occasionally to the species level
- WGS methods analogous to the classic laboratory identification method, DNA-DNA hybridization, can calculate the Average Nucleotide Identity (ANI) between a query genome and a reference genome
- Figure: a query genome (GCATCCCCCGTA) compared with E. coli (ACTAGAGGGAAA) and S. enterica (GCATCCCCCGTT); ANI score 98% for S. enterica
Inferring serotype from WGS
- Because the genes that encode the O and H antigens and determine serotype are known, a database can be built that translates sequence to serotype
- Limitations: sometimes genes are not expressed (non-motile isolates); there may be modifications to the antigen protein that are not encoded in the genes that originally make the protein

Virulence factors from WGS data
- Virulence factors such as Shiga toxin or other enterotoxins, traditionally detected by serology, PCR, or real-time PCR, can be detected in WGS data using databases
- Publicly available resources such as the Center for Genomic Epidemiology's VirulenceFinder can be used to find virulence genes in E. coli, Enterococcus, and S. aureus
Predicting antimicrobial resistance from WGS: Acquired resistance
- Usually resistance genes (200 bp to 1,000 bp)
- Highly conserved even between different genera (> 98% identity)
- Usually located on mobile elements (plasmids, integrons, islands)
- Methods to detect: assembled sequence, resistance databases (ResFinder, ARG-ANNOT, FDA/NCBI AR database)

Acquired resistance: genes associated with a particular AR phenotype (phenotype: genotype)
- Ampicillin, amoxicillin/clavulanic acid, cefoxitin, ceftriaxone, ceftiofur: blaCMY-2
- Kanamycin: aph(3')-Ia
- Gentamicin: aac(3)-VIa
- Streptomycin: aadA2, strAB
- Chloramphenicol: floR
- Sulfisoxazole: sul1, sul2
- Trimethoprim/sulphamethoxazole: dfrA12, sul1, sul2
- Tetracycline: tetA
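A database-driven prediction of acquired resistance is essentially a lookup from detected genes to phenotypes. The sketch below encodes only the gene and phenotype pairs listed above. The rule format is an assumption made for illustration: each phenotype lists one or more gene groups, and a phenotype is predicted when at least one gene from every group is detected (so the trimethoprim/sulphamethoxazole rule requires dfrA12 together with a sul gene). Real resources such as ResFinder work from sequence similarity searches and are not modeled here.

```python
from typing import Dict, Iterable, List, Set

# Phenotype -> gene groups; one gene from EVERY group must be present.
# The grouping is an illustrative assumption layered on the slide's table.
RULES: Dict[str, List[Set[str]]] = {
    "ampicillin": [{"blaCMY-2"}],
    "amoxicillin/clavulanic acid": [{"blaCMY-2"}],
    "cefoxitin": [{"blaCMY-2"}],
    "ceftriaxone": [{"blaCMY-2"}],
    "ceftiofur": [{"blaCMY-2"}],
    "kanamycin": [{"aph(3')-Ia"}],
    "gentamicin": [{"aac(3)-VIa"}],
    "streptomycin": [{"aadA2", "strAB"}],
    "chloramphenicol": [{"floR"}],
    "sulfisoxazole": [{"sul1", "sul2"}],
    "trimethoprim/sulphamethoxazole": [{"dfrA12"}, {"sul1", "sul2"}],
    "tetracycline": [{"tetA"}],
}


def predict_resistance(detected_genes: Iterable[str]) -> List[str]:
    """Return the sorted phenotypes whose gene groups are all satisfied."""
    detected = set(detected_genes)
    return sorted(
        phenotype
        for phenotype, groups in RULES.items()
        if all(group & detected for group in groups)
    )


if __name__ == "__main__":
    # Hypothetical gene hits from an assembled isolate.
    print(predict_resistance(["tetA", "sul1"]))
    print(predict_resistance(["dfrA12", "sul2", "floR"]))
```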
Predicting antimicrobial resistance from WGS: Mutational resistance
- Usually SNPs, but can be insertions/deletions
- Usually chromosomal
- Genus or species specific
- Methods: no available databases; assembled sequence: in silico PCR; raw reads: SNP analysis

Analysis Tools (diagram labels): Sequence QC; read mapping; reference-based assembly; hqSNP analysis; functional analysis (ANI, serotype, antimicrobial resistance profile, annotation)
Sequence Analysis: Read mapping / hqSNP analysis
- Map raw sequence data to a known reference genome
- Pick a mapper based on sequencing chemistry and organism (diploid/haploid)
- Mapping is used for downstream analysis, including hqSNP
- Tools: samtools, bowtie2, smalt (some of these can be wrapped in Galaxy); BaseSpace (bacterial, viral, human, and cancer variant apps); Torrent Server; NCBI pathogen pipeline; BioNumerics v7, CLC Genomics Workbench, Geneious

Sequence Analysis: High quality single nucleotide polymorphisms (hqSNPs)
- What makes a SNP high quality (hq)? Apply a quality filter that excludes nucleotides in sequence reads from the comparison based on sequence coverage, quality, and location
- Figure: sequence reads pass through the quality filter, yielding quality-filtered sequence reads ready for analysis
What to call a SNP
- SNPs are called based on: quality, coverage, base frequency
- The differences between the reference and the compared genome are extracted and used to determine relatedness
- Figure: a reference and aligned reads; most sequences read ATGTTCCTC, with ATGTTACTC, ATGTTTCTC, and ATGTTGCTC also present, differing at the sixth base. Is it a SNP?

Where to call a SNP?
- Mask mobile elements: do not consider SNPs in these locations
- Or only call SNPs in genes
- Not all SNP pipelines are equal: where you call SNPs will affect the total SNP count
- SNPs relevant for phylogenetic analysis are vertically transmitted, not horizontally, so horizontally transferred genetic elements such as phages can be masked
- Figure labels: mobile elements; genes; raw reads
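The three calling criteria (quality, coverage, base frequency) and the masking of mobile elements can be sketched as a filter over a single pileup column. The thresholds below (minimum quality 20, minimum coverage 10, minimum frequency 0.75) are placeholder assumptions chosen for the example; each pipeline sets its own values, and none of the function names come from an existing tool.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def call_snp(ref_base: str,
             read_bases: Sequence[str],
             read_quals: Sequence[int],
             min_qual: int = 20,
             min_coverage: int = 10,
             min_frequency: float = 0.75) -> Optional[str]:
    """Return the alternate base if the column passes all filters, else None."""
    # Quality: drop individual base calls below the quality threshold.
    kept = [b for b, q in zip(read_bases, read_quals) if q >= min_qual]
    # Coverage: require enough trustworthy base calls at this position.
    if len(kept) < min_coverage:
        return None
    # Base frequency: the most common base must differ from the reference
    # and be supported by a large enough fraction of the kept calls.
    base, count = Counter(kept).most_common(1)[0]
    if base != ref_base and count / len(kept) >= min_frequency:
        return base
    return None


def in_masked_region(position: int, masked: List[Tuple[int, int]]) -> bool:
    """True if the position falls inside any inclusive (start, end) masked region."""
    return any(start <= position <= end for start, end in masked)


if __name__ == "__main__":
    # Hypothetical pileup column: reference base A, 11 reads say C, 1 says A.
    print(call_snp("A", "CCCCCCCCCCCA", [35] * 12))
    # Hypothetical phage region at reference positions 100 to 200.
    print(in_masked_region(150, [(100, 200)]))
```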
Where to call a SNP: pick the right reference
- Choice of reference genome affects the analysis: a more closely related reference is more likely to identify true SNP differences

How to interpret hqSNPs: phylogenetic trees
- Use the differences identified by hqSNP analysis to infer the relatedness, or phylogeny, of isolates
- Figure: a tree relating Isolates A, B, C, and D through example sequences (actgaatta, ggagaatta, ggataatta, ggattatta, ggagagtta, ggatccccc, actgccggt), with branch labels giving the number of genetic changes
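Before a tree is built, the hqSNP differences are usually summarized as pairwise counts between isolates. A minimal sketch of that step follows, assuming each isolate has already been reduced to an equal-length string of bases at the variable positions; the isolate names and sequences are invented for the example and are not the ones in the slide figure.

```python
from itertools import combinations
from typing import Dict, Tuple


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def pairwise_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """SNP distance for every pair of isolates, keyed by (name_1, name_2)."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


if __name__ == "__main__":
    # Hypothetical SNP alignment for four isolates.
    alignment = {
        "A": "ACGTACGT",
        "B": "ACGTACGA",
        "C": "ACGAACGA",
        "D": "TCGAATGA",
    }
    for pair, distance in pairwise_distances(alignment).items():
        print(pair, distance)
```

Isolates separated by few differences end up close together on the tree, and those with many differences end up far apart.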
NCBI Pathogen Detection Pipeline
- NCBI Submission Portal: BioProject, BioSamples, SRA, GenBank
- NCBI Pathogen Pipeline: QC, kmer analysis, genome assembly, genome annotation, genome placement, clustering, SNP analysis, tree construction, reports

Automated Bacterial Assembly (diagram labels)
- SRA reads (sample 1); trim reads (Ns, adaptor); find closest reference genome(s); reference distance tree
- De novo assembly panel: SOAPdenovo, MaSuRCA, SPAdes, GS-assembler (Newbler), Celera Assembler
- Argo (reference-assisted assembly)
- ArgoCA (combined assembly)
- Reads remapped to combined assembly
- Contig FASTA; read placements (BAM); quality profile
Results Available Now

NCBI Pathogen Detection SNP Pipeline: example 1 - stone fruit outbreak
CDC SNP extraction tool: Lyve-SET
- Developed for analysis of raw sequence data from foodborne pathogens
- Works with both Ion Torrent and Illumina data (need to use 2 different mappers)
- Can filter based on quality and clustered SNPs, and filter out phages automatically
- Workflow: clean raw reads (cg-pipeline); map reads to reference (SMALT); identify SNPs (VarScan); create phylogeny (RAxML)
- Outputs: SNP matrix, pairwise differences, phylogenetic tree

FDA SNP pipeline
- Developed for analysis of sequence data for foodborne pathogens
- Excellent documentation online: pipeline.readthedocs.io/en/latest/ Biostatistics/snp pipeline
- Workflow: map reads to reference (Bowtie2); identify SNPs (VarScan)
- Outputs: SNP matrix, pairwise differences, output for phylogenetic analysis
Analysis Tools (diagram labels): Sequence QC; kmer (raw read based/assembly); SNP analysis; wgMLST analysis

Sequence analysis: reference-free, raw read and assembly based approaches
- Analysis does not require a reference
- Can use kmer-based analyses to measure relatedness between isolates
- Can also be used to quickly match against a known allele/reference
- Tools: kSNP; MASH; NCBI pathogen pipeline (kmer tree); Center for Genomic Epidemiology; CLC Genomics Workbench; BioNumerics v7.5 wgMLST
Kmer-based analysis
- Computer algorithms use a sliding window to chop sequence reads into shorter lengths (k) of DNA: kmers
- Kmers are compared to identify differences
- Example: reads (15 bp) ACTGAACTGACTCAA (Isolate 1) and ACTGAACTGACTCAC (Isolate 2), k-mers of 10 bp
- Isolate 1 k-mers: ACTGAACTGA, CTGAACTGAC, TGAACTGACT, AACTGACTCA, ACTGACTCAA
- Isolate 2 k-mers: ACTGAACTGA, CTGAACTGAC, TGAACTGACT, AACTGACTCA, ACTGACTCAC
- The first four k-mers shown are identical between isolates; the last k-mer of each isolate is unique

kSNP-based analysis
- Computer algorithms use a sliding window to chop sequence reads into shorter lengths (k) of DNA; k is always an odd number
- Compare base pair differences at the central position of the kmer
- Example: raw reads (15 bp) ACTGAACTGACTCAA (Isolate 1) and ACTGCACTGACTCAA (Isolate 2), k-mers of 9 bp
- Isolate 1 k-mers: ACTGAACTG, CTGAACTGA, TGAACTGAC, AACTGACTC, ACTGACTCA
- Isolate 2 k-mers: ACTGCACTG, CTGCACTGA, TGCACTGAC, CACTGACTC, ACTGACTCA
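Both k-mer ideas above can be sketched in Python using the example reads from the slides. This is a toy illustration of the sliding window and of the central-position comparison, not the actual algorithm of kSNP or any other tool; the function names are invented, and the sketch enumerates every window position, so a 15 bp read yields 15 - k + 1 k-mers.

```python
from typing import Dict, List, Set, Tuple


def kmers(sequence: str, k: int) -> List[str]:
    """All substrings of length k, taken with a sliding window of step 1."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def compare_kmer_sets(seq_a: str, seq_b: str, k: int) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (shared k-mers, k-mers only in seq_a, k-mers only in seq_b)."""
    a, b = set(kmers(seq_a, k)), set(kmers(seq_b, k))
    return a & b, a - b, b - a


def central_base_differences(seq_a: str, seq_b: str, k: int) -> Dict[Tuple[str, str], Tuple[str, str]]:
    """Find k-mers whose flanks match but whose central base differs.

    Returns {(left_flank, right_flank): (base_in_a, base_in_b)}.
    k must be odd so that a single central position exists.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd")
    half = k // 2

    def by_flanks(sequence: str) -> Dict[Tuple[str, str], str]:
        # Index each k-mer by its flanking bases; the value is the central base.
        return {(km[:half], km[half + 1:]): km[half] for km in kmers(sequence, k)}

    a, b = by_flanks(seq_a), by_flanks(seq_b)
    return {f: (a[f], b[f]) for f in a.keys() & b.keys() if a[f] != b[f]}


if __name__ == "__main__":
    shared, only_1, only_2 = compare_kmer_sets("ACTGAACTGACTCAA", "ACTGAACTGACTCAC", 10)
    print(len(shared), only_1, only_2)
    print(central_base_differences("ACTGAACTGACTCAA", "ACTGCACTGACTCAA", 9))
```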
Kmer analysis: identifying organisms

End-to-End Analysis Tools (diagram labels)
- Sequence QC
- Assembly (de novo); whole genome MLST analysis; functional analysis (ANI, serotype, antimicrobial resistance profile, annotation)
- Read mapping; reference-based assembly; hqSNP analysis
- kmer (raw read/assembly); SNP analysis; wgMLST analysis
Tools that offer end-to-end solutions: BioNumerics v7.6
- Tools for QC, assembly, wgMLST, hqSNP, and functional prediction, each in single-button workflows
- Functions as a database, so the metadata needed to interpret the analysis is easily viewable

Tools that offer end-to-end solutions: CLC Genomics
- For bacteriology, virology, mycology, animals, and plants
- Has tools to handle haploid and diploid genomes
- Nice graphics and reporting features
- Can export workflows for others to use
Tools that offer end-to-end solutions: Illumina BaseSpace

Conclusions: Pick the tool that fits your need
- Think about whether you will be doing CLIA or CAP certified tests through the pipeline, and what kind of control and customization you need
- Make sure your laboratorians can use the tool and interpret the output
Questions?

Use of trade names is for identification only and does not imply endorsement by the Centers for Disease Control and Prevention or the U.S. Department of Health and Human Services. For more information, contact CDC (CDC-INFO). The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

Resources (program: what for; where to find it; cost; platform)
- BioNumerics 7.5: assembly, wgMLST, SNP analysis; cost: yes; platform: Windows
- CLC Bio Genomics Workbench: workflows, read metrics, assemblies, etc., SNP analyses; atics.com/products/clcgenomics-workbench/; cost: yes; platform: Windows/Linux
- Geneious: assemblies, trees, SNP analysis; cost: yes; platform: Windows
- MEGA6: phylogenies; megasoftware.net/; cost: no; platform: Windows
- Lasergene: assemblies, read metrics, analysis; cost: yes; platform: Windows
- NCBI Genome Workbench: viewing trees, analysis; ools/gbench/; cost: no; platform: Windows/Linux
- CFSAN SNP pipeline: assembly, read metrics, assembly metrics, read cleaning, etc.; sourceforge.net/projects/cgpipeline; cost: no; platform: Linux
- Snp Extraction Tool: read cleaning, creating phylogenies; github.com/lskatz/lyve-set; cost: no; platform: Linux
More informationStarting Bioinformatics from Zero as a Biologist
Starting Bioinformatics from Zero as a Biologist Presented by Jessica Chen, Andrea (Ray) Etter, Peter Cook Sponsored by IEH Laboratories & Consulting Organized by the Developing Food Safety Professionals
More informationWhole genome and core genome multilocus sequence typing and single nucleotide
AEM Accepted Manuscript Posted Online 26 May 2017 Appl. Environ. Microbiol. doi:10.1128/aem.00633-17 Copyright 2017 Chen et al. This is an open-access article distributed under the terms of the Creative
More informationFrom Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow
From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow Technical Overview Import VCF Introduction Next-generation sequencing (NGS) studies have created unanticipated challenges with
More informationDNA concentration and purity were initially measured by NanoDrop 2000 and verified on Qubit 2.0 Fluorometer.
DNA Preparation and QC Extraction DNA was extracted from whole blood or flash frozen post-mortem tissue using a DNA mini kit (QIAmp #51104 and QIAmp#51404, respectively) following the manufacturer s recommendations.
More informationIntroduction to Whole Genome Sequencing and its Applications in Microbial Diagnostics
Introduction to Whole Genome Sequencing and its Applications in Microbial Diagnostics Workshop on Whole Genome Sequencing and Analysis, 27-29 Mar. 2017 Whole genome sequencing is currently revolutionising
More informationIntroduction to Whole Genome Sequencing and its Applications in Microbial Diagnostics
Introduction to Whole Genome Sequencing and its Applications in Microbial Diagnostics Workshop on Whole Genome Sequencing and Analysis, 2-4 Oct. 2017 Whole genome sequencing is currently revolutionising
More informationAntisera QC and IQCP and Associated Challenges
Antisera QC and IQCP and Associated Challenges Patti Fields Enteric Diseases Laboratory Branch (EDLB) CDC 2017 APHL Annual Meeting Providence, Rhode Island June 14, 2017 National Center for Emerging and
More informationby author Bacterial typing - what methodology should I use? MTE Session ECCMID 2017 VIENNA, 25 APRIL 2017 L u í s a V i e i ra P e i xe
Bacterial typing - what methodology should I use? MTE Session ECCMID 2017 VIENNA, 25 APRIL 2017 L u í s a V i e i ra P e i xe U C I B I O @ R E Q U I M T E, F a c u l t y o f P h a r m a c y U n i v e
More informationVerocytotoxin producing Escherichia coli (VTEC) diagnostics
Verocytotoxin producing Escherichia coli (VTEC) diagnostics Workshop on Whole Genome Sequencing and Analysis, 27-29 Mar. 2017 Learning objective: After this lecture and exercise, you should be able to
More informationSetting the Course: Virginia's experience navigating information technology and bioinformatics needs for whole genome sequencing
Setting the Course: Virginia's experience navigating information technology and bioinformatics needs for whole genome sequencing Lauren Turner, Ph.D. Virginia Division of Consolidated Laboratory Services
More informationWGS Analysis and Interpretation in Clinical and Public Health Microbiology Laboratories: What Are the Requirements and How Do Existing Tools Compare?
Pathogens 2014, 3, 437-458; doi:10.3390/pathogens3020437 Review OPEN ACCESS pathogens ISSN 2076-0817 www.mdpi.com/journal/pathogens WGS Analysis and Interpretation in Clinical and Public Health Microbiology
More informationNext Gen Sequencing. Expansion of sequencing technology. Contents
Next Gen Sequencing Contents 1 Expansion of sequencing technology 2 The Next Generation of Sequencing: High-Throughput Technologies 3 High Throughput Sequencing Applied to Genome Sequencing (TEDed CC BY-NC-ND
More informationGenome 373: Mapping Short Sequence Reads II. Doug Fowler
Genome 373: Mapping Short Sequence Reads II Doug Fowler The final Will be in this room on June 6 th at 8:30a Will be focused on the second half of the course, but will include material from the first half
More informationVerocytotoxin producing Escherichia coli (VTEC) diagnostics
Verocytotoxin producing Escherichia coli (VTEC) diagnostics Workshop on Whole Genome Sequencing and Analysis, 2-4 Oct. 2017 Learning objective: After this lecture and exercise, you should be able to describe
More informationGenome Assembly Software for Different Technology Platforms. PacBio Canu Falcon. Illumina Soap Denovo Discovar Platinus MaSuRCA.
Genome Assembly Software for Different Technology Platforms PacBio Canu Falcon 10x SuperNova Illumina Soap Denovo Discovar Platinus MaSuRCA Experimental design using Illumina Platform Estimate genome size:
More informationSequence Assembly and Alignment. Jim Noonan Department of Genetics
Sequence Assembly and Alignment Jim Noonan Department of Genetics james.noonan@yale.edu www.yale.edu/noonanlab The assembly problem >>10 9 sequencing reads 36 bp - 1 kb 3 Gb Outline Basic concepts in genome
More informationFrancisco García Quality Control for NGS Raw Data
Contents Data formats Sequence capture Fasta and fastq formats Sequence quality encoding Quality Control Evaluation of sequence quality Quality control tools Identification of artifacts & filtering Practical
More informationUsing New ThiNGS on Small Things. Shane Byrne
Using New ThiNGS on Small Things Shane Byrne Next Generation Sequencing New Things Small Things NGS Next Generation Sequencing = 2 nd generation of sequencing 454 GS FLX, SOLiD, GAIIx, HiSeq, MiSeq, Ion
More informationBioinformatics small variants Data Analysis. Guidelines. genomescan.nl
Next Generation Sequencing Bioinformatics small variants Data Analysis Guidelines genomescan.nl GenomeScan s Guidelines for Small Variant Analysis on NGS Data Using our own proprietary data analysis pipelines
More informationSEQUENCE QUALITY CONSIDERATIONS FOR THE WET LAB
National Center for Emerging and Zoonotic Infectious Diseases SEQUENCE QUALITY CONSIDERATIONS FOR THE WET LAB Eija Trees, Ph.D., D.V.M. Chief, PulseNet Next Generation Subtyping Methods Unit PulseNet/OutbreakNet
More informationWelcome to the NGS webinar series
Welcome to the NGS webinar series Webinar 1 NGS: Introduction to technology, and applications NGS Technology Webinar 2 Targeted NGS for Cancer Research NGS in cancer Webinar 3 NGS: Data analysis for genetic
More informationTutorial for Stop codon reassignment in the wild
Tutorial for Stop codon reassignment in the wild Learning Objectives This tutorial has two learning objectives: 1. Finding evidence of stop codon reassignment on DNA fragments. 2. Detecting and confirming
More informationGenome Assembly Background and Strategy
Genome Assembly Background and Strategy February 6th, 2017 BIOL 7210 - Faction I (Outbreak) - Genome Assembly Group Yanxi Chen Carl Dyson Zhiqiang Lin Sean Lucking Chris Monaco Shashwat Deepali Nagar Jessica
More informationDATA FORMATS AND QUALITY CONTROL
HTS Summer School 12-16th September 2016 DATA FORMATS AND QUALITY CONTROL Romina Petersen, University of Cambridge (rp520@medschl.cam.ac.uk) Luigi Grassi, University of Cambridge (lg490@medschl.cam.ac.uk)
More informationFunctional annotation of metagenomes
Functional annotation of metagenomes Jeroen F. J. Laros Leiden Genome Technology Center Department of Human Genetics Center for Human and Clinical Genetics Introduction Functional analysis Objectives:
More informationRead Mapping and Variant Calling. Johannes Starlinger
Read Mapping and Variant Calling Johannes Starlinger Application Scenario: Personalized Cancer Therapy Different mutations require different therapy Collins, Meredith A., and Marina Pasca di Magliano.
More informationComparing a few SNP calling algorithms using low-coverage sequencing data
Yu and Sun BMC Bioinformatics 2013, 14:274 RESEARCH ARTICLE Open Access Comparing a few SNP calling algorithms using low-coverage sequencing data Xiaoqing Yu 1 and Shuying Sun 1,2* Abstract Background:
More informationLeonardo Mariño-Ramírez, PhD NCBI / NLM / NIH. BIOL 7210 A Computational Genomics 2/18/2015
Leonardo Mariño-Ramírez, PhD NCBI / NLM / NIH BIOL 7210 A Computational Genomics 2/18/2015 The $1,000 genome is here! http://www.illumina.com/systems/hiseq-x-sequencing-system.ilmn Bioinformatics bottleneck
More informationMatthew Tinning Australian Genome Research Facility. July 2012
Next-Generation Sequencing: an overview of technologies and applications Matthew Tinning Australian Genome Research Facility July 2012 History of Sequencing Where have we been? 1869 Discovery of DNA 1909
More informationComputational assembly for prokaryotic sequencing projects
Computational assembly for prokaryotic sequencing projects Lee Katz, Ph.D. Bioinformatician, Enteric Diseases Laboratory Branch January 21, 2015 Disclaimers The findings and conclusions in this presentation
More informationWhole-genome sequencing (WGS) of microbes employing nextgeneration sequencing (NGS) technologies enables pathogen
Application Note Microbial whole-genome sequencing A novel, single-tube enzymatic fragmentation and library construction method enables fast turnaround times and improved data quality for microbial whole-genome
More informationTHE RISE OF WHOLE GENOME SEQUENCING AS A SUBTYPING TOOL FOR MICROBIAL SOURCE TRACKING: FROM FUNDAMENTALS TO APPLICATIONS
THE RISE OF WHOLE GENOME SEQUENCING AS A SUBTYPING TOOL FOR MICROBIAL SOURCE TRACKING: FROM FUNDAMENTALS TO APPLICATIONS STEAK EXPERT MEETING: ANGERS FRANCE JUNE, 2015 Kendra Nightingale, Ph.D. Inter national
More informationQIAseq Targeted Panel Analysis Plugin USER MANUAL
QIAseq Targeted Panel Analysis Plugin USER MANUAL User manual for QIAseq Targeted Panel Analysis 1.1 Windows, macos and Linux June 18, 2018 This software is for research purposes only. QIAGEN Aarhus Silkeborgvej
More information