Introductie en Toepassingen van Next-Generation Sequencing in de Klinische Virologie Sander van Boheemen Medical Microbiology
Next-generation sequencing Next-generation sequencing (NGS), also known as high-throughput sequencing, deep sequencing or massive parallel sequencing, is the catch-all term used to describe a technique that utilizes DNA sequencing technologies capable of processing multiple DNA sequences in parallel.
Cost of sequencing
Library preparation Adaptor ligation needed for Purification Amplification Barcoding Sequencing Lefterova, 2015
Barcoding illumina.com
Illumina bridge PCR: physical separation and clonal amplification of individual molecules cyclic reversible termination read length : up to 2x300bp throughput: ~15 Gb run time: ~8-56 hours illumina.com
Life technologies emulsion PCR: physical separation and clonal amplification of individual molecules during nucleotide incorporation, protons (H+) are released. semiconductor measures change in ph. read length: 200-400 bp throughput: 100 Mb -2 Gb run time: 2 7 hours. Mardis ER, 2013
Pacific Biosciences adaptors to fragmented DNA: generating closed circular DNA molecules create DNA-polymerase complexin zero mode waveguide (ZMW) single molecule real-time sequencing read length average: 8,500 bp (40,000 bp) throughput: 275-375 Mb run time: 30-180 minutes Niedringhaus TP, 2011
Oxford Nanopore Technologies Add adaptors/enzymes to DNA Measures changes in current as DNA passes through protein nanopore Read length: ~6,500-8,000 bp Throughput: 100 Mb -200 Mb Run time: 48 hours nanoporetech.com
Bioinformatics
File formats FASTA text-based format for sequences 2 lines 1. > sequence identifier 2. sequence FASTQ text-based format for sequences and quality scores 4 lines 1. @ sequence identifier 2. sequence 3. + description (opt.) 4. quality values
Quality control - PHRED score Probability scores of incorrectly base calls by sequencer Trimming reads with Phred scores<20
Quality control - indexes illumina.com
Mapping alignment A sequence mapping alignment is assembling reads against an existing backbone sequence (sequence is similar but not necessarily identical to the backbone sequence).
de novo assembly assembling short reads to create full-length (sometimes novel) sequences
Applications of NGS in virology Whole-Genome Sequencing Virus discovery/diagnostics Single Nucleotide Polymorphisms typing Epidemiology Viral genotyping Virus evolution Transcriptomics Ribosome Profiling Phylogeny Molecular epidemiology and surveillance Intra-host viral genome diversity Pathogenic mutations and fitness Variants escaping host neutralization Identification of virulence factors and drug-resistant variants..
Whole Genome Sequencing Whole genome sequencing (WGS) is a laboratory process that determines the complete DNA sequence of an organism's genome at a single time. Metagenomics is the study of a collection of genetic material (genomes) from a mixed community of organisms. Metagenomic Whole Genome Sequencing (mwgs)
Diagnostic assays Current diagnostic assays PCR Culture Antibody assays.. Are these viruses in the sample? Future diagnostic assays mwgs What is in the sample?
mwgs example Respiratory sample Influenza C positive
NGS too time-consuming for clinical virology Quick et al., Nature, 2016 Greninger et al., Genome Med, 2015
SNP typing Viruses possess high mutation rate and exist in large populations. A quasispecies is a cloud of diverse variants that are genetically linked through mutation, interact cooperatively on a functional level, and collectively contribute to the characteristics of the population. virology.ws
Epidemiology Holmes et al., Nature, 2016
Conclusions Wide variety next generation sequencing platforms Various applications of next generation sequencing in clinical virology