DE NOVO WHOLE GENOME ASSEMBLY AND SEQUENCING OF THE SUPERB FAIRYWREN (Malurus cyaneus)
JOSHUA PEÑALBA, LEO JOSEPH, CRAIG MORITZ, ANDREW COCKBURN

14 Data types: Short insert shot-gun, Mate-pair, RSII, Sequel, Synthetic Long-read (Moleculo), Chromium, Saphyr, CHiCago, HiC, Linkage map (26K SNPs)

20 Historic DNA, Adaptation, Structural variation

21 SO, YOU WANT TO SEQUENCE A REFERENCE GENOME?
Insights from de novo sequencing and assembly of the superb fairywren genome (Malurus cyaneus)

JOSHUA V. PEÑALBA, LEO JOSEPH, CRAIG MORITZ, ANDREW COCKBURN

AFFILIATION
1. Ecology & Evolution, Australian National University, Canberra
2. Centre for Biodiversity Analysis, Canberra
3. Australian National Wildlife Collection
Contact: josh.penalba@gmail.com

SUPERB FAIRYWREN GENOME
Malurus cyaneus, Flinders Island (population with lowest diversity)
Female (ZW)
2n = 72
~1.1Gb

WHERE DO I START?
Input DNA? (DNA quantity, DNA quality)
Heterozygosity?
Genome size?
Budget? (Cost)
What about annotation?

WHY ARE YOU SEQUENCING THE FAIRYWREN GENOME?
Phylogenetic gap: Chromosome-scale bird genomes have been unevenly sampled across the bird tree. The fairywren belongs to an unsampled oscine passerine clade, filling in a gap in reference bird genomes.
Additional resource: Every new reference genome increases the power of broad comparative genomics, effecting novel insights into chromosome and molecular evolution.
Long-term study: Fairywrens have been studied for 30+ years in the Australian National Botanical Gardens. There is an abundance of life history and ecological data which we can link to the genomic data.
Speciation genomics: The broader aim of my PhD thesis is to understand speciation processes using genomic data. This genome will serve as a reference for studying genomic introgression across a bird suture zone.

WHAT'S SO HARD ABOUT SEQUENCING AND ASSEMBLING A GENOME?
If a genome were a jigsaw puzzle...

PUZZLE                                                        GENOME
only 4 types of pieces                                        only A, C, T, G
large puzzle consists of smaller puzzles of different sizes   multiple chromosomes of different sizes
> 1 large puzzle in the box with uneven copies                uneven coverage across genome
replicate puzzles have different pieces                       DNA fragmentation random
duplicate square pieces                                       repetitive elements
picture on the box doesn't match what's inside                closest reference still distant
2+ near identical replicate puzzles                           diploidy or polyploidy
imperfect pieces                                              sequencing error

One correct solution, infinite incorrect answers, difficult to check...

SEQUENCING TECH & TYPES OF DATA (TYPES OF PUZZLE PIECES)
Example pieces: ACATCTAGATC ACTAGTCGATC GAGCTATCGAT CGATCGATGAT CGATCGATTGA

SHORT READS
Illumina shotgun (100 bp PE, 300 bp PE; HiSeq 2500, MiSeq)
  Data: High quality short fragments. Yields highly fragmented assemblies.
  Puzzle pieces: Great quality, small standard puzzle pieces. Comes in a small range of sizes.
Illumina mate-pair
  Data: Jumping libraries linking across long distances across the chromosome. Used for scaffolding shotgun assemblies. Often need at least 2 different fragment sizes. Not useful for assembly unless paired with a different technology.
  Typical DNA insert size: 2-10kb. Large jumping libraries: 10-50kb.
  Puzzle pieces: Long distance placement of puzzle pieces determined.
(A further label, "DNA insert size: > 1kb", appears in this panel.)

LONG READS (PacBio RSII, Oxford Nanopore)
  Data: Single molecule, long read sequence with high error rate. Great method for spanning across repetitive regions yielding quality assemblies.
  Insert sizes: 5-30kb & improving; 5-15kb & improving.
  Requires 50x for de novo, but 20x sufficient for gap filling existing assemblies.
  Requires in-house optimization and high coverage for de novo assembly.
  Puzzle pieces: Large but erroneous pieces. Error can be corrected using smaller pieces.

OPTICAL MAPPING (BioNano)
  Data: Labeling of motifs across extremely large DNA fragments. Insert sizes: 100kb - Mbs.
  Puzzle pieces: VERY large pieces but only partial pattern visible. Used to guide joining the smaller pieces together.

LINKED READS (10X Genomics)
  Data: Short reads associated to longer reads with barcodes. DNA insert sizes: >30kb. Great quality DNA input can yield quality de novo assemblies. Requires 30-50X coverage.
  Puzzle pieces: Pieces that are close together can be found together. Order within sets unknown.

HiC / CHiCAGO
  Data: Chromosome conformation information. Association between pairs of reads based on proximity in genome. HiC: in vivo, robust orientation. CHiCAGO: in vitro, dependent on DNA. Can link scaffolds into chromosome-scale assemblies.
  Puzzle pieces: Likelihood of a pair of pieces being next to each other. Helps distinguish subpuzzles.

LINKAGE MAP
  Data: Genetic markers associated with a known pedigree. Quality of assembly depends on density of markers and number of individuals. Can build linkage groups associated with chromosomes.
  Puzzle pieces: Determined ordering of some pieces. Used to guide preassembled puzzles. Helps distinguish subpuzzles.

ASSEMBLY PIPELINE (BIOINFORMATIC PIPELINE)
SEQUENCE DATA
  Illumina shotgun, 100bp PE (HiSeq2500), insert sizes: 250bp; 500bp (17x)
  Illumina mate-pair: 3Kb (4x); 5Kb (1x); 7Kb (1x)
  PacBio: 27 SMRT CELLS
  10X Genomics Chromium
  Linkage map: DaRT-seq, 26K SNPs, 280 birds, F9 gens
  (Coverage labels 20x and 1x also appear in the figure.)
STEPS
  ALLPATHS: de novo assembly and scaffolding
  LoRDEC: long-read error correction
  PBJelly: gap filling
  SUPERNOVA: de novo assembly (10X Genomics Chromium)
  ARCS: scaffolding (10X Genomics Chromium)
  CRIMAP: linkage analysis
  LASTZ: chromosome identification (chr1... chrZ)
  Annotation still in planning stage...

ASSEMBLY STATS
                 Illumina only   Illumina + PB
  contig N50     14Kb            467Kb
  scaffold N50   6.0Mb           8.0Mb
  Size (scaffolds and contigs): 1.01Gb, 1.05Gb, 0.90Gb, 1.02Gb
  1.01Gb of the genome is in 440 scaffolds larger than 100kb
  0.93Gb of the genome is in 153 scaffolds larger than 1Mb
  (A third entry, for scaffolds larger than 10kb, is incomplete: "of the genome is in 1, Gb".)

10X GENOMICS CHROMIUM
  DE NOVO ASSEMBLY: coverage 19x; contig N50 32Kb; scaffold N50 370Kb; assembly size 687Mb
  SCAFFOLDING: molecule size 46Kb; stringency 0.5, 0.7; scaffold N50 8.0Mb, 8.3Mb

WHAT ABOUT ANNOTATION?
Annotation is a beast in its own right. Proper annotation first requires RNA sequencing from a range of tissue types. The sequencing is then followed by an incredibly time-intensive computational pipeline. Although it's not ideal, quick and dirty annotation can be achieved by using gene sets from existing databases. Don't forget that repeats, not just genes, need to be annotated too!
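The coverage guidance quoted on the poster (50x for de novo long-read assembly, 20x for gap filling, 30-50X for linked reads) translates into a sequencing target once a genome size is fixed. The sketch below is only an illustration of that arithmetic, not part of the authors' pipeline: the function names are hypothetical, the genome size is the poster's approximate ~1.1Gb figure, and the 100 bp read length is borrowed from the poster's "100bp PE" label purely as an example.

```python
# Illustrative coverage arithmetic; function names are hypothetical.

def bases_needed(genome_size_bp: int, coverage: int) -> int:
    """Total sequenced bases required to reach a target average coverage."""
    return genome_size_bp * coverage


def reads_needed(genome_size_bp: int, coverage: int, read_length_bp: int) -> int:
    """Number of reads of a fixed length needed for the target coverage.

    Uses integer ceiling division so a partial read counts as a whole read.
    """
    total = bases_needed(genome_size_bp, coverage)
    return -(-total // read_length_bp)


# Assumption: ~1.1Gb, the approximate genome size given on the poster.
GENOME_SIZE_BP = 1_100_000_000

if __name__ == "__main__":
    for label, cov in [("de novo (50x)", 50), ("gap filling (20x)", 20)]:
        gb = bases_needed(GENOME_SIZE_BP, cov) / 1e9
        print(f"{label}: {gb:.0f} Gb of sequence")
    # Example only: individual 100 bp reads at 20x average coverage.
    print(reads_needed(GENOME_SIZE_BP, 20, 100), "reads of 100 bp for 20x")
```

Average coverage says nothing about how evenly reads fall across the genome, which the poster lists as one of the difficulties ("uneven coverage across genome"), so real projects usually budget some margin above these figures.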

25 N50
   contigs    scaffolds
   14 Kb      6 Mb
   467 Kb     8 Mb
   467 Kb     10 Mb
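The contig and scaffold N50 values on these slides follow the usual definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. The following is a minimal sketch of that definition, assuming a plain list of lengths as input; the function name and the toy lengths are invented for illustration and are not the fairywren data.

```python
# Minimal N50 sketch; toy input only, not the fairywren assembly.

def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First sequence at which the cumulative length covers half the assembly.
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_lengths = [80, 70, 50, 40, 30, 20, 10]  # made-up lengths, total 300
    print(n50(toy_lengths))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

Because N50 depends only on the length distribution, it rises when contigs are joined correctly or incorrectly alike, which is one reason the slides report contig and scaffold values side by side.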


The genome of Leishmania panamensis: insights into genomics of the L. (Viannia) subgenus. SUPPLEMENTARY INFORMATION The genome of Leishmania panamensis: insights into genomics of the L. (Viannia) subgenus. Alejandro Llanes, Carlos Mario Restrepo, Gina Del Vecchio, Franklin José Anguizola, Ricardo

More information

Connect-A-Contig Paper version

Connect-A-Contig Paper version Teacher Guide Connect-A-Contig Paper version Abstract Students align pieces of paper DNA strips based on the distance between markers to generate a DNA consensus sequence. The activity helps students see

More information

Sequencing Theory. Brett E. Pickett, Ph.D. J. Craig Venter Institute

Sequencing Theory. Brett E. Pickett, Ph.D. J. Craig Venter Institute Sequencing Theory Brett E. Pickett, Ph.D. J. Craig Venter Institute Applications of Genomics and Bioinformatics to Infectious Diseases GABRIEL Network Agenda Sequencing Instruments Sanger Illumina Ion

More information

Comprehensive Views of Genetic Diversity with Single Molecule, Real-Time (SMRT) Sequencing

Comprehensive Views of Genetic Diversity with Single Molecule, Real-Time (SMRT) Sequencing Comprehensive Views of Genetic Diversity with Single Molecule, Real-Time (SMRT) Sequencing Alix Kieu Cruse November 2015 For Research Use Only. Not for use in diagnostics procedures. Copyright 2015 by

More information

High Throughput Sequencing the Multi-Tool of Life Sciences. Lutz Froenicke DNA Technologies and Expression Analysis Cores UCD Genome Center

High Throughput Sequencing the Multi-Tool of Life Sciences. Lutz Froenicke DNA Technologies and Expression Analysis Cores UCD Genome Center High Throughput Sequencing the Multi-Tool of Life Sciences Lutz Froenicke DNA Technologies and Expression Analysis Cores UCD Genome Center Complementary Approaches Illumina Still-imaging of clusters (~1000

More information

High Throughput Sequencing Technologies. J Fass UCD Genome Center Bioinformatics Core Monday June 16, 2014

High Throughput Sequencing Technologies. J Fass UCD Genome Center Bioinformatics Core Monday June 16, 2014 High Throughput Sequencing Technologies J Fass UCD Genome Center Bioinformatics Core Monday June 16, 2014 Sequencing Explosion www.genome.gov/sequencingcosts http://t.co/ka5cvghdqo Sequencing Explosion

More information

Wheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline

Wheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline Wheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline Xi Wang Bioinformatics Scientist Computational Life Science Page 1 Bayer 4:3 Template 2010 March 2016 17/01/2017

More information

Analytics Behind Genomic Testing

Analytics Behind Genomic Testing A Quick Guide to the Analytics Behind Genomic Testing Elaine Gee, PhD Director, Bioinformatics ARUP Laboratories 1 Learning Objectives Catalogue various types of bioinformatics analyses that support clinical

More information

Slide 1. Slide 2. Slide 3

Slide 1. Slide 2. Slide 3 Notes for Voice over on Sequencing Module Slide 1 The purpose of this presentation is to describe an adaptive approach to the sequencing of very large conifer genomes. Long considered a task so daunting

More information

Single Cell Genomics

Single Cell Genomics Single Cell Genomics Application Cost Platform/Protoc ol Note Single cell 3 mrna-seq cell lysis/rt/library prep $2460/Sample 10X Genomics Chromium 500-10,000 cells/sample Single cell 5 V(D)J mrna-seq cell

More information

Targeted PacBio sequencing of wild zebrafish immune gene families. Jaanus Suurväli University of Cologne Institute for Genetics

Targeted PacBio sequencing of wild zebrafish immune gene families. Jaanus Suurväli University of Cologne Institute for Genetics Targeted PacBio sequencing of wild zebrafish immune gene families Jaanus Suurväli University of Cologne Institute for Genetics Leiden, 12. June 2018 Cyprinidae ~3000 species of cyprinids ~9-10 % of all

More information

Next Generation Sequencing. Jeroen Van Houdt - Leuven 13/10/2017

Next Generation Sequencing. Jeroen Van Houdt - Leuven 13/10/2017 Next Generation Sequencing Jeroen Van Houdt - Leuven 13/10/2017 Landmarks in DNA sequencing 1953 Discovery of DNA double helix structure 1977 A Maxam and W Gilbert "DNA seq by chemical degradation" F Sanger"DNA

More information

Mapping strategies for sequence reads

Mapping strategies for sequence reads Mapping strategies for sequence reads Ernest Turro University of Cambridge 21 Oct 2013 Quantification A basic aim in genomics is working out the contents of a biological sample. 1. What distinct elements

More information

RNA-SEQUENCING ANALYSIS

RNA-SEQUENCING ANALYSIS RNA-SEQUENCING ANALYSIS Joseph Powell SISG- 2018 CONTENTS Introduction to RNA sequencing Data structure Analyses Transcript counting Alternative splicing Allele specific expression Discovery APPLICATIONS

More information

Next-generation sequencing technologies

Next-generation sequencing technologies Next-generation sequencing technologies NGS applications Illumina sequencing workflow Overview Sequencing by ligation Short-read NGS Sequencing by synthesis Illumina NGS Single-molecule approach Long-read

More information

Transcriptomics analysis with RNA seq: an overview Frederik Coppens

Transcriptomics analysis with RNA seq: an overview Frederik Coppens Transcriptomics analysis with RNA seq: an overview Frederik Coppens Platforms Applications Analysis Quantification RNA content Platforms Platforms Short (few hundred bases) Long reads (multiple kilobases)

More information

Experimental Design. Dr. Matthew L. Settles. Genome Center University of California, Davis

Experimental Design. Dr. Matthew L. Settles. Genome Center University of California, Davis Experimental Design Dr. Matthew L. Settles Genome Center University of California, Davis settles@ucdavis.edu What is Differential Expression Differential expression analysis means taking normalized sequencing

More information

Next generation sequencing in diagnostic laboratories: opportunities and challenges

Next generation sequencing in diagnostic laboratories: opportunities and challenges Next generation sequencing in diagnostic laboratories: opportunities and challenges Vitali Sintchenko Marie Bashir Institute for Emerging Infectious Diseases & Biosecurity Declaration No conflict of interest

More information

NGS-based innovations within the Leiden Network

NGS-based innovations within the Leiden Network NGS-based innovations within the Leiden Network A strong bridge between two partners Dr. Mark de Jong 2017-09-29 Design accurate and robust NGS tests and generate data sets essential for Diagnostics &

More information

The Expanded Illumina Sequencing Portfolio New Sample Prep Solutions and Workflow

The Expanded Illumina Sequencing Portfolio New Sample Prep Solutions and Workflow The Expanded Illumina Sequencing Portfolio New Sample Prep Solutions and Workflow Marcus Hausch, Ph.D. 2010 Illumina, Inc. All rights reserved. Illumina, illuminadx, Solexa, Making Sense Out of Life, Oligator,

More information

Gap Filling for a Human MHC Haplotype Sequence

Gap Filling for a Human MHC Haplotype Sequence American Journal of Life Sciences 2016; 4(6): 146-151 http://www.sciencepublishinggroup.com/j/ajls doi: 10.11648/j.ajls.20160406.12 ISSN: 2328-5702 (Print); ISSN: 2328-5737 (Online) Gap Filling for a Human

More information

Taking Advantage of Long RNA-Seq Reads. Vince Magrini Pacific Biosciences User Group Meeting September 18, 2013

Taking Advantage of Long RNA-Seq Reads. Vince Magrini Pacific Biosciences User Group Meeting September 18, 2013 Taking Advantage of Long RNA-Seq Reads Vince Magrini Pacific Biosciences User Group Meeting September 18, 2013 Overview Proof-of-Principle SMART-cDNA Synthesis PB-SBL size distributions Gene Annotation

More information

SO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS

SO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS SO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS SETTLES@UCDAVIS.EDU Bioinformatics Core Genome Center UC Davis BIOINFORMATICS.UCDAVIS.EDU DISCLAIMER This talk/workshop

More information

Functional genomics to improve wheat disease resistance. Dina Raats Postdoctoral Scientist, Krasileva Group

Functional genomics to improve wheat disease resistance. Dina Raats Postdoctoral Scientist, Krasileva Group Functional genomics to improve wheat disease resistance Dina Raats Postdoctoral Scientist, Krasileva Group Talk plan Goal: to contribute to the crop improvement by isolating YR resistance genes from cultivated

More information

Genomics AGRY Michael Gribskov Hock 331

Genomics AGRY Michael Gribskov Hock 331 Genomics AGRY 60000 Michael Gribskov gribskov@purdue.edu Hock 331 Computing Essentials Resources In this course we will assemble and annotate both genomic and transcriptomic sequence assemblies We will

More information

AUDREY FARBOS JEREMIE POSCHMANN PAUL O NEILL KONRAD PASZKIEWICZ KAREN MOORE

AUDREY FARBOS JEREMIE POSCHMANN PAUL O NEILL KONRAD PASZKIEWICZ KAREN MOORE We provide: AUDREY FARBOS JEREMIE POSCHMANN PAUL O NEILL KONRAD PASZKIEWICZ KAREN MOORE State of the art genomics and bioinformatics analysis Training in experimental techniques, analysis and modelling

More information