The parrot genome: using 454 Flx+ sequencing to identify regulatory traits of vocal learning

1 The parrot genome: using 454 Flx+ sequencing to identify regulatory traits of vocal learning. Erich D. Jarvis, Howard Hughes Medical Institute Investigator, Duke University Medical Center, Department of Neurobiology. China Roche 454 Meetings, September 2011.

2 Motivation: Deciphering the genetic basis of convergent complex traits. Challenges: De-novo genome sequencing and assembly of species with and without the traits of interest. Proper genome assembly and tools for interrogating the genomes.

4 Vocal learning (production learning): 5 groups of mammals (humans, cetaceans, bats, elephants, sea lions) and 3 groups of birds (parrots, hummingbirds, songbirds). Different from auditory learning (comprehension and usage learning). Auditory learning: dogs can understand the sounds "sit" (English), "siéntese" (Spanish), "osuwari" (Japanese). Vocal learning: dogs cannot learn to say these sounds, but vocal learners can.

5 Convergent behavior: vocal learning, the substrate for speech. Depends on auditory feedback; vocal critical periods; cultural transmission; syntax; deaf-induced vocal disorders, aphasias, speech sound disorder, possibly autism. [Figure: avian family tree (Hackett et al 2008 tree) with vocal learners marked by *; annotation: only humans.]

6 African Grey Parrot - training to count (concept of one) Pepperberg/Alex

7 Song & speech systems in birds and humans Jarvis 2004 Ann NY Acad Sci; Jarvis et al 2005 Nature Rev. Neurosci.

8 Behaviorally regulated egr1 expression in parrot brain Feenders et al 2008 PLoS ONE

9 Convergent evolution of vocal learning pathways. Three alternative hypotheses:
- Multiple independent gains
- Multiple independent losses from a common ancestor
- Everyone has it to varying degrees
[Figure labels: vocal learning pathways, vocal production pathway, auditory learning.] Modified from Jarvis et al Nature 2000.

10 Vocal learning brain pathways in birds & humans. Jarvis et al Nature 2000; Jarvis 2004 Ann NY Acad Sci.

11 FoxP2: language-associated gene. Turned on at high levels before vocal imitation starts and turned down to low levels after vocal learning is complete. [Figure: FoxP2 in finch brain by age in days, from hatch through juvenile song and tutor song learning to adult, learning complete.] Haesler, Wada, Nshdejan, Morrisey, Lints, Jarvis, Scharff, J. Neurosci.

12 RNAi knockdown of FoxP2 in songbirds Haesler et al 2007 PLoS Biology.

14 Dusp1 gene shows specialized regulation in song nuclei (Immediate early gene involved in neuroprotection) Egr1 Dusp1 Haruhito Horita (graduate student) Graduating 2011 Horita et al (submitted)

15 Dusp1 shows convergent specialized regulation in song nuclei Silent Singing Songbird Hummingbird Parrot Horita et al (submitted)

16 Motivation: Deciphering the genetic basis of convergent complex traits. Challenges: De-novo genome sequencing and assembly of species with and without the traits of interest. Proper genome assembly and tools for interrogating the genomes.

18 Simulated projection: sequence & assembly of avian genomes. [Figure: number of contigs in the contig assembly and genome representation (%) plotted against sequencing data (0.4 Gbp per 454 Titanium run); annotation: add PES map.]

19 No matter how much sequencing, could not get full coverage on some genes. Why? Map budgie sequences from GS 454 runs to three homologous zebra finch genes.
Identity cutoff: 90% for 40 bp; 10 GS 454 runs
Gene | Gene +/-5Kb | Coding region length | Exon coverage | Coverage, 5Kb upstream exons | Coverage, 5Kb downstream exons
FoxP2 | 409,706 | 2,136 bp | 97.05% | 10.16% | 72.09%
ROBO1 | 384,230 | 4,243 bp | 91.52% | 15.20% | 32.83%
egr1 | 12,949 | 1,533 bp | 81.28% | 5.98% | 1.25%
Identity cutoff: 90% for 40 bp; 25 GS 454 runs (all libraries except 8Kb)
Gene | Gene +/-5Kb | Coding region length | Exon coverage
FoxP2 | 409,706 | 2,136 bp | 99.00%
ROBO1 | 384,230 | 4,243 bp | 91.00%
egr1 | 12,949 | 1,533 bp | 89.60%
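
The coverage percentages on this slide are the fraction of reference bases hit by at least one mapped read. A minimal sketch of that calculation follows, assuming the alignments have already passed the identity cutoff and are given as half-open (start, end) intervals on the reference. The function name and the read coordinates are illustrative and do not come from the talk.

```python
def covered_fraction(region_start, region_end, alignments):
    """Fraction of [region_start, region_end) covered by at least one alignment."""
    length = region_end - region_start
    if length <= 0:
        raise ValueError("region must have positive length")
    # Clip each alignment to the region and drop those that miss it entirely.
    clipped = sorted(
        (max(s, region_start), min(e, region_end))
        for s, e in alignments
        if e > region_start and s < region_end
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Disjoint from the current merged block: close it, start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / length


# Hypothetical 1,000 bp exon with three mapped reads (made-up coordinates).
reads = [(0, 400), (350, 700), (900, 1200)]
print(f"{covered_fraction(0, 1000, reads):.1%}")  # prints 80.0%
```

Merging overlapping intervals before summing avoids counting a base twice when reads overlap, which matters at high sequencing depth.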

20 Sequencing runs used for assemblies
454 reactions (14X coverage):
- Titanium shotgun library; 15 runs total (mode ~469 bp)
- 4 x 3 kb FLX paired-end libraries; 5 runs total (~200 bp/end)
- 8 x 8 kb FLX paired-end libraries; 3 runs total (~200 bp/end)
- 4 x 20 kb FLX paired-end libraries; 5 runs total (~200 bp/end)
- Flx+ shotgun library; 4 runs total (mode ~760 bp)
Illumina reactions (8X coverage):
- 200 bp Illumina paired-end; 2 runs (~75 bp/end)
- 200 bp Tufts-Illumina paired-end; 2 runs (~75 bp/end)
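
Fold coverage is total sequenced bases divided by genome size. The sketch below applies that to the Titanium shotgun runs only, using the 0.4 Gbp per 454 Titanium run quoted on slide 18 and the 1.2 Gb genome size from the assembly statistics. Per-run yields for the other libraries are not given in the talk, so this does not attempt to reproduce the 14X total. The constant and function names are assumptions made for illustration.

```python
def fold_coverage(total_bases, genome_size):
    """Average number of times each genome base is sequenced."""
    return total_bases / genome_size


GENOME_SIZE = 1.2e9             # bp, from the assembly statistics slides
TITANIUM_BASES_PER_RUN = 0.4e9  # bp per run, from the simulated projection slide
titanium_runs = 15              # Titanium shotgun runs listed above

coverage = fold_coverage(titanium_runs * TITANIUM_BASES_PER_RUN, GENOME_SIZE)
print(f"Titanium shotgun alone: {coverage:.1f}X")  # prints Titanium shotgun alone: 5.0X
```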

21 Read Length of Titanium runs Average read length ~350 bp and mode ~469 bp

22 Read Length of Flx+ runs Average read length 674 bp and mode ~768 bp Inferred error rate under 1.7%

23 Compared assemblies from 3 different types of sequences with 2 assemblers
Reads:
- short read only (200 bp paired end; 400 bp shotgun)
- short + long read (200 bp paired end; … bp shotgun)
- short + long read + Illumina reads (75 bp paired end)
Assemblers:
1. Celera Assembler (CABOG; Adam Phillippy at Univ MD)
2. Newbler Assembler (Roger Winer, James Knight et al at Roche 454; Wes Warren at Wash U)

24 Comparative assembly statistics. In a hybrid assembly, Illumina paired-end reads cause scaffold breakdown because of contaminating mate pairs.
Assembler | Parrot-Celera | Parrot-Celera
Sequence method | 454 short | 454 + Illum paired
Coverage | 8X | 14X
Genome size | 1.2Gb | 1.2Gb
[Scaffolds]
TotalBasesInScaffolds | 1,022,398,844 | 1,032,788,935
# of Scaffolds | 9,586 | 10,813
AvgScaffoldSize | 106,655 | 98,174
N50ScaffoldSize | 9,471,817 | 1,689,431
LargestScaffoldSize | 55,691,819 | 7,090,199
Total gaps in scaffolds | 131,248 | 99,828
[Contigs]
# of Contigs | 170,… | …,641
AvgContigSize | 6,012 | 9,335
N50ContigSize | 10,005 | 18,667
LargestContigSize | 150,… | …,978

25 Comparative assembly statistics. Repair of breakdown; 454 long reads enhance assembly statistics; as good as the Sanger method.
Assembler | Parrot-Celera | Parrot-Celera | Parrot-Celera | Parrot-Newbler | Parrot-Newbler | Parrot-Newbler Het | Z. Finch-PCAP | Chicken-PCAP
Sequence method | 454 short | 454 long | 454 long + illum | 454 short | 454 long | 454 long + illum | Sanger | Sanger v2.1
Coverage | 8X | 14X | 14X | 8X | 11X | 13X | 6X | 7.1X
Genome size | 1.2Gb | 1.2Gb | 1.2Gb | 1.2Gb | 1.2Gb | 1.2Gb | 1.2Gb | 1.05Gb
[Scaffolds]
TotalBasesInScaffolds | 1,022,398,844 | 1,079,493,948 | 1,086,605,544 | 1,232,754,888 | 1,179,562,588 | 1,128,262,411 | 1,224,525,252 | 1,047,124,295
# of Scaffolds | 9,586 | 20,685 | 25,212 | 37,024 | 21,081 | 10,926 | 37,698 | 23,776
AvgScaffoldSize | 106,655 | 52,187 | 43,099 | 33,296 | 55,… | …,263 | 32,482 | 44,041
N50ScaffoldSize | 9,471,817 | 12,449,215 | 11,201,952 | 4,019,469 | 7,285,721 | 6,386,522 | 10,409,499 | 11,125,310
LargestScaffoldSize | 55,691,819 | 49,398,065 | 39,879,305 | 18,557,224 | 39,887,084 | 35,673,135 | 56,620,707 | 51,053,708
Total gaps in scaffolds | 160,463 | 54,864 | 45,651 | 60,… […] …,736
[Contigs]
# of Contigs | 170,049 | 75,549 | 70,… | … | …,786 | 71,… | …,053 | 85,191
AvgContigSize | 6,012 | 14,289 | 15,334 | 4,627 | 4,821 | 14,368 | 9,714 | 12,291
N50ContigSize | 10,005 | 41,251 | 55,633 | 8,622 | 14,413 | 27,014 | 38,549 | 45,280
LargestContigSize | 150,… | … | … | … | … | … | … | …,663
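
The N50ScaffoldSize and N50ContigSize rows use the standard N50 statistic: sort the sequences longest first and report the length at which the running total reaches half of all assembled bases. A minimal sketch with made-up contig lengths is below; the function name is illustrative and not taken from either assembler.

```python
def n50(lengths):
    """Length of the sequence at which the running sum, longest first,
    reaches at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up contig lengths (total 300): 80 + 70 = 150 reaches half the total.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

Because N50 weights long sequences heavily, it can rise even when the number of sequences rises too, which is why the table reports counts and averages alongside it.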

26 MUMmer plot of synteny between zebra finch and budgie draft assemblies: a snapshot of Chr 4. Zebra finch Chr 4 [25 Mb-65 Mb] = 40 Mb. FLX PE + 454 short reads: 100s of scaffolds. FLX PE + 454 short + long reads: one ~39.9 Mb scaffold.

27 MUMmer plot of synteny between zebra finch and budgie draft assemblies: a snapshot of Chr 1. Zebra finch: 18 Mb region. FLX PE + 454 short reads: 6 scaffolds. FLX PE + 454 short + long reads: one ~18 Mb scaffold.

28 Assembly of equivalent 400 bp (Titanium) and 760 bp (Flx+) sequence. [Table: assembly metrics for Titanium reads + FLX PE versus FLX+ + Titanium + FLX PE, with % change with FLX+ runs. Rows: sequence depth, estimatedgenomesize (MB), numalignedreads, numalignedbases, numberassembled, numberpartial, numbersingleton, numberrepeat, numberoutlier, numberwithbothmapped; scaffold metrics (numberofscaffolds, numberofbases, avgscaffoldsize, N50ScaffoldSize, largestscaffoldsize); large contig metrics (numberofcontigs, numberofbases, avgcontigsize, N50ContigSize, largestcontigsize). Surviving values: numalignedreads 94.48% vs 94.53%; numalignedbases 95.20% vs 94.82%.]

29 Assembly completeness of 3392 highly homologous exons. Used the CABOG Celera assembler with different read lengths and technologies. [Figure: completeness in contigs (Cont) and scaffolds (Scaff) for 454 Flx+ & Illumina, 454 Flx+, and 454 Titanium.]

30 Assembly of genes of interest. Single vs multi-exon genes. Egr1: 2-exon gene, with a highly GC rich exon 1. FoxP2: 16-exon gene, with one GC rich exon. Dusp1: gene with a repetitive regulatory region. Other genes? Use zebra finch exons that are >87% identical between finch and chicken to find parrot exons in the assemblies and reads.
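
The >87% identity filter can be sketched as follows, assuming the finch and chicken exons are already aligned with "-" marking gaps and that identity is counted over gap-free columns. That is one of several conventions, and the talk does not say which one was used. The sequences and the function name are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free alignment columns whose bases match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned exon fragments: 16 gap-free columns, 2 mismatches.
finch = "ATGGCCGT-ACGTTAGC"
chicken = "ATGGCTGTTACGTTGGC"
identity = percent_identity(finch, chicken)
print(identity, identity > 87.0)  # prints 87.5 True
```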

31 Single exon genes: dusp14. [Figure tracks: Nb-454 short, Nb-454 long, Nb-hybrid, CA-454 short, CA-454 long, CA-hybrid.] Nearly all high complexity single exon genes (40-60% GC) examined thus far have full coverage (97-100%) in all assemblies. Nb = Newbler; CA = Celera; 454 short = Titanium; 454 long = Flx+; hybrid = 454 short + long + Illumina.

32 Multi-exon genes: GluR2 assembly. [Figure tracks: Nb-454 short, Nb-454 long, Nb-hybrid, CA-454 short, CA-454 long, CA-hybrid.] BUT: many high complexity multi-exon genes (40-60% GC) are split across multiple scaffolds with 454 short reads using Newbler, but are assembled on one scaffold using longer reads or Celera.

33 GC rich exons: FoxP2 (language evolution). [Figure tracks: Nb-454 short, Nb-454 long, Nb-hybrid, CA-454 short, CA-454 long, CA-hybrid.] GC rich exons (>70%) have poorer assembly. Some algorithms can still handle them. Nb = Newbler; CA = Celera; 454 short = Titanium; 454 long = Flx+; hybrid = 454 short + long + Illumina.
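
The GC categories used on these slides (40-60% for high complexity, >70% for GC rich) can be computed directly from an exon sequence. The sketch below is illustrative: the function names, the "other" label for everything outside the two stated ranges, and the toy sequences are assumptions, not part of the talk.

```python
def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def classify_exon(seq):
    """Label an exon with the GC ranges quoted on the slides."""
    gc = gc_fraction(seq)
    if gc > 0.70:
        return "GC rich"
    if 0.40 <= gc <= 0.60:
        return "high complexity"
    return "other"  # ranges the slides do not name explicitly


print(gc_fraction("GCGCGCGCAT"), classify_exon("GCGCGCGCAT"))  # prints 0.8 GC rich
print(gc_fraction("ATGCATGCAT"), classify_exon("ATGCATGCAT"))  # prints 0.4 high complexity
```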

34 GC rich exons: Dusp6 (behaviorally regulated gene). [Figure tracks: Nb-454, Nb-454 long, Nb-hybrid, CA-454, CA-454 long, CA-hybrid.] Exon 1 is missing from some assemblies of the dusp6 gene. What happened? Nb = Newbler; CA = Celera; 454 short = Titanium; 454 long = Flx+; hybrid = 454 short + long + Illumina.

35 Dusp6 reads Sufficient exon 1 reads & overlaps for assembly

36 GC rich exons: Dusp6 assembly. [Figure tracks: Nb-454, Nb-454 long, Nb-hybrid, CA-454, CA-454 long, CA-hybrid.] Conclusions: Newbler: GC rich exons (60-70%) were not brought into scaffolds for 454 reads (they are in contigs) because they were part of alternative paths; the 454 + Illumina hybrid resolved the assembly. Celera: GC rich exons (60-70%) from 454 short (400 bp) reads were placed in the degenerate file and not assembled; with long reads (760 bp) the sequence was no longer labeled degenerate and was thus assembled.

37 GC rich exons: Egr1 (behaviorally regulated gene). [Figure tracks: Nb-454 short, Nb-454 long, Nb-hybrid, CA-454 short, CA-454 long, CA-hybrid.] Exon 1 is missing from all assemblies of the egr1 gene. What happened?

38 GC rich exons: Egr1 reads, shotgun. No reads of exon 1 in shotgun. GC rich exon (80%).

39 GC rich exons: Egr1 reads, paired-end. Very few reads of exon 1 in paired-end. GC rich exon (88%).

40 GC rich promoter and exon: Egr1 gene assembly. Part of the promoter and exon 1 are missing in all assemblies.

41 Even the Sanger method misses GC rich regions: Egr1 assembly. [Figure panels: zebra finch genome, chicken genome, parrot genome.] All species are missing the GC rich promoter region (75-90%).

42 ~1,200 bp regulatory region of various microsatellite repeats In dusp1 regulatory region GGGATAACAGCACAGCCCTTAAACCCCCCTGGGGTAACAGGACAGCCCTTAAACCCCCCTGGGGTAACTGAGA ACAACCCTTAAACCCCCCTGGGGTAACAGCACAGCTCTTAAACCCCGAATTCTGAATCCACCCTGGCCCCATG GAGCATACACAGAGTGTGTGTGTGAATATGTGATTTTCTGTGTGAATATGTGATTTTGTGTGAATATGTGATT TTGTGTGCGAATATGTGATTCTGTGTGTGAATATGTGATTCTGTGTGTGAATATGTCATTTTCTGTGTGAATA TGTGATTTTGTGTGAATGTGTGATTTTCTGTGTGAATATGTGATAATATGTGATTTTGTGTGTGAATATGTGA TTCTATGTGAATATGTGATTGATTTTCTGTGTGAATATGTGATTTTGTGTGAATGTGTGATTTTTGTGTGAAT ATGTGATTTTCTGTGTGAATATGTGATTTTCTGTGTGAATATGTGATTTTTCAGAAAGTCGCAGGGTGGTTTG GCTCACACTCGCACTCACACTCTCACACACTCACACTCTCTCACTCTCACTCACACTCACACTCACACTCTCA CACTCTCTCACACTCTCTCACACTCTCACACTCTCTCACACACACACTCATACACTCCCACTCACACATACTC TCACACTCACACACTCTCACACTCTCACACTCTAACACACTCACACACTCACACACTCACACTCACACTCATA CTCACACACTCACACACTCACACTCACACTCTAACACACTCACACACTCACACTCACACTCACTTTTTCTCTT TTCTCACTTTTTCTCTCTCCCTCTCCCGCGCTCCGCGGCCGCCCCGCTCCCGATGACGTCGCACCGGCGGGGC GGGCCGCGCCCTCGCTGGCGCGCGGCCAGGCTGACGTCATCGGCCGCCCCGCCCCCCCACGTGACGCGGCCC ATTGAGAAAACGCCGTCCCGCCGCGCGGCCCCATATAAGGGCGGGAGCGGCGGGGCACCGGGACAGCCGGGCC ACCGCACCTCTGAGCTCTGCCCTGCCCTCCTTCCCTCCCCACAGCCATCCCCGCGCTGCCCGGCCATGGTGAA CCTGCGGGTGTGCGCGCTGGACTGCGAGGCGCTGCGGGCGCTGCTGCAGGAGCGCGGCGCGCAGTGCCTCGTC CTCGACTGCCGCTCCTTCTTCTCCTTCAA Horita et al (submitted)

43 Dusp1 convergent promoter changes in vocal learners Vocal learners Vocal non-learners Horita et al (submitted)

45 Repetitive microsatellite assembly in dusp1 promoter. [Figure tracks, positioned relative to the ATG: Nb-454, Nb-454 long, Nb-hybrid, CA-454, CA-454 long, CA-hybrid.] Conclusions: Only the long reads (~760 bp) allowed full and correct assembly of the repetitive microsatellite sequence in the parrot dusp1 promoter.
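
One common explanation for a result like this is that a read can place a repeat unambiguously only if it covers the whole repeat plus some unique sequence on each side. The sketch below encodes that rule of thumb. The 400 bp repeat block and the 50 bp anchor are hypothetical values chosen for illustration, not measurements from the dusp1 promoter; the read lengths are the modes reported for the Titanium and Flx+ runs.

```python
def can_span(read_length, repeat_length, anchor=20):
    """True if one read can cover the whole repeat plus `anchor` unique
    bases on each side (a rule of thumb, not an assembler's actual test)."""
    return read_length >= repeat_length + 2 * anchor


HYPOTHETICAL_REPEAT_BLOCK = 400  # bp, illustrative only
for label, read_length in [("Titanium mode", 469), ("Flx+ mode", 768)]:
    print(label, can_span(read_length, HYPOTHETICAL_REPEAT_BLOCK, anchor=50))
# prints:
# Titanium mode False
# Flx+ mode True
```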

46 Genome 10K (G10K) consortium: Assemblathon 2 competition - parrot
Three technologies:
- 454: short (200 bp) & long (750 bp) read lengths, shotgun and paired end with 3, 8, 20 Kb insert sizes, 16X coverage (Roche and Duke)
- Illumina HiSeq: 100 bp paired-end/mate-pair reads, 0.2, 0.5, 0.8, 5, 10, 20 and 40 Kb insert sizes, with TruSeq v3 GC chemistry, 120X coverage (BGI & Illumina)
- PacBio: ~3000 bp average read length, but 15% error; 7, 10 Kb insert sizes; 5X coverage (PacBio)

48 Genome 10K (G10K) consortium: Assemblathon 2 competition - parrot
Three technologies: 454 long Flx+; Illumina HiSeq; PacBio long.
25 assembly groups:
- Overlap-Layout-Consensus (e.g. Celera CABOG, PCAP, Newbler, etc.)
- Eulerian de Bruijn graphs (e.g. ALLPATHS, SOAPdenovo, Velvet, etc.)
- Hybrid inventions
Two validation methods:
- Optical maps (contig and scaffold accuracy)
- 40K pooled (10) fosmid and single molecule clones sequenced (bp accuracy)
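
The two assembler families named above start from different primitives. Overlap-Layout-Consensus compares reads pairwise for suffix-prefix overlaps, while de Bruijn graph assemblers break reads into k-mers and link (k-1)-mers. The toy sketch below shows both primitives on two made-up reads. It is a teaching illustration under those assumptions, not how CABOG, Newbler, or any listed assembler is implemented.

```python
from collections import defaultdict


def overlap(a, b, min_len=3):
    """OLC primitive: length of the longest suffix of `a` that equals a
    prefix of `b`, or 0 if it is shorter than `min_len`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def de_bruijn(reads, k):
    """de Bruijn primitive: map each (k-1)-mer to the (k-1)-mers that
    follow it, one edge per k-mer occurrence in the reads."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


reads = ["ACGTC", "GTCA"]  # made-up reads sharing the 3-mer GTC
print(overlap(reads[0], reads[1]))  # prints 3
graph = de_bruijn(reads, k=3)
print(graph["GT"])  # prints ['TC', 'TC']
```

The duplicated GT to TC edge shows how the graph records that both reads support the same k-mer, without ever comparing the reads to each other directly.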

49 Challenges for the future for Flx+. Limitations: cost vs assembly bp accuracy vs assembly completeness; algorithms for hybrid assemblies; overcoming the bias against GC rich sequence. [Figure: theoretical predictions to generate a high quality assembly; bp coverage (5X to 100X) versus read length (1 to 1500), with regions labeled $ low and $ high.]

50 Challenges for complete genome assembly Theoretical predictions to generate high quality assembly Close to theory on Dog genome long reads; Less than theory on Panda short reads Schatz et al 2010 Genome Research
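
The slide does not state which model lies behind the theoretical predictions. A common choice for this kind of coverage versus read length curve is the Lander-Waterman model, and the sketch below assumes that model. It gives the expected contig length as L * ((e^(c*sigma) - 1) / c + (1 - sigma)), where L is the read length, c is the fold coverage, and sigma = 1 - T/L for a minimum detectable overlap T. The 40 bp minimum overlap is an arbitrary illustrative value.

```python
import math


def expected_contig_length(read_length, coverage, min_overlap):
    """Lander-Waterman expected contig (island) length in bases."""
    sigma = 1.0 - min_overlap / read_length
    return read_length * (
        (math.exp(coverage * sigma) - 1.0) / coverage + (1.0 - sigma)
    )


# Compare short, Titanium-like, and Flx+-like read lengths at a few depths.
for read_length in (100, 400, 760):
    for coverage in (5, 10, 20):
        length = expected_contig_length(read_length, coverage, min_overlap=40)
        print(f"L={read_length} bp, {coverage}X: ~{length:,.0f} bp")
```

The model ignores repeats and sequencing bias, so real assemblies fall below these values, consistent with the slide's remark that short-read results were less than theory.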

51 Acknowledgements
Jarvis Lab: Jason Howard, James Ward (now at NIEHS), Ganesh Ganapathy, Haruhito Horita
Roche 454 sequencing: Duke Genome Center (Lisa Bukovnik, Ty Wang, Olivier Fedrigo); Roche support team (Xuemin Liu, Chinnappa Kodira)
Illumina sequencing: Tin Le (Illumina UK), Guojie Zhang (BGI), Yingrui Li (BGI)
PacBio sequencing: Eric Schadt, Edwin Hawe, Lawrence Lee
Assembly: Adam Phillippy (CABOG; Univ Maryland), Sergey Koren (CABOG; Univ Maryland), Wes Warren (Newbler; Wash Univ), James Knight (Newbler; Roche 454), Roger Winer (Newbler; Roche 454), Bo Li (SOAPdenovo; BGI)
Optical maps: David Schwartz, Shiguo Zhou
Fosmids: Jay Shendure
Funding: NIH Director's Pioneer Award, Howard Hughes Medical Institute

52 Previous students and Post Docs now with own labs Dr. Lubica Kubikova Dr. Raphael Pinaud Dr. V. Ann Smith Dr. Liisa Tremere Dr. Kazuhiro Wada Dr. Jing Yu Rui Wang Dr. Osceola Whitney Jason Howard Haru Horita Jarvis lab Maurice Anderson Eric Zhou Michael Silva Gustavo Arriaga Dr. Petra Roulhac Gurkan Yardimchi Andreas Pfenning Dr. Erich Tony Jarvis Zimmermann Theresa Renuart Dr. Miriam Rivas Dr. Chun-Chun Chen Alisa Ray Erina Hara Not present: Nicole Nelson Alyssa Zhu

De novo genome assembly with next generation sequencing data!! "

De novo genome assembly with next generation sequencing data!! De novo genome assembly with next generation sequencing data!! " Jianbin Wang" HMGP 7620 (CPBS 7620, and BMGN 7620)" Genomics lectures" 2/7/12" Outline" The need for de novo genome assembly! The nature

More information

Sequence assembly. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequence assembly. Jose Blanca COMAV institute bioinf.comav.upv.es Sequence assembly Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing project Unknown sequence { experimental evidence result read 1 read 4 read 2 read 5 read 3 read 6 read 7 Computational requirements

More information

Sequence Assembly and Alignment. Jim Noonan Department of Genetics

Sequence Assembly and Alignment. Jim Noonan Department of Genetics Sequence Assembly and Alignment Jim Noonan Department of Genetics james.noonan@yale.edu www.yale.edu/noonanlab The assembly problem >>10 9 sequencing reads 36 bp - 1 kb 3 Gb Outline Basic concepts in genome

More information

De Novo Assembly of High-throughput Short Read Sequences

De Novo Assembly of High-throughput Short Read Sequences De Novo Assembly of High-throughput Short Read Sequences Chuming Chen Center for Bioinformatics and Computational Biology (CBCB) University of Delaware NECC Third Skate Genome Annotation Workshop May 23,

More information

Genome Assembly. J Fass UCD Genome Center Bioinformatics Core Friday September, 2015

Genome Assembly. J Fass UCD Genome Center Bioinformatics Core Friday September, 2015 Genome Assembly J Fass UCD Genome Center Bioinformatics Core Friday September, 2015 From reads to molecules What s the Problem? How to get the best assemblies for the smallest expense (sequencing) and

More information

Mate-pair library data improves genome assembly

Mate-pair library data improves genome assembly De Novo Sequencing on the Ion Torrent PGM APPLICATION NOTE Mate-pair library data improves genome assembly Highly accurate PGM data allows for de Novo Sequencing and Assembly For a draft assembly, generate

More information

De novo whole genome assembly

De novo whole genome assembly De novo whole genome assembly Lecture 1 Qi Sun Minghui Wang Bioinformatics Facility Cornell University DNA Sequencing Platforms Illumina sequencing (100 to 300 bp reads) Overlapping reads ~180bp fragment

More information

A Roadmap to the De-novo Assembly of the Banana Slug Genome

A Roadmap to the De-novo Assembly of the Banana Slug Genome A Roadmap to the De-novo Assembly of the Banana Slug Genome Stefan Prost 1 1 Department of Integrative Biology, University of California, Berkeley, United States of America April 6th-10th, 2015 Outline

More information

De novo whole genome assembly

De novo whole genome assembly De novo whole genome assembly Lecture 1 Qi Sun Bioinformatics Facility Cornell University Data generation Sequencing Platforms Short reads: Illumina Long reads: PacBio; Oxford Nanopore Contiging/Scaffolding

More information

Genome Assembly Workshop Titles and Abstracts

Genome Assembly Workshop Titles and Abstracts Genome Assembly Workshop Titles and Abstracts TUESDAY, MARCH 15, 2011 08:15 AM Richard Durbin, Wellcome Trust Sanger Institute A generic sequence graph exchange format for assembly and population variation

More information

short read genome assembly Sorin Istrail CSCI1820 Short-read genome assembly algorithms 3/6/2014

short read genome assembly Sorin Istrail CSCI1820 Short-read genome assembly algorithms 3/6/2014 1 short read genome assembly Sorin Istrail CSCI1820 Short-read genome assembly algorithms 3/6/2014 2 Genomathica Assembler Mathematica notebook for genome assembly simulation Assembler can be found at:

More information

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio

More information

A shotgun introduction to sequence assembly (with Velvet) MCB Brem, Eisen and Pachter

A shotgun introduction to sequence assembly (with Velvet) MCB Brem, Eisen and Pachter A shotgun introduction to sequence assembly (with Velvet) MCB 247 - Brem, Eisen and Pachter Hot off the press January 27, 2009 06:00 AM Eastern Time llumina Launches Suite of Next-Generation Sequencing

More information

Gap Filling for a Human MHC Haplotype Sequence

Gap Filling for a Human MHC Haplotype Sequence American Journal of Life Sciences 2016; 4(6): 146-151 http://www.sciencepublishinggroup.com/j/ajls doi: 10.11648/j.ajls.20160406.12 ISSN: 2328-5702 (Print); ISSN: 2328-5737 (Online) Gap Filling for a Human

More information

Next-Generation Sequencing. Technologies

Next-Generation Sequencing. Technologies Next-Generation Next-Generation Sequencing Technologies Sequencing Technologies Nicholas E. Navin, Ph.D. MD Anderson Cancer Center Dept. Genetics Dept. Bioinformatics Introduction to Bioinformatics GS011062

More information

Assembly of Ariolimax dolichophallus using SOAPdenovo2

Assembly of Ariolimax dolichophallus using SOAPdenovo2 Assembly of Ariolimax dolichophallus using SOAPdenovo2 Charles Markello, Thomas Matthew, and Nedda Saremi Image taken from Banana Slug Genome Project, S. Weber SOAPdenovo Assembly Tool Short Oligonucleotide

More information

Title: High-quality genome assembly of channel catfish, Ictalurus punctatus

Title: High-quality genome assembly of channel catfish, Ictalurus punctatus Author s response to reviews Title: High-quality genome assembly of channel catfish, Ictalurus punctatus Authors: Qiong Shi (shiqiong@genomics.cn) Xiaohui Chen (xhchenffri@hotmail.com) Liqiang Zhong (lqzhongffri@hotmail.com)

More information

Targeted Sequencing Using Droplet-Based Microfluidics. Keith Brown Director, Sales

Targeted Sequencing Using Droplet-Based Microfluidics. Keith Brown Director, Sales Targeted Sequencing Using Droplet-Based Microfluidics Keith Brown Director, Sales brownk@raindancetech.com Who we are: is a Provider of Microdroplet-based Solutions The Company s RainStorm TM Technology

More information

Genome Assembly, part II. Tandy Warnow

Genome Assembly, part II. Tandy Warnow Genome Assembly, part II Tandy Warnow How to apply de Bruijn graphs to genome assembly Phillip E C Compeau, Pavel A Pevzner & Glenn Tesler A mathematical concept known as a de Bruijn graph turns the formidable

More information

Genome Sequencing-- Strategies

Genome Sequencing-- Strategies Genome Sequencing-- Strategies Bio 4342 Spring 04 What is a genome? A genome can be defined as the entire DNA content of each nucleated cell in an organism Each organism has one or more chromosomes that

More information

Comprehensive Views of Genetic Diversity with Single Molecule, Real-Time (SMRT) Sequencing

Comprehensive Views of Genetic Diversity with Single Molecule, Real-Time (SMRT) Sequencing Comprehensive Views of Genetic Diversity with Single Molecule, Real-Time (SMRT) Sequencing Alix Kieu Cruse November 2015 For Research Use Only. Not for use in diagnostics procedures. Copyright 2015 by

More information

Next Generation Sequencing. Jeroen Van Houdt - Leuven 13/10/2017

Next Generation Sequencing. Jeroen Van Houdt - Leuven 13/10/2017 Next Generation Sequencing Jeroen Van Houdt - Leuven 13/10/2017 Landmarks in DNA sequencing 1953 Discovery of DNA double helix structure 1977 A Maxam and W Gilbert "DNA seq by chemical degradation" F Sanger"DNA

More information

De novo Genome Assembly

De novo Genome Assembly De novo Genome Assembly A/Prof Torsten Seemann Winter School in Mathematical & Computational Biology - Brisbane, AU - 3 July 2017 Introduction The human genome has 47 pieces MT (or XY) The shortest piece

More information

A near perfect de novo assembly of a eukaryotic genome using sequence reads of greater than 10 kilobases generated by the Pacific Biosciences RS II

A near perfect de novo assembly of a eukaryotic genome using sequence reads of greater than 10 kilobases generated by the Pacific Biosciences RS II A near perfect de novo assembly of a eukaryotic genome using sequence reads of greater than 10 kilobases generated by the Pacific Biosciences RS II W. Richard McCombie Disclosures Introduction to the challenge

More information

NOW GENERATION SEQUENCING. Monday, December 5, 11

NOW GENERATION SEQUENCING. Monday, December 5, 11 NOW GENERATION SEQUENCING 1 SEQUENCING TIMELINE 1953: Structure of DNA 1975: Sanger method for sequencing 1985: Human Genome Sequencing Project begins 1990s: Clinical sequencing begins 1998: NHGRI $1000

More information

Haploid Assembly of Diploid Genomes

Haploid Assembly of Diploid Genomes Haploid Assembly of Diploid Genomes Challenges, Trials, Tribulations 13 October 2011 İnanç Birol Assembly By Short Sequencing IEEE InfoVis 2009 2 3 in Literature ~40 citations on tool comparisons ~20 citations

More information

Workflow of de novo assembly

Workflow of de novo assembly Workflow of de novo assembly Experimental Design Clean sequencing data (trim adapter and low quality sequences) Run assembly software for contiging and scaffolding Evaluation of assembly Several iterations:

More information

Outline. DNA Sequencing. Whole Genome Shotgun Sequencing. Sequencing Coverage. Whole Genome Shotgun Sequencing 3/28/15

Outline. DNA Sequencing. Whole Genome Shotgun Sequencing. Sequencing Coverage. Whole Genome Shotgun Sequencing 3/28/15 Outline Introduction Lectures 22, 23: Sequence Assembly Spring 2015 March 27, 30, 2015 Sequence Assembly Problem Different Solutions: Overlap-Layout-Consensus Assembly Algorithms De Bruijn Graph Based

More information

Announcements. Coffee! Evalua,on. Dr. Yoshiki Sasai, R.I.P.

Announcements. Coffee! Evalua,on. Dr. Yoshiki Sasai, R.I.P. Announcements Coffee! Evalua,on. Dr. Yoshiki Sasai, R.I.P. Sequencing considerations Three basic problems Resequencing, coun,ng, and assembly. A. B. C. 1. Resequencing analysis We know a reference genome,

More information

Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Supplementary Material

Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Supplementary Material Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions Joshua N. Burton 1, Andrew Adey 1, Rupali P. Patwardhan 1, Ruolan Qiu 1, Jacob O. Kitzman 1, Jay Shendure 1 1 Department

More information

BENG 183 Trey Ideker. Genome Assembly and Physical Mapping

BENG 183 Trey Ideker. Genome Assembly and Physical Mapping BENG 183 Trey Ideker Genome Assembly and Physical Mapping Reasons for sequencing Complete genome sequencing!!! Resequencing (Confirmatory) E.g., short regions containing single nucleotide polymorphisms

More information

Third Generation Sequencing

Third Generation Sequencing Third Generation Sequencing By Mohammad Hasan Samiee Aref Medical Genetics Laboratory of Dr. Zeinali History of DNA sequencing 1953 : Discovery of DNA structure by Watson and Crick 1973 : First sequence

More information

Analysis of Structural Variants using 3 rd generation Sequencing

Analysis of Structural Variants using 3 rd generation Sequencing Analysis of Structural Variants using 3 rd generation Sequencing Michael Schatz January 12, 2016 Bioinformatics / PAG XXIV @mike_schatz / #PAGXXIV Analysis of Structural Variants using 3 rd generation

More information

Genomics AGRY Michael Gribskov Hock 331

Genomics AGRY Michael Gribskov Hock 331 Genomics AGRY 60000 Michael Gribskov gribskov@purdue.edu Hock 331 Computing Essentials Resources In this course we will assemble and annotate both genomic and transcriptomic sequence assemblies We will

More information

Sequencing the genomes of Nicotiana sylvestris and Nicotiana tomentosiformis Nicolas Sierro

Sequencing the genomes of Nicotiana sylvestris and Nicotiana tomentosiformis Nicolas Sierro Sequencing the genomes of Nicotiana sylvestris and Nicotiana tomentosiformis Nicolas Sierro Philip Morris International R&D, Philip Morris Products S.A., Neuchatel, Switzerland Introduction Nicotiana sylvestris

More information

Molecular Biology: DNA sequencing

Molecular Biology: DNA sequencing Molecular Biology: DNA sequencing Author: Prof Marinda Oosthuizen Licensed under a Creative Commons Attribution license. SEQUENCING OF LARGE TEMPLATES As we have seen, we can obtain up to 800 nucleotides

More information

Next Generation Sequencing Lecture Saarbrücken, 19. March Sequencing Platforms

Next Generation Sequencing Lecture Saarbrücken, 19. March Sequencing Platforms Next Generation Sequencing Lecture Saarbrücken, 19. March 2012 Sequencing Platforms Contents Introduction Sequencing Workflow Platforms Roche 454 ABI SOLiD Illumina Genome Anlayzer / HiSeq Problems Quality

More information

The Genome Analysis Centre. Building Excellence in Genomics and Computational Bioscience

The Genome Analysis Centre. Building Excellence in Genomics and Computational Bioscience Building Excellence in Genomics and Computational Bioscience Wheat genome sequencing: an update from TGAC Sequencing Technology Development now Plant & Microbial Genomics Group Leader Matthew Clark matt.clark@tgac.ac.uk

More information

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio

More information

RADSeq Data Analysis. Through STACKS on Galaxy. Yvan Le Bras Anthony Bretaudeau Cyril Monjeaud Gildas Le Corguillé

RADSeq Data Analysis. Through STACKS on Galaxy. Yvan Le Bras Anthony Bretaudeau Cyril Monjeaud Gildas Le Corguillé RADSeq Data Analysis Through STACKS on Galaxy Yvan Le Bras Anthony Bretaudeau Cyril Monjeaud Gildas Le Corguillé RAD sequencing: next-generation tools for an old problem INTRODUCTION source: Karim Gharbi

More information

The tomato genome re-seq project
