Lawrence Berkeley National Laboratory
1 Lawrence Berkeley National Laboratory
Title: Glomus intraradices: Initial Whole-Genome Shotgun Sequencing and Assembly Results
Author: Shapiro, Harris
escholarship.org, powered by the California Digital Library, University of California
2 Glomus intraradices: Initial Whole-Genome Shotgun Sequencing and Assembly Results
Harris Shapiro, DOE Joint Genome Institute
This work was performed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, Lawrence Berkeley National Laboratory under Contract No. DE-AC02-05CH11231, and Los Alamos National Laboratory under Contract No. W-7405-ENG-36.
3 Overview
- Review of the Whole-Genome Shotgun (WGS) assembly procedure
- WGS library Quality Control (QC)
- G. intraradices test assembly
- Polymorphism estimates
- Data set anomalies
- Where to go from here
COST 8.38 Meeting, Dijon, 3/6/2005
4 Outline of the Assembly Process: JAZZ, the JGI In-House Assembler
- Spores
- Extract DNA
- Shred & insert DNA into clone libraries (3 kb, 8 kb, 40 kb)
- Forward & reverse sequencing
- End sequences (reads) with approximately known separations
- Pair-aligned reads
- Convert to graph: reads = nodes, edges = pair alignments
- Traverse graph to generate contigs and scaffolds
- Read pairing information helps span sections without good alignments
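The graph step on this slide can be illustrated with a toy example. The following is a minimal sketch, not JAZZ itself: it assumes exact suffix-prefix overlaps stand in for pair alignments, ignores read pairing and reverse complements, and uses a hypothetical greedy traversal. All function names are illustrative.

```python
from collections import defaultdict


def suffix_prefix_overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if shorter than min_len)."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def build_overlap_graph(reads, min_len=4):
    """Reads are nodes (by index); a directed edge i -> j stores the overlap length."""
    graph = defaultdict(dict)
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                n = suffix_prefix_overlap(a, b, min_len)
                if n:
                    graph[i][j] = n
    return graph


def greedy_contig(reads, graph, start):
    """Walk from start, always following the largest overlap to an unvisited read."""
    contig = reads[start]
    seen = {start}
    node = start
    while True:
        candidates = {j: n for j, n in graph[node].items() if j not in seen}
        if not candidates:
            return contig
        nxt = max(candidates, key=candidates.get)
        contig += reads[nxt][candidates[nxt]:]  # append only the non-overlapping tail
        seen.add(nxt)
        node = nxt


reads = ["ATGCGTAC", "GTACCTTA", "CTTAGGCA"]
graph = build_overlap_graph(reads)
print(greedy_contig(reads, graph, 0))
```

A real assembler additionally uses the approximately known separation of paired end reads to order and orient contigs into scaffolds, which this sketch leaves out.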
5 So What Could Go Wrong?
- Repeat elements: [Diagram: sequence A R B R C, with two copies of repeat R, is shredded; the result of assembly is uncertain]
- Polymorphism: [Diagram: Haplotype 1 and Haplotype 2 are shredded; the result of assembly is uncertain]
- Stricter assembly parameters can distinguish (some) repeats, but make it more likely that haplotypes will assemble separately.
6 WGS Library Quality Control
- Identification of insertless clones
- Trimming of vector and low quality sequence
- Examination of the GC content distribution
- BLAST analysis
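The GC content check can be sketched in a few lines. This is an illustrative example only: the bin count, the handling of ambiguous bases (ignored), and the function names are assumptions, not the QC pipeline used for these libraries.

```python
def gc_counts(seq):
    """Return (number of G/C bases, number of called A/C/G/T bases) for one read."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc, gc + at


def gc_content(seq):
    """GC fraction of a read; reads with no called bases are reported as 0.0."""
    gc, called = gc_counts(seq)
    return gc / called if called else 0.0


def gc_histogram(reads, n_bins=20):
    """Fraction of reads falling into each of n_bins equal-width GC bins."""
    counts = [0] * n_bins
    used = 0
    for read in reads:
        gc, called = gc_counts(read)
        if called == 0:
            continue  # skip reads made up entirely of ambiguous bases
        # Integer arithmetic avoids floating-point trouble at bin edges.
        idx = min(gc * n_bins // called, n_bins - 1)
        counts[idx] += 1
        used += 1
    return [c / used for c in counts] if used else counts


print(gc_histogram(["AAAA", "ATGC", "GGCC", "ATAT"], n_bins=4))
```

Plotting one such histogram per library on shared axes gives the kind of comparison shown on the next two slides, where a contaminated library would show extra mass at high GC.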
7 GC Content Comparison: Initial WGS Libraries
[Figure: G. intraradices, Old WGS Libraries, GC Content Comparison. Series: AHZO (3 kb, 7,398 reads), AHZP (8 kb, 13,759 reads), AHZS (40 kb, 2,907 reads). Axes: GC Content, Fraction of Reads.]
8 GC Content Comparison: Second WGS Libraries
[Figure: G. intraradices, New WGS Libraries, GC Content Comparison. Series: BCTU (3 kb, 7,228 reads), BCTW (8 kb, 7,548 reads), ATSX (40 kb, 1,479 reads). Axes: GC Content, Fraction of Reads.]
9 Test Assembly Specification
- All 6 WGS libraries were included: ~28.7 MB of trimmed sequence, of which perhaps 4-7 MB consisted of various contaminants
- Genome size estimate: ~15 MB (Hijri & Sanders, 2004)
- Estimated depth: 1-1.5x; set to 1.0 for the assembly
- Repeats: maximum copy number of 5 times the estimated depth (5) before they can't be used to seed an alignment
- Polymorphism/repeat divergence: used the default value of ~3% different for a neutral alignment
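The repeat cap on this slide can be pictured as a filter on candidate alignment seeds. The sketch below is an assumption about how such a cap might work, not a description of JAZZ internals: it counts exact k-mers across reads and drops any k-mer seen more than 5 times the estimated depth. The k-mer length, the function names, and the choice to ignore reverse complements are all illustrative.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurrence across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def usable_seeds(reads, k, depth=1.0, copy_multiplier=5):
    """K-mers whose count does not exceed copy_multiplier * depth."""
    limit = copy_multiplier * depth  # 5 * 1.0 = 5 with the slide's settings
    counts = kmer_counts(reads, k)
    return {kmer for kmer, n in counts.items() if n <= limit}


# Six copies of one read push its k-mers over the limit of 5; the unique read survives.
reads = ["ACGTAC"] * 6 + ["TTTGGG"]
print(sorted(usable_seeds(reads, k=4)))
```

With depth set to 1.0, anything seen more than 5 times is treated as a likely repeat and excluded from seeding, which is why an underestimated depth would also discard genuinely unique sequence.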
10 Summary of Test Assembly Results
- 615 scaffolds, containing 704 KB of scaffold sequence (582 KB contig sequence = 17.3% gap).
- After removing short and redundant scaffolds, 165 were left, with 318 KB of scaffold sequence (197 KB contig sequence = 38.1% gap).
- Filtered scaffold set was easily partitioned:
  - Prokaryotic contaminant: GC content > 0.60
  - Cloning artifacts: GC content between 0.50 and 0.60 (with one exception)
  - Potential G. intraradices scaffolds: GC content < 0.50 (with one exception)
- EST alignments supported the identification of the low-GC scaffolds.
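The gap percentages and the GC partition above follow from simple arithmetic, sketched here. The gap fraction is taken as (scaffold sequence - contig sequence) / scaffold sequence, which reproduces the figures on the slide. How scaffolds at exactly 0.50 or 0.60 were treated is not stated, so the boundary handling below is an assumption, and the function names are illustrative.

```python
def gap_fraction(scaffold_kb, contig_kb):
    """Fraction of scaffold sequence that is gap rather than contig sequence."""
    return (scaffold_kb - contig_kb) / scaffold_kb


def classify_scaffold(gc):
    """Partition a scaffold by GC content using the thresholds from the slide."""
    if gc > 0.60:
        return "prokaryotic contaminant"
    if gc >= 0.50:  # assumption: both boundaries fall into the middle class
        return "cloning artifact"
    return "potential G. intraradices"


print(f"{gap_fraction(704, 582):.1%}")  # all 615 scaffolds
print(f"{gap_fraction(318, 197):.1%}")  # 165 filtered scaffolds
print(classify_scaffold(0.30), "|", classify_scaffold(0.55), "|", classify_scaffold(0.68))
```

The slide notes one exception in each of the two lower classes, so a pure threshold rule like this would misplace a small number of scaffolds; the BLAST and EST evidence resolves those cases.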
11 Test Assembly Statistics
[Table: scaffold sets (Potential G. intraradices, Cloning Artifact, Prokaryotic Contamination) by Number of Scaffolds, Scaffold Sequence Total (KB), and Contig Sequence Total (KB).]
12 Test Assembly: BLAST Analysis
[Figure: G. intraradices, 4/25/2005 Assembly, Filtered Scaffolds, Revised BLAST Results. Series: Filtered Scaffolds; Prokaryotic-Only, Cloning Artifacts; Prokaryotic-Only, Other. Axes: GC Content, Mean Scaffold Sequence Depth (Excluding Gaps).]
13 Test Assembly: EST Analysis
[Figure: G. intraradices, 4/25/2005 Filtered Scaffolds, Alignments to cDNA Sequences. Series: Filtered Scaffolds; cDNA Hits. Axes: GC Content, Mean Scaffold Sequence Depth (Excluding Gaps).]
14 Polymorphism Estimation Procedure
- Generation of reference sequence: draft & finished subcloned fosmids
- Alignment of WGS & EST reads to the reference sequence
- Identification of good alignments
  - Reads with unique alignments to the reference fosmids
  - Greater than 97% of the read covered by the alignment
- Polymorphism calculations
  - % ID of the good alignments (number of matches / alignment footprint)
  - Mismatch % of the good alignments (number of mismatches / read length)
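The filtering and the two per-read measures above can be sketched as follows. The `Alignment` record and its field names are hypothetical, introduced only for illustration; the formulas are the ones stated on the slide. Treating polymorphism under the % ID method as 100% minus % ID is an assumption about how the reported rates were derived.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Alignment:
    read_id: str
    read_length: int         # total length of the read
    aligned_read_bases: int  # bases of the read covered by the alignment
    footprint: int           # alignment length, including any gaps
    matches: int
    mismatches: int


def good_alignments(alignments, min_coverage=0.97):
    """Keep reads with exactly one alignment that covers more than 97% of the read."""
    hits = Counter(a.read_id for a in alignments)
    return [
        a for a in alignments
        if hits[a.read_id] == 1
        and a.aligned_read_bases / a.read_length > min_coverage
    ]


def percent_id(a):
    """% ID method: number of matches / alignment footprint."""
    return 100.0 * a.matches / a.footprint


def mismatch_percent(a):
    """Mismatch method: number of mismatches / read length."""
    return 100.0 * a.mismatches / a.read_length


alignments = [
    Alignment("r1", 700, 700, 700, 693, 7),  # unique, fully covered
    Alignment("r2", 650, 400, 400, 398, 2),  # unique, but only ~62% covered
    Alignment("r3", 600, 600, 600, 590, 10), # r3 aligns twice, so it is not unique
    Alignment("r3", 600, 600, 600, 580, 20),
]
for a in good_alignments(alignments):
    print(a.read_id, 100.0 - percent_id(a), mismatch_percent(a))
```

The two measures differ because the % ID method also penalizes gaps in the footprint while the mismatch method counts only substitutions, which is one reason the two columns of results need not agree.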
15 Subcloned Fosmid Creation
- First batch: 15 randomly selected fosmids from the AHZS library. Consisted almost entirely of cloning artifacts.
- Second batch: 10 targeted fosmids from the AHZS library. Due to cloning artifacts and sequencing problems, this yielded 5 finished fosmids, containing about 181 KB of reference sequence.
- Third batch: 10 targeted fosmids from the ATSX library. All 10 yielded draft sequences.
16 Sample Finished Fosmid WGS Alignments
17 Sample Draft Fosmid WGS Alignments
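18 Summary of the WGS Polymorphism Results
Polymorphism estimate by data set (columns: Data Set, Number of Reads, % ID Method, Mismatch Method)
- AHZO: % ID Method: % +/- 4.9%; Mismatch Method: 3.5% +/- 2.9%
- AHZP: % ID Method: % +/- 4.2%; Mismatch Method: 2.6% +/- 2.7%
- BCTU: % ID Method: % +/- 6.1%; Mismatch Method: 2.3% +/- 2.5%
- Overall: % ID Method: % +/- 5.1%; Mismatch Method: N/A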
19 Details of the WGS Polymorphism Results
- With the % ID Method:
  - All three WGS libraries had their main polymorphism peak at 0%-1%.
  - All three WGS libraries had a secondary peak at 7%-8%.
  - AHZO may have had an additional secondary peak at 5%.
- With the Mismatch Method:
  - All three WGS libraries had their main polymorphism peak at 0%-1%.
  - AHZO and BCTU still had secondary peaks at 7%-8%.
  - AHZO still had a secondary peak at 5%.
  - AHZP no longer had any secondary peaks, instead having its mismatch rate decline smoothly to a background value.
20 WGS Data Set Issues
- Only produced plausible alignments to 2 of the 10 subcloned fosmids in the latest batch
- Very non-uniform coverage of some of the draft and finished fosmids
- Potential causes:
  - Mis-assembled or finished fosmids
  - Chimeric fosmid clones
  - Undetected non-Glomus contamination
  - Library bias
  - Statistical artifact
21 Sample Finished Fosmid EST Alignments
22 Sample Draft Fosmid EST Alignments
23 EST Polymorphism Results
- Very few EST sequences aligned to the subcloned fosmid sequences.
  - Only 1 of the 5 finished fosmids had any alignments
  - Only 2 of the 10 draft fosmids had plausible alignments
- The longest alignments all seemed to be missing sections of the ESTs, on the order of hundreds of bases.
- Due to the small number of alignments and the anomalies associated with them, it was not possible to estimate the polymorphism rate using this data set.
24 EST Data Set Issues
- Only produced plausible alignments to 2 of the 10 subcloned fosmids in the latest batch
- Very small number of alignments overall
- Alignments that were present did not cover large portions of the ESTs
- Potential causes:
  - Mis-assembled or finished fosmids
  - Chimeric fosmid clones
  - Undetected non-Glomus contamination
  - Library bias
  - Spurious alignments
  - The haplotypes actually differ greatly from each other
  - Statistical artifact
25 Where Do We Go From Here?
- Investigation of the WGS and EST subcloned fosmid anomalies
- Creation of a third round of WGS libraries
- Further attempts at WGS assembly
  - Requires the production of a good set of WGS libraries
  - Analysis of the possible repeats, which is complicated by high polymorphism
  - Initially strict parameters to force the haplotypes to assemble separately
  - Distribution of polymorphic elements might require a relaxed-parameter assembly, to try to force the haplotypes to assemble together
  - Pre-screening of repeat elements may allow some sidestepping of the repeat/polymorphism trade-off
26 Acknowledgements
- Joint Genome Institute
  - Library Creation: Eileen Dalin, Chris Detter, Paul Richardson
  - Production Sequencing: Tijana Glavina, Miranda Harmon-Smith, Susan Lucas
  - Genome Assembly: Nik Putnam, Jarrod Chapman, Isaac Ho, Dan Rokhsar
  - ESTs: Erika Lindquist, Peter Brokstein
  - Subcloned Fosmids: Jane Grimwood
- Peter Lammers & his group
- Ian Sanders & his group
This work was performed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, Lawrence Berkeley National Laboratory under Contract No. DE-AC03-76SF00098, and Los Alamos National Laboratory under Contract No. W-7405-ENG-36.
27 AHZO: ATYW (Finished Fosmid) Alignments
28 AHZO: ATYY (Finished Fosmid) Alignments
29 AHZO: ATYZ (Finished Fosmid) Alignments
30 AHZO: ATZC (Finished Fosmid) Alignments
More informationGenome sequencing in Senecio squalidus
Genome sequencing in Senecio squalidus Outline of project A new NERC funded grant, the genomic basis of adaptation and species divergence in Senecio in collaboration with Richard Abbott and Dmitry Filatov
More informationConsensus Ensemble Approaches Improve De Novo Transcriptome Assemblies
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Computer Science and Engineering: Theses, Dissertations, and Student Research Computer Science and Engineering, Department
More informationGENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.
!! www.clutchprep.com CONCEPT: OVERVIEW OF GENOMICS Genomics is the study of genomes in their entirety Bioinformatics is the analysis of the information content of genomes - Genes, regulatory sequences,
More informationDe Novo Repeats Construction: Methods and Applications
University of Connecticut DigitalCommons@UConn Doctoral Dissertations University of Connecticut Graduate School 5-5-2017 De Novo Repeats Construction: Methods and Applications Chong Chu chong.chu@engr.uconn.edu
More informationDNA sequencing. Course Info
DNA sequencing EECS 458 CWRU Fall 2004 Readings: Pevzner Ch1-4 Adams, Fields & Venter (ISBN:0127170103) Serafim Batzoglou s slides Course Info Instructor: Jing Li 509 Olin Bldg Phone: X0356 Email: jingli@eecs.cwru.edu
More informationChapter 2: Access to Information
Chapter 2: Access to Information Outline Introduction to biological databases Centralized databases store DNA sequences Contents of DNA, RNA, and protein databases Central bioinformatics resources: NCBI
More informationFinishing Drosophila Grimshawi Fosmid Clone DGA19A15. Matthew Kwong Bio4342 Professor Elgin February 23, 2010
Finishing Drosophila Grimshawi Fosmid Clone DGA19A15 Matthew Kwong Bio4342 Professor Elgin February 23, 2010 1 Abstract The primary aim of the Bio4342 course, Research Explorations and Genomics, is to
More informationNext Genera*on Sequencing II: Personal Genomics. Jim Noonan Department of Gene*cs
Next Genera*on Sequencing II: Personal Genomics Jim Noonan Department of Gene*cs Personal genome sequencing Iden*fying the gene*c basis of phenotypic diversity among humans Gene*c risk factors for disease
More informationNext-generation sequencing and quality control: An introduction 2016
Next-generation sequencing and quality control: An introduction 2016 s.schmeier@massey.ac.nz http://sschmeier.com/bioinf-workshop/ Overview Typical workflow of a genomics experiment Genome versus transcriptome
More informationDraft 3 Annotation of DGA06H06, Contig 1 Jeannette Wong Bio4342W 27 April 2009
Page 1 Draft 3 Annotation of DGA06H06, Contig 1 Jeannette Wong Bio4342W 27 April 2009 Page 2 Introduction: Annotation is the process of analyzing the genomic sequence of an organism. Besides identifying
More informationTheory and Application of Multiple Sequence Alignments
Theory and Application of Multiple Sequence Alignments a.k.a What is a Multiple Sequence Alignment, How to Make One, and What to Do With It Brett Pickett, PhD History Structure of DNA discovered (1953)
More informationDe novo assembly of human genomes with massively parallel short read sequencing. Mikk Eelmets Journal Club
De novo assembly of human genomes with massively parallel short read sequencing Mikk Eelmets Journal Club 06.04.2010 Problem DNA sequencing technologies: Sanger sequencing (500-1000 bp) Next-generation
More informationTranscriptomics analysis with RNA seq: an overview Frederik Coppens
Transcriptomics analysis with RNA seq: an overview Frederik Coppens Platforms Applications Analysis Quantification RNA content Platforms Platforms Short (few hundred bases) Long reads (multiple kilobases)
More informationThe Basics of Understanding Whole Genome Next Generation Sequence Data
The Basics of Understanding Whole Genome Next Generation Sequence Data Heather Carleton-Romer, MPH, Ph.D. ASM-CDC Infectious Disease and Public Health Microbiology Postdoctoral Fellow PulseNet USA Next
More informationUsing the Potato Genome Sequence! Robin Buell! Michigan State University! Department of Plant Biology! August 15, 2010!
Using the Potato Genome Sequence! Robin Buell! Michigan State University! Department of Plant Biology! August 15, 2010! buell@msu.edu! 1 Whole Genome Shotgun Sequencing 2 New Technologies Revolutionize
More informationSupplementary Table 1. Summary of whole genome shotgun sequence used for genome assembly
Supplementary Tables Supplementary Table 1. Summary of whole genome shotgun sequence used for genome assembly Library Read length Raw data Filtered data insert size (bp) * Total Sequence depth Total Sequence
More information