Haploid Assembly of Diploid Genomes
- Maude Kennedy
Transcription
1 Haploid Assembly of Diploid Genomes Challenges, Trials, Tribulations 13 October 2011 İnanç Birol
2 Assembly By Short Sequences (ABySS) IEEE InfoVis
4 in Literature ~40 citations on tool comparisons; ~20 citations on using ABySS for a biology study. Crowded field: 17 teams in Assemblathon 1. Overlap-Layout-Consensus: ARACHNE, CAP3, Celera assembler, MIRA, Newbler, Phred/Phrap, SGA. De Bruijn Graph: Euler, Velvet, ABySS, SOAPdenovo, ALLPATHS.
5 Assembly Problem read1: TCGATCGATTTTCGGCCTAA; read2: ATTTTCGGCCTAATATTAGG; underlying sequence: GCATCGATCGATTTTCGGCCTAATATTAGGCCGATAATCGACGATC. A partial and unambiguous read-to-read alignment extends the length of sequence information. The first stage of an assembly algorithm is to find such alignments. Assembly algorithms differ in the way they find and use these alignments.
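The overlap on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming exact (error-free) suffix-prefix matching; the function names `longest_overlap` and `merge_reads` are illustrative and not taken from any assembler.

```python
def longest_overlap(a: str, b: str, min_len: int = 1) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge_reads(a: str, b: str, min_len: int = 1) -> str:
    """Append `b` to `a`, collapsing their longest exact overlap."""
    n = longest_overlap(a, b, min_len)
    if n == 0:
        raise ValueError("reads do not overlap")
    return a + b[n:]


# Sequences copied from the slide.
read1 = "TCGATCGATTTTCGGCCTAA"
read2 = "ATTTTCGGCCTAATATTAGG"
source = "GCATCGATCGATTTTCGGCCTAATATTAGGCCGATAATCGACGATC"

overlap = longest_overlap(read1, read2)   # 13 bases: ATTTTCGGCCTAA
merged = merge_reads(read1, read2)        # 27 bases, longer than either read
```

The merged string is a substring of the underlying sequence, which is what "extends the length of sequence information" amounts to. Real reads contain errors and repeats, so production assemblers do not rely on a pairwise scan like this one.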
6 Algorithm SE Assembly: k-mer extension on a de Bruijn graph. PE Assembly: search for unambiguous contig merging along paths (figure labels: d=6±5, d=5±4). Scaffolding: search for unambiguous linkage across distant contigs (figure labels: d=12±5, d=26±9).
7 Software
8 De Bruijn Graph (k = 5) Description of read-to-read overlaps. 2x4 possible extensions of every k-mer. Provides an O(n) algorithm for SE assembly. seq1: GACATTGC; seq2: GACATTAT; k-mers: GACAT, ACATT, CATTG, CATTA, ATTGC, ATTAT.
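The k = 5 example on this slide can be rebuilt as follows. This is a simplified sketch: it records only the successor k-mers actually observed in the input and ignores reverse complements, which a real assembler has to handle. The names `kmers` and `build_de_bruijn` are illustrative.

```python
from collections import defaultdict


def kmers(seq: str, k: int):
    """Yield every k-mer of `seq`, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_de_bruijn(seqs, k: int):
    """Map each k-mer to the set of k-mers that follow it in some sequence."""
    graph = defaultdict(set)
    for seq in seqs:
        previous = None
        for kmer in kmers(seq, k):
            graph[kmer]  # touch the key so k-mers without successors are kept
            if previous is not None:
                graph[previous].add(kmer)
            previous = kmer
    return dict(graph)


# Sequences copied from the slide.
graph = build_de_bruijn(["GACATTGC", "GACATTAT"], k=5)
```

Here `graph["ACATT"]` holds two successors, CATTG and CATTA, which is the branch where the two sequences diverge. Each k-mer is visited once per occurrence, which is where the linear running time comes from.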
9 Adjacency Graph Description of contig overlaps. Built during SE assembly: overlap = k-1 bp. Generalized during PE assembly: arbitrary overlap.
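A rough illustration of how unambiguous k-mer extension yields contigs that overlap by k-1 bp is given below. It is a sketch under simplifying assumptions (forward strand only, no error removal, isolated cycles produce no contig) and is not ABySS's implementation; `build_graph` and `extend_contigs` are hypothetical names.

```python
from collections import defaultdict


def build_graph(seqs, k):
    """Return the k-mer set plus successor and predecessor maps."""
    nodes, succ, pred = set(), defaultdict(set), defaultdict(set)
    for seq in seqs:
        ks = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        nodes.update(ks)
        for a, b in zip(ks, ks[1:]):
            succ[a].add(b)
            pred[b].add(a)
    return nodes, succ, pred


def extend_contigs(seqs, k):
    """Walk each unambiguous path of k-mers and spell it out as a contig."""
    nodes, succ, pred = build_graph(seqs, k)

    def starts_contig(node):
        # A k-mer starts a contig unless it has exactly one predecessor
        # and that predecessor has exactly one successor.
        if len(pred[node]) != 1:
            return True
        (p,) = pred[node]
        return len(succ[p]) != 1

    contigs = []
    for node in sorted(nodes):
        if not starts_contig(node):
            continue
        contig, current = node, node
        while len(succ[current]) == 1:
            (nxt,) = succ[current]
            if len(pred[nxt]) != 1:
                break  # the next k-mer is shared, so extension is ambiguous
            contig += nxt[-1]
            current = nxt
        contigs.append(contig)
    return contigs


k = 5
contigs = extend_contigs(["GACATTGC", "GACATTAT"], k)
```

For the two sequences of the de Bruijn graph slide this gives three contigs: GACATT, CATTGC and CATTAT. The last k-1 = 4 bases of GACATT (CATT) are the first four bases of both of the others, which is the adjacency the slide describes.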
10 Linkage Graph Built through read pairs aligned to different contigs. PE reads come from a tight fragment length distribution: reliable distance estimates. MP reads come from a broader insert length distribution: noisy data. Used in the PE assembly (PE) and scaffolding (PE and MP) stages.
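One simple way to turn linking read pairs into a distance estimate of the form d = 15 ± 10 is sketched below. It assumes a known mean and standard deviation of the fragment length and averages per-pair estimates; this is an illustrative simplification, not the estimator ABySS uses, and `estimate_gap` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import mean


def estimate_gap(pairs, len_a, frag_mean, frag_sd):
    """Estimate the gap between contig A and the following contig B.

    pairs: one (start_on_a, end_on_b) tuple per linking read pair, where
    start_on_a is the leftmost fragment base on A and end_on_b is one past
    the rightmost fragment base on B (both 0-based contig coordinates).
    Returns (gap estimate, standard error of the estimate).
    """
    if not pairs:
        raise ValueError("no read pairs link these contigs")
    # fragment = (bases on A) + gap + (bases on B), solved for the gap
    estimates = [frag_mean - (len_a - start_a) - end_b for start_a, end_b in pairs]
    return mean(estimates), frag_sd / sqrt(len(pairs))


# Made-up numbers for illustration only.
gap, error = estimate_gap(
    pairs=[(900, 80), (920, 110), (880, 70), (910, 90)],
    len_a=1000,
    frag_mean=200,
    frag_sd=20,
)
```

The standard error shrinks with the square root of the number of pairs and grows with the spread of the library, which is why a tight PE library gives more reliable distances than a broad MP library.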
11 Anchor Scrubbing homozygous variations: indels, SNPs
12 Anchor Local directional assembly: scaffold gap filling (bridging), extension (planking)
13 Case Study: Mountain Pine Beetle Genome Assembly
14 Mountain Pine Beetle Genome Assembly statistics (contigs / scaffolds): n 1,128,463 / 1,103,221; n:500bp 33,591 / 11,657; n:n50 4, N50 (bp) 11, ,443; Max (bp) 276,135 / 3,583,207; Reconstruction (Mb)
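The columns of this table can be computed from a list of sequence lengths as sketched below. The meaning assigned to each column (n:500bp as the number of sequences of at least 500 bp, N50 computed over those sequences, n:N50 as the number of sequences needed to reach half the total) is an assumption based on common assembly-statistics conventions; `assembly_stats` is a hypothetical helper.

```python
def assembly_stats(lengths, min_len=500):
    """Summarise contig or scaffold lengths with the usual N50-style metrics."""
    kept = sorted((x for x in lengths if x >= min_len), reverse=True)
    if not kept:
        raise ValueError("no sequence reaches min_len")
    half = sum(kept) / 2
    running = 0
    for rank, length in enumerate(kept, start=1):
        running += length
        if running >= half:
            break  # `length` is now the N50, `rank` the number of sequences used
    return {
        "n": len(lengths),
        "n:min_len": len(kept),
        "n:N50": rank,
        "N50": length,
        "max": kept[0],
        "sum": sum(kept),
    }


# Toy lengths for illustration only.
stats = assembly_stats([100, 600, 800, 1000, 1600])
```

With the toy input, the four sequences of at least 500 bp sum to 4,000 bp, and the two longest (1,600 and 1,000) already cover half of that, so the N50 is 1,000.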
15 Assembly As a Hairball ABySS v1.2.7 PE/MP information disambiguates short contig extensions. [Figure: node connectivity (out vs. in), for contigs 2 kb]
16 Scaffolding
17 Quality Assessment Alignment of 81,047,980 reads (before Anchor / after Anchor): Mapped 65,624,456 (80.97%) / 66,949,341 (82.60%); Paired 43,207,118 (53.31%) / 44,732,320 (55.19%); Single-end 9,536,178 (11.77%) / 8,846,977 (10.92%). Change: + 1,324, ,525, ,201. Gene alignments: 2,180 ESTs; 248 Conserved Genes (Complete / Partial, for Contigs and Scaffolds) 1,
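The before and after counts on this slide are complete, so the change per category can be recomputed from them. The snippet below is plain arithmetic on the slide's numbers; the dictionary names are illustrative.

```python
total_reads = 81_047_980

# Read counts copied from the slide.
before = {"Mapped": 65_624_456, "Paired": 43_207_118, "Single-end": 9_536_178}
after = {"Mapped": 66_949_341, "Paired": 44_732_320, "Single-end": 8_846_977}

# Absolute change in reads per category after running Anchor.
change = {name: after[name] - before[name] for name in before}

# Percentage of all reads per category after Anchor, rounded as on the slide.
percent_after = {name: round(100 * after[name] / total_reads, 2) for name in after}

for name in before:
    print(f"{name}: {change[name]:+,} reads, {percent_after[name]:.2f}% after Anchor")
```

Mapped and paired reads increase while single-end alignments decrease, consistent with more read pairs aligning as proper pairs after Anchor.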
18 Date ABySS Version Data n:500 N50 Max Sum August x GAiix 81,431 1,526 20, e6 November x GAiix 104,958 2,333 55, e6 February x GAiix 157,081 2, , e6 July x GAiix 146,313 3, , e6 November x GAiix +1x GAiix (MP) 100,690 4, , e6 May , ,158 1,908, e6 July x HiSeq +1x HiSeq (MP) 11, ,443 3,583, e6 August , ,847 3,746, e6 18
19 Transcriptome Assembly
20 Transcriptome Sequencing RNA-seq protocol. Brings information on how a genome acts: expression levels; allelic expression; present isoforms; gene fusions; other transcriptional events; post-transcriptional RNA editing. (Rodrigo Goya)
21 Transcriptome Assembly Transcriptome assembly is different from genome assembly: varying coverage levels (varying expression levels); split assembly paths (isoforms/splice variants); small contig sizes (small product sizes). Transcript models
22 What Overlap to Choose?
23 Selection of k
24 What Overlap to Choose? Selection of parameter k depends on read coverage depth. Expression levels vary over 5 orders of magnitude.
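The dependence of k on coverage can be made concrete with the standard relation between base coverage C, read length L and k-mer coverage, C_k = C (L - k + 1) / L. This formula is not on the slide; it is the commonly quoted relation (given, for example, in the Velvet manual), and `kmer_coverage` is an illustrative helper.

```python
def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Expected k-mer coverage for a given base coverage and read length."""
    if not 1 <= k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length


# Hypothetical transcript sequenced at 30x with 100 bp reads.
moderate_k = kmer_coverage(30, 100, 51)   # 15.0
large_k = kmer_coverage(30, 100, 91)      # 3.0
```

A large k leaves a weakly expressed transcript with too little k-mer coverage to assemble, while a small k collapses repeats in highly expressed ones, so no single k suits expression levels that span several orders of magnitude.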
25 Assembly Merging [Figure labels: buried, parent, untouched]
26 Multi-k Assembly We capture a wide range of expression levels. Gray: all transcripts with a read alignment. Blue: at least 80% of a transcript in a single contig. Red: at least 80% of a transcript is reconstructed.
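The two 80% criteria can be expressed as interval arithmetic on contig-to-transcript alignments. The sketch below assumes each contig alignment is given as half-open (start, end) coordinates on the transcript; how the alignments were actually produced and scored for the slide is not stated, and `merged_length` and `classify` are hypothetical names.

```python
def merged_length(intervals):
    """Number of bases covered by a list of half-open (start, end) intervals."""
    covered, reach = 0, None
    for start, end in sorted(intervals):
        if reach is None or start > reach:
            covered += end - start      # a new, disjoint block
            reach = end
        elif end > reach:
            covered += end - reach      # extends the current block
            reach = end
    return covered


def classify(transcript_length, contig_hits, threshold=0.8):
    """contig_hits maps a contig name to its (start, end) hits on the transcript."""
    best_single = max((merged_length(h) for h in contig_hits.values()), default=0)
    combined = merged_length([iv for h in contig_hits.values() for iv in h])
    needed = threshold * transcript_length
    return {
        "single_contig": best_single >= needed,   # the "blue" criterion
        "reconstructed": combined >= needed,      # the "red" criterion
    }


# Toy example: two contigs that together cover 90% of a 1,000 bp transcript.
split = classify(1000, {"c1": [(0, 500)], "c2": [(450, 900)]})
whole = classify(1000, {"c1": [(0, 850)]})
```

In the toy example the first transcript counts as reconstructed but not as captured in a single contig, while the second satisfies both criteria.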
27 Trans-ABySS A versatile tool for: transcript reconstruction; gene identification; InDel and SNV discovery; chimeric transcript discovery (gene fusions, trans-splicing); expression analysis.
28 Transcriptome Assembly Trans-ABySS: de novo assembly based on ABySS. Cufflinks and Scripture: reference-based assembly based on TopHat alignments. [Trapnell et al., 2010; Guttman et al., 2010; Trapnell et al., 2009]
29 Events + chimeric transcripts
30 Performance Compared to mapping-based analysis tools, Trans-ABySS constructs as many transcripts with better sensitivity and specificity. [Trapnell et al., 2010; Guttman et al., 2010; Trapnell et al., 2009]
31 Case Study: Acute Myeloid Leukemia Transcriptome Assembly
32 Fusions (Lucas Swanson, Readman Chiu and Gordon Robertson) Assembled transcriptome contigs span multiple genes. Break point (usually) corresponds to exon boundaries. Break point is supported by spanning reads and by read pairs linking regions. Gene fusions are often drivers in AML and define subtypes (e.g. PML/RARα and M3 subtype).
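Counting spanning reads at a break point reduces to an interval test on the reads aligned back to the fusion contig. The sketch below is illustrative only: the `min_flank` requirement (a minimum number of bases on each side of the break point) is an assumed parameter, not one quoted on the slide.

```python
def count_spanning_reads(alignments, breakpoint, min_flank=5):
    """Count reads that cross `breakpoint` with enough sequence on both sides.

    alignments: half-open (start, end) read positions on the contig.
    breakpoint: contig position of the first base after the junction.
    """
    return sum(
        1
        for start, end in alignments
        if start <= breakpoint - min_flank and end >= breakpoint + min_flank
    )


# Toy 50 bp read alignments around a break point at contig position 100.
reads = [(80, 130), (98, 148), (100, 150), (40, 90), (60, 110)]
support = count_spanning_reads(reads, breakpoint=100)
```

Only the reads at (80, 130) and (60, 110) cover at least five bases on both sides of position 100, so the break point has a support of 2 in this toy example.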
33 AML Gene Fusions 71 events in 65/173 (38%) patients. 30 different gene fusions identified; 94% validation by RT-PCR sequencing. Known AML fusion events (12); known polymorphism (1); novel fusion events (17). [Figure: number of patients and % per candidate fusion event, with MLL fusions and low-frequency (<1%) events marked] (Karen Mungall)
34 Validation of a Novel Fusion POLR2A-FBN3: DNA directed RNA polymerase II polypeptide A (POLR2A, Chr 17p13.1) and Fibrillin 3 (FBN3, Chr 19p13.2). [Figure: POLR2A 5' UTR and Exon 1 joined to FBN3, with Exon 48 and Exon 63 labelled (EGF-like, calcium binding domains); gel with lane M: 1kb plus DNA ladder, lane 1: A00160 (2938), 505bp product] (Andy Mungall)
35 Internal Tandem Duplications Contig alignments result in query gaps and contiguous target blocks. Read support on break point(s); aberrant read pair distances. Known AML ITDs: 29/173 (17%) harbour partial FLT3 exon 14 duplication; 6/173 (3.5%) harbour partial WT1 exon 7 duplication. Nakao et al., Leukemia 1996; Christiansen et al., Leukemia
36 Known ITD in FLT3 A 33 bp duplication in exon 14: CTCCCATttgagatcatattcatattctctgaaatcaacgTTGAGATCATATTCATATTCTCTGAAATCAACGTAGAA (Karen Mungall)
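The 33 bp duplication can be located directly in the contig sequence shown on this slide. The brute-force scan below is a sketch for illustration and is not how Trans-ABySS detects ITDs (the previous slide describes an alignment-based approach); `find_tandem_duplications` and its `min_len` cut-off are hypothetical.

```python
def find_tandem_duplications(seq: str, min_len: int = 20):
    """Return (start, length) for every block immediately followed by its copy."""
    seq = seq.upper()
    hits = []
    for length in range(min_len, len(seq) // 2 + 1):
        for start in range(len(seq) - 2 * length + 1):
            first = seq[start:start + length]
            second = seq[start + length:start + 2 * length]
            if first == second:
                hits.append((start, length))
    return hits


# Contig sequence copied from the slide; lower case marks the duplicated copy.
contig = (
    "CTCCCATttgagatcatattcatattctctgaaatcaacg"
    "TTGAGATCATATTCATATTCTCTGAAATCAACGTAGAA"
)
hits = find_tandem_duplications(contig)
```

The hit (7, 33) corresponds to the lower-case block starting after the first seven bases. Shifted versions of the same duplication can also be reported when the flanking bases happen to match, so a real caller would collapse overlapping hits.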
37 Partial Tandem Duplications Usually coexist with the wild-type. PTD event manifested in a particular contig type: a short contig with 50/50 split alignment. Break point is supported by spanning reads and by read pairs in opposite orientation. Known AML PTD: 10/173 (5.8%) harbour duplication of MLL exons 2-10 (Dorrance et al., Blood 2008). Identified 88 genes with PTDs.
38 Novel PTD in Arid1a Exons 2-4 tandemly repeated in 5 AML libraries (figure labels: WT, CT). Recurrent across tissues and species. Source and observations: AML, 5/173 libraries; LBC, 5/54 libraries; Normal mouse, 3/7 libraries; NCBI EST, colon_ins and placenta_normal.
39 Summary
40 ABySS Team: Shaun Jackman, Tony Raymond, Rod Docking. Beetle Project: Joerg Bohlmann, Chris Keeling, Nancy Liao, Greg Taylor, Simon Chan, Diana Palmquist. Trans-ABySS Team: Readman Chiu, Karen Mungall, Gordon Robertson, Ka Ming Nip, Jenny Qian, Rong She, Lucas Swanson. AML Project: Richard Moore, Yongjun Zhao, Andy Mungall, Aly Karsan. GSC: Sequencing Team, Library Core, Systems Team, Steven Jones, Marco Marra.
41 Final Hairball ABySS v1.2.7 Read pairs and inferred distances allow for scaffolding. Statistics (contigs / scaffolds): n 1,128,463 / 1,103,221; n:500bp 33,591 / 11,657; n:n50 4, N50 (bp) 11, ,443; Max (bp) 276,135 / 3,583,207; Reconstruction (Gb)
42 Biotin Read-Through [Figure: circularized insert]
44 Triage of MP Reads Challenge: A B or B A, which one? Information: distances from contig ends; base mismatches on read ends; inferred contig orientations.
45 Triage of MP Reads [Table: Read 1 and Read 2 evidence patterns, each classified as MP-like or PE-like]