Purpose of sequence assembly
- Tabitha Stanley
- 6 years ago
Transcription
1 Sequence Assembly
2 Purpose of sequence assembly
- Reconstruct long DNA/RNA sequences from short sequence reads
- Genome sequencing
- RNA sequencing for gene discovery (but not for transcript quantification)
- Variant discovery
3 Shear Genomic DNA
[Figure: a chromosome and its sheared fragments]
4 Sequence both ends of each fragment
[Figure: a chromosome with the end sequences of each fragment]
6 Align sequence reads to form contigs
[Figure: reads aligned along a chromosome]
7 Paired ends allow linking of contigs into scaffolds
[Figure: chromosome, contigs, captured gaps, scaffold]
- In the sequence file, gaps are represented with Ns:
  AGTCCCCTGGGAGATACGNNNNNNNNNNNNNNGATGATCAGCCGCATGAGCAG
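A minimal sketch of how a scaffold could be written out, assuming the contig order and the gap sizes have already been estimated from the paired-end distances. The function name `build_scaffold` and the gap length used in the example are illustrative assumptions, not part of the slides.

```python
def build_scaffold(contigs, gap_lengths):
    """Join ordered contigs into one scaffold, filling each gap with Ns.

    contigs     -- contig sequences in scaffold order
    gap_lengths -- estimated gap size between consecutive contigs
                   (hypothetical values; real ones come from paired-end distances)
    """
    if len(gap_lengths) != len(contigs) - 1:
        raise ValueError("need exactly one gap length between each pair of contigs")
    parts = [contigs[0]]
    for gap, contig in zip(gap_lengths, contigs[1:]):
        parts.append("N" * gap)  # unknown bases spanning the captured gap
        parts.append(contig)
    return "".join(parts)


scaffold = build_scaffold(["AGTCCCCTGGGAGATACG", "GATGATCAGCCGCATGAGCAG"], [14])
print(scaffold)
```

The run of Ns records that the two contigs are linked and roughly how far apart they are, even though the bases in between were never sequenced.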
8 Genome Assemblers
9 De Novo Genome Assembly
Two major strategies:
- Overlap Layout Consensus
  - Long reads (> 250 bp)
  - Pairwise comparison of reads to identify overlaps
- Eulerian paths/de Bruijn graphs
  - Short reads (< 250 bp)
  - Cataloging of subsequences (k-mers)
  - Reconstruction of paths through the k-mers
10 Overlap Layout Consensus
- Fragment DNA
- Sequence fragments
- Compare all sequence reads in pairwise fashion
- Calculate number of overlapping bases
- Build a matrix
11 Overlap matrix
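The pairwise comparison and matrix steps can be sketched as follows. This is a simplified illustration that assumes exact suffix-to-prefix matches only; real OLC assemblers tolerate mismatches and use much faster indexing. The function names are hypothetical.

```python
def overlap_length(a, b, min_len=1):
    """Length of the longest suffix of a that exactly matches a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def overlap_matrix(reads):
    """matrix[i][j] = overlap of read i (suffix) with read j (prefix); diagonal is 0."""
    return [
        [0 if i == j else overlap_length(a, b) for j, b in enumerate(reads)]
        for i, a in enumerate(reads)
    ]


reads = ["GCATCGTG", "CATCGTGA", "ATCGTGAT"]
for read, row in zip(reads, overlap_matrix(reads)):
    print(read, row)
```

The matrix is not symmetric: the overlap of read i into read j generally differs from the overlap of read j into read i, which is why every ordered pair is compared.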
12 Determine Layout of Overlaps
- Examine best overlaps
- Check their layout:
  GCATCGTG CATCGTGA 12. ATCGTGAT 20. AAGTGAAA 17. AGTGAAAC
From: Computational Genome Analysis: An Introduction; Deonier et al.
13 Add new overlaps in a greedy fashion
  GACCGCAT ATGCGCAT GCATCGTG CATCGTGA ATCGTGAT GCGCATCG CGCAGCGC
From: Computational Genome Analysis: An Introduction; Deonier et al.
14 Determine consensus sequence
  GACCGCAT ATGCGCAT GCATCGTG CATCGTGA ATCGTGAT GCGCATCG CGCAGCGC
- Consensus: GCGCATCGTGAT
From: Computational Genome Analysis: An Introduction; Deonier et al.
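A minimal sketch of the consensus step, assuming the layout is already known as a non-negative offset for each read on a shared coordinate axis. The offsets below were chosen by hand for four of the reads shown above and are an assumption; the function name `consensus` is illustrative.

```python
from collections import Counter


def consensus(layout):
    """Majority-vote consensus from (offset, read) pairs placed on a common axis."""
    length = max(offset + len(read) for offset, read in layout)
    columns = [Counter() for _ in range(length)]
    for offset, read in layout:
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    # Take the most frequent base per column; uncovered columns become N.
    return "".join(col.most_common(1)[0][0] if col else "N" for col in columns)


# Hand-picked offsets (assumption) for four overlapping reads.
layout = [(0, "GCGCATCG"), (2, "GCATCGTG"), (3, "CATCGTGA"), (4, "ATCGTGAT")]
print(consensus(layout))
```

Because each column is decided by a vote, a sequencing error in one read is outvoted wherever enough other reads cover the same position.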
15 OLC is computationally expensive
- 20 reads require (20 × 20) − 20 = 380 comparisons
- What about 10 million reads?
- Finding the optimal layout is an NP-complete problem
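The arithmetic behind this count can be checked directly. The sketch below assumes, as the slide does, that every ordered pair of distinct reads is compared once; the function name is illustrative.

```python
def pairwise_comparisons(n_reads):
    """Ordered read pairs excluding self-comparisons: n * n - n."""
    return n_reads * n_reads - n_reads


print(pairwise_comparisons(20))          # 380
print(pairwise_comparisons(10_000_000))  # 99999990000000, roughly 10**14
```

The quadratic growth is what makes all-against-all comparison impractical for short-read data sets with millions of reads.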
16 De Bruijn Graphs
- Break sequence reads into a set of overlapping subsequences of length k (k-mers)
  - e.g. AGTTATCCG can be represented by the overlapping 3-mers: AGT, GTT, TTA, TAT, ATC, TCC, CCG
- Count how many times each k-mer occurs
- Place each k-mer at a node in a graph
- Make a path (edge) between nodes if their sequences overlap by k-1 (i.e. AGT ↔ GTT)
- Assign the merged sequence to the edge (AGTT)
- Traverse each edge only once (or more if k-mer abundance implies a repeated sequence)
- Reconstruct genome from edge sequences
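The first steps of this recipe (k-mer extraction, counting, and linking k-mers that overlap by k-1) can be sketched as below. This is an illustration under simplifying assumptions: reads are error-free, only one strand is considered, and edges are added only between k-mers that are adjacent in some read. The names `kmers` and `build_graph` are hypothetical.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """All overlapping substrings of length k, in order of appearance."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_graph(reads, k):
    """Return (k-mer counts, edges) with k-mers as nodes.

    An edge u -> v is added when u and v are consecutive k-mers in a read,
    i.e. they overlap by k-1 bases; it is labelled with the merged (k+1)-mer.
    """
    counts = Counter()
    edges = defaultdict(set)
    for read in reads:
        counts.update(kmers(read, k))
        for merged in kmers(read, k + 1):
            edges[merged[:-1]].add((merged[1:], merged))
    return counts, edges


counts, edges = build_graph(["AGTTATCCG"], k=3)
print(kmers("AGTTATCCG", 3))
print(sorted(edges["AGT"]))
```

A read of length L yields L - k + 1 k-mers, so the nine-base example produces seven 3-mers, and the edge leaving AGT carries the merged sequence AGTT.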
17 De Bruijn Graphs
[Figure, panels a-d: (a) a circular genome sampled by short-read sequencing; (b) the reads ATGGCGT, GGCGTGC, CGTGCAA, TGCAATG, CAATGGC laid out against the genome ATGGCGTGCAATGGCGT; (c) graph with k-mers as vertices, spelling the genome ATGGCGTGCAATG from ATG TGG GGC GCG CGT GTG TGC GCA CAA AAT ATG; (d) graph with the same k-mers as edges]
- Overlap Layout Consensus: vertices are k-mers, edges are pairwise alignments
  - Hamiltonian cycle: visit each vertex once (harder to solve)
- De Bruijn graphs: vertices are (k−1)-mers, edges are k-mers
  - Eulerian cycle: visit each edge once (easier to solve)
From Compeau et al., Nature Biotech, 2011
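Visiting each edge exactly once can be sketched with Hierholzer's algorithm, a standard method for Eulerian walks that is swapped in here for illustration and is not named on the slide. The sketch assumes the graph is connected and actually has an Eulerian path or cycle; the function names are hypothetical.

```python
from collections import defaultdict


def eulerian_walk(kmer_edges):
    """Hierholzer's algorithm on a graph whose edges are k-mers.

    Each k-mer is an edge from its (k-1)-base prefix to its (k-1)-base suffix.
    Assumes the graph is connected and has an Eulerian path or cycle.
    """
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for kmer in kmer_edges:
        graph[kmer[:-1]].append(kmer[1:])
        balance[kmer[:-1]] += 1
        balance[kmer[1:]] -= 1
    # Start at the node with one surplus outgoing edge, if any; else anywhere.
    start = next((n for n, b in balance.items() if b == 1), kmer_edges[0][:-1])
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())         # dead end: node is finished
    return path[::-1]


def spell(path):
    """Spell the sequence of a node path: first node plus last base of each next node."""
    return path[0] + "".join(node[-1] for node in path[1:])


edges = ["ATG", "TGG", "GGC", "GCG", "CGT", "GTG", "TGC", "GCA", "CAA", "AAT"]
print(spell(eulerian_walk(edges)))
```

When a node has several outgoing edges, as TG and GC do here, more than one Eulerian cycle exists, so the walk returns one valid reconstruction rather than a unique answer.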
18 Eulerian cycles with sequencing errors
[Figure, panels a-c: (a) error-free graph with nodes ATG TGG GGC GCG CGT GTG TGC GCA CAA AAT and edges ATGG TGGC GGCG GCGT CGTG GTGC TGCA GCAA CAAT AATG; (b) the same graph with an alternative path between TGG and GTG through the nodes GGA GAG AGT and the edges TGGA GGAG GAGT AGTG; (c) not recoverable from the transcript]
From Compeau et al., Nature Biotech, 2011
19 Eulerian cycle with repeated sequences
[Figure: de Bruijn graph with (k−1)-mer nodes AT TG GG GC CG GT CA AA, in which the edges TGC, GCG, CGT and GTG are each traversed twice]
- Genome as k-mers: ATG TGC GCG CGT GTG TGC GCG CGT GTG TGG GGC GCA CAA AAT ATG
- Genome: ATGCGGTGCGTGGCAATG
From Compeau et al., Nature Biotech, 2011
20 de Bruijn Graph Assembly
[Figure: graph whose nodes are overlapping four-word windows: "It was the best", "was the best of", "the best of times,", "best of times, it", "of times, it was", "times, it was the", "it was the worst", "was the worst of", "the worst of times,", "worst of times, it", "it was the age", "was the age of", "the age of foolishness", "the age of wisdom,", "age of wisdom, it", "of wisdom, it was", "wisdom, it was the"]
- After graph construction, try to simplify the graph as much as possible
21 de Bruijn Graph Assembly
[Figure: the simplified graph, with nodes "It was the best of times, it", "it was the worst of times, it", "of times, it was the", "it was the age of", "the age of foolishness", "the age of wisdom, it was the"]
- After graph construction, try to simplify the graph as much as possible
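One common simplification is to collapse every unbranched chain of nodes into a single path. The sketch below illustrates that idea on word windows, assuming three-word windows and a shortened text for brevity; it is not the specific procedure of any assembler, and the function names are hypothetical.

```python
from collections import defaultdict


def window_edges(words, k):
    """Edges between consecutive k-word windows (the nodes of the word graph)."""
    windows = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return set(zip(windows, windows[1:]))


def non_branching_paths(edges):
    """Maximal paths whose inner nodes have exactly one way in and one way out.

    Isolated cycles made only of such inner nodes are not reported.
    """
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    def is_inner(n):
        return len(pred[n]) == 1 and len(succ[n]) == 1

    paths = []
    for n in nodes:
        if is_inner(n):
            continue  # paths only start at branch points or ends
        for v in succ[n]:
            path = [n, v]
            while is_inner(v):
                v = succ[v][0]
                path.append(v)
            paths.append(path)
    return paths


words = "it was the best of times it was the worst of times".split()
for path in non_branching_paths(window_edges(words, 3)):
    print(" -> ".join(path))
```

The repeated window "it was the" is the only branch point, so the ten windows collapse into two paths that both start there, mirroring how repeats limit how far a graph can be simplified.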
22 Reference-based assembly
- Useful when a high-quality reference genome sequence is available
23 Transcriptome assembly: Inchworm, Chrysalis, Butterfly (the three modules of Trinity)
- Inchworm
  - Assembles transcripts (dominant isoforms)
  - Reports novel portions of alternative transcripts
- Chrysalis
  - Clusters Inchworm contigs into groups representing all isoforms for a given gene
  - Builds de Bruijn graphs for each transcript
- Butterfly
  - Builds transcripts by using actual reads to trace paths through the graphs
24 Transcriptome assembly - Inchworm
[Figure: contigs for paralogous genes (Gene A, Gene B) and for alternative transcripts of Gene C (Transcript 1, Transcript 2)]
25 Transcriptome assembly - Chrysalis
[Figure: contigs]
26 Transcriptome assembly - Butterfly
27 Assembly metrics
- No. of scaffolds/contigs
- Largest scaffold/contig
- N50 scaffold/contig size: 50% of the genome is contained in scaffolds/contigs of size N50 or larger
- L50: minimum number of scaffolds/contigs whose summed length is at least 50% of the genome
- Genome coverage (read coverage): each base is represented by an average of X reads
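N50 and L50 can be computed in a single pass over the contig lengths sorted from largest to smallest. The sketch below is illustrative: by default it measures against the total assembly length, and the optional `genome_size` parameter (an assumption, for when an independent genome size estimate exists) measures against the genome instead, as the definitions above are phrased.

```python
def n50_l50(lengths, genome_size=None):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    genome_size -- optional target size; defaults to the summed contig length.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running * 2 >= total:  # reached at least 50% of the target
            return length, count
    raise ValueError("contigs cover less than half of the target size")


print(n50_l50([80, 70, 50, 40, 30, 20]))  # (70, 2)
```

In the example the contigs sum to 290, so the two largest (80 + 70 = 150) already pass the halfway mark of 145, giving N50 = 70 and L50 = 2.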
28 This Morning's Exercises
- Assemble a bacterial genome sequence
  - Velvet
    - Generate an interleaved dataset
    - Choose a suitable k-mer range
    - Run assemblies with different k-mer lengths
    - Examine assembly metrics
  - Discovar de novo
    - Generate assembly
    - Compare assembly metrics with Velvet
- Supplemental exercises
  - Run assemblies using quality-trimmed input data
  - Refine Velvet k-mer range for optimal performance
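As an illustration of what an interleaved dataset is, the sketch below merges forward and reverse reads so that each pair sits in consecutive records. It assumes plain four-line FASTQ records in matching order in both inputs; the function names are hypothetical, and the exercise itself may use a different tool for this step.

```python
from itertools import islice


def fastq_records(lines):
    """Yield FASTQ records as lists of four lines (header, bases, '+', qualities)."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(forward_lines, reverse_lines):
    """Alternate forward and reverse records: pair 1 fwd, pair 1 rev, pair 2 fwd, ...

    Note: zip stops at the shorter input, so unequal files are silently truncated.
    """
    for fwd, rev in zip(fastq_records(forward_lines), fastq_records(reverse_lines)):
        yield from fwd
        yield from rev


forward = ["@r1/1", "ACGT", "+", "IIII", "@r2/1", "GGCC", "+", "IIII"]
reverse = ["@r1/2", "TTAA", "+", "IIII", "@r2/2", "CCGG", "+", "IIII"]
print("\n".join(interleave(forward, reverse)))
```

The same functions work on open file handles, since they only need an iterable of lines; trailing newlines should then be kept as read and written back unchanged.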