Title: High-quality genome assembly of channel catfish, Ictalurus punctatus
Author's response to reviews

Authors:
Qiong Shi (shiqiong@genomics.cn)
Xiaohui Chen (xhchenffri@hotmail.com)
Liqiang Zhong (lqzhongffri@hotmail.com)
Chao Bian (bianchao@genomics.cn)
Pao Xu (xup@ffrc.cn)
Ying Qiu (qiuying@genomics.cn)
Junmin Xu (xujunmin@genomics.cn)
Shiyong Zhang (shiyongzhang@hotmail.com)
Yu Huang (huangyu@genomics.cn)
Jia Li (lijia1@genomics.cn)
Minghua Wang (w @sina.com)
Qin Qin (qinqinapple1980@163.com)
Chao Peng (pengchao@genomics.cn)
Alex Wong (alexwong@genomics.cn)
Zhifei Zhu (HDzhuzhifei@126.com)
Min Wang (wangmin2@genomics.cn)
Xinxin You (youxinxin@genomics.cn)
Ruobo Gu (guruobo@genomics.cn)
Xiaohua Zhu (xhz824@sina.com)
Wenji Bian (js6060@sina.com)
Version: 1
Date: 28 Jun 2016

Author's response to reviews:

June 28, 2016

RE: Your submission to GigaScience - GIGA-D

Dear Scott,

Thanks for your kind help. Our point-by-point responses (in blue) to the reviewers' opinions are provided below for your consideration. Changes in our revised manuscript are highlighted in yellow.

By the way, Prof. Liu's genome paper was published after we submitted our manuscript. They sequenced channel catfish with a much larger dataset and similar quality, but their assembly is shorter than ours (783 Mb vs 845 Mb). Please see more details in Table 2. The differences may be due to our better sample or to the existence of different subspecies. On the other hand, our good experience with analysis tools may have contributed to the longer assembly. In fact, in the past few years, my group has analyzed whole genomes of over 20 aquatic animals. Our first genome paper was published in Nature Communications (2014, 5:5594) two years ago. In 2016, we have also been involved in two more important manuscripts, which will be accepted for publication in Nature (seahorse genome; as one of the corresponding authors) and Nature Genetics.

Best regards,
Qiong Shi, PhD, Professor
BGI
Shenzhen
China

Reviewer #1: Overall, the data presented have resulted in a relatively high-quality genome assembly, which is a potentially useful additional genomic resource for the species. The NGS and bioinformatic methods and data presented are generally appropriate.
However, the manuscript needs to be comprehensively edited to correct the language and make it more concise before it can be considered for publication.

Answer: Thanks for your suggestion. We have carefully polished the manuscript (changes are highlighted in yellow), with a focus on the issues mentioned by the reviewers.

The major issue that needs to be addressed in the manuscript is the justification of the study and the identification of the research gap it attempts to fill. The statement used in the Background section of the Abstract to justify the study, "genetic and molecular resources for the optimized breeding and biological characterization of this fish species are still limited", is not supported by even a cursory search of the literature, as demonstrated by the 3 papers below.

Answer: Sorry for the inappropriate statement. This sentence was removed in the revised manuscript.

The existence of this previous work needs to be acknowledged, and it is recommended that the authors add an initial paragraph, placed immediately following the "Data description" and the "Library construction..." sections, citing the papers indicated below and stating how their data/study adds new and useful genomic resources for this species.

"A large amount of genome resources have been developed for catfish including genetic linkage maps, physical maps, BAC end sequences (BES), integrated linkage and physical maps using BES-derived markers, physical map contig-specific sequences, and draft genome sequences"
from: Whole genome comparative analysis of channel catfish (Ictalurus punctatus) with four model fish species. Yanliang Jiang, Xiaoyu Gao, Shikai Liu, Yu Zhang, Hong Liu, Fanyue Sun, Lisui Bao, Geoff Waldbieser and Zhanjiang Liu. BMC Genomics :780 DOI: /
"A total of Gb of sequences were generated from each strain" [5 strains in total]
from: Identification and Analysis of Genome-Wide SNPs Provide Insight into Signatures of Selection and Domestication in Channel Catfish (Ictalurus punctatus). Sun L1, Liu S1, Wang R1, Jiang Y1, Zhang Y1, Zhang J1, Bao L1, Kaltenboeck L1, Dunham R1, Waldbieser G2, Liu Z1. PLoS One Oct 14;9(10):e doi: /journal.pone

"we constructed a high-density genetic map for channel catfish. This map possesses the highest marker density among all the genetic linkage maps constructed for any aquaculture species"
from: Construction of a high-density, high-resolution genetic map and its integration with BAC-based physical map in channel catfish. Yun Li,1, Shikai Liu,1, Zhenkui Qin,1 Geoff Waldbieser,2 Ruijia Wang,1 Luyang Sun,1 Lisui Bao,1 Roy G. Danzmann,3 Rex Dunham,1 and Zhanjiang Liu1,* Luyang Sun,1 Shikai Liu,1 Ruijia Wang,1 Yanliang Jiang,1 Yu Zhang,1 Jiaren Zhang,1 Lisui Bao,1 Ludmilla Kaltenboeck,1 Rex Dunham,1 Geoff Waldbieser,2 and Zhanjiang Liu1,* PLoS One. 2014; 9(10): e DNA Res Feb; 22(1):

Answer: Thanks for your reference list. In the revised manuscript, we added several sentences citing these reports of channel catfish genome resources in the Background section (lines 45-49).

Other points requiring attention:
L indicate the length and number of the reads.
L81 - do not need to define "Ns"
L89 and/or L99 - refer to other estimates of genome size for Ictalurus punctatus - see above references
L105 & L144 - no indication in the Findings or "methods" section that RNA was sequenced - if so, then "transcriptome" needs to be added to the title and the "methods", or a citation is needed to indicate the source of these data
L either delete the last sentence or add 2-4 extra columns to Table 1 giving genome assembly and annotation statistics for other recently sequenced fish genomes, so that the reader can determine whether their channel catfish genome is "indeed of high-level accuracy".
L179 - add depth coverage and longest scaffold data to Table 1

Answer: Thanks for all your good advice. We have modified each part according to your suggestions. The sentence stating "indeed of high-level accuracy" was removed in the revised manuscript.

Reviewer #2: This paper describes an annotated genome assembly of the commercially important channel catfish, obtained using Illumina sequencing technology. The methods used for sequencing, assembling, annotating and validating are appropriate. Overall, the assembly statistics (e.g. N50) are good for a short-read assembly.

The total assembled size (845 Mbp) almost exactly matches the genome size predicted from a k-mer distribution (839 Mbp); however, both are lower than expected (~1 Gbp, and sources listed below). Since the assembly is based on short reads only, I would expect the assembly to contain some collapsed repeat contigs. The assembly size then fits the expected length of 1 Gbp, but not the k-mer estimate. The latter is based on a single short k-mer length (17), and may therefore be less accurate; I would suggest repeating this analysis with additional (longer) k-mers.

Answer: Thanks for your suggestion. We have reanalyzed our data using longer k-mers (19 and 21), as you recommended, and obtained similar estimates of the genome size (866 Mb and 858 Mb, respectively), which are consistent with the result in our manuscript. Please see more details in the attached figure.
The 1-Gb genome size predicted by flow cytometry may be appropriate for the American channel catfish; therefore, we revised the corresponding description in our new manuscript.

The annotation yields protein-coding genes, which is also slightly low compared to zebrafish and other sources. However, the gene predictions have been adequately checked for completeness.
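The k-mer based genome size estimates discussed above (k = 17, 19 and 21) follow a common idea that can be sketched in a few lines: divide the total number of k-mer occurrences by the depth of the main peak of the k-mer depth histogram. The snippet below is only a minimal illustration under stated assumptions, not the authors' actual pipeline; the function name `estimate_genome_size`, the `min_depth` error cutoff and the toy histogram are all hypothetical.

```python
from typing import Dict


def estimate_genome_size(histogram: Dict[int, int], min_depth: int = 3) -> float:
    """Estimate genome size (bp) from a k-mer depth histogram.

    histogram maps a k-mer depth to the number of distinct k-mers observed
    at that depth. Depths below min_depth are treated as sequencing errors
    and ignored (an assumption made for this sketch).
    """
    usable = {depth: count for depth, count in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Main peak: the depth at which the largest number of distinct k-mers occurs.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences: each distinct k-mer contributes its depth.
    total_kmers = sum(depth * count for depth, count in usable.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical numbers): low-depth error k-mers, a main peak
# at depth 30, and a small repeat peak at depth 60.
toy_histogram = {1: 5_000_000, 2: 1_000_000, 29: 200, 30: 1000, 31: 200, 60: 50}
print(estimate_genome_size(toy_histogram))
```

Repeating the calculation with histograms built at several k values, as Reviewer #2 suggests, gives a quick check of how sensitive the estimate is to the choice of k.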
Finally, this work lays claim to the 'first high-quality channel catfish genome'. However, alternative (competing?) channel catfish genome sequencing efforts have been in progress since at least 2010, resulting in several publications, e.g.:
* Lu (2011) The catfish genome database cBARBEL: an informatic platform for genome biology of ictalurid catfish. NAR 39, D
* Jiang (2013) Whole genome comparative analysis of channel catfish (Ictalurus punctatus) with four model fish species. BMC Genomics 14, 780
* Liu (2016) The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts. Nat Commun 7,
None of this work is cited in the current paper. The authors should acknowledge such prior work, and put their efforts in context. For example, is the specimen/strain used for sequencing here of special interest for aquaculture in China? At the very least, the authors should refrain from any claims of being first; a claim of higher quality needs to be substantiated.

Answer: Thanks for your advice. We cited these references in the Background section (lines 45-49) and removed the inappropriate statement.

Reviewer #3: This manuscript describes Illumina sequencing and assembly of the channel catfish draft genome using SOAPdenovo2 software. Gene annotation included TblastN searches using protein sequences from six fish species, de novo gene prediction using Augustus and Genscan, and RNA-seq using reads from skin and muscle tissue. Repetitive elements were also identified in the genome assembly. Most methodology is sufficiently documented to aid the reader in replicating their experiments, except that the information regarding gap-filling is not detailed. The Conclusions state that this is the first high-quality channel catfish genome.
Answer: We added a detailed description of gap-filling as follows (lines ): Gap closing was performed using approximately 480 million Illumina paired-end reads, generated from three libraries with insert sizes of 250, 500 and 800 bp, as the input for GapCloser (v1.12-r6, default parameters and -p set to 25) [15]. We also removed the inappropriate statement of "the first high-quality channel catfish genome" in the revised manuscript.

In fact, the first channel catfish genome assembly, named "Coco", was recently published online ( Full disclosure: I am a member of that USDA/Auburn research team and co-author of the manuscript. The Coco assembly utilized genomic DNA from a doubled haploid individual, Illumina and PacBio sequencing, and the assembly appears to be more contiguous than the BGI assembly. The N50 statistics show that the upper 50% of the Coco assembly is made up of fewer (2,839 vs 66,332) and longer (77.2 kb vs 48.5 kb) contigs than the BGI assembly. The N50 scaffold lengths are similar (7.7 vs 7.2 Mb), but 50% of the bases in the Coco assembly are contained in only 31 scaffolds and 98% of the assembled bases are contained in 594 scaffolds. Furthermore, 97% of the Coco assembly was aligned to the 29 channel catfish chromosomes. It would be useful to have more assembly statistics for a more comprehensive description of the BGI assembly and for direct comparison with the Coco assembly (see their Table 1).

Answer: Thanks for your suggestion. We added additional rows to Table 2 to compare these two assemblies.

The authors have estimated the channel catfish genome size at 839 Mb. The USDA/Auburn k-mer-based estimate is 1 Gb (Supp. Figure 1), which is closer to published estimates of haploid genome content based on flow cytometry (Tiersch et al 1990; Tiersch and Goudie 1993). The authors should explain why their genome size estimate is so much lower than the flow cytometry data. Does this affect the parameters utilized in SOAPdenovo and potentially collapse genomic regions that arise from local duplication?

Answer: It is an interesting question. In fact, k-mer estimation and genome assembly are independent; hence, they could not affect each other. We don't know the exact reasons for the difference at present, but we will try to obtain an independent assembly using your genome data once they are available. The difference could be due to different assembly tools or different samples (subspecies?). Our fish had been inbred on a local farm in China for over 3 generations.

The abstract states a predicted Mb of repetitive sequence, but the manuscript does not describe how this number was determined. The authors should clarify whether the total assembly length of Mb includes repetitive content.
Answer: According to your advice, we listed the detailed methods for repeat annotation in our revised manuscript (lines ). Our full assembly, Mb, indeed contains the repeat sequences of Mb.

The BGI assembly predicted 21,556 coding genes whereas the Coco assembly predicted 26,661 coding genes. It is unclear to what extent the difference is due to assembly integrity or whether fewer genes in the BGI assembly are due to a more limited RNAseq dataset (only skin and muscle tissue). Channel catfish EST, cDNA, and RNAseq datasets from a wider variety of tissues and cell types have been available in GenBank for several years and could be useful to determine whether the missing 20% of genes actually exist in the BGI assembly.

Answer: Thanks for your comments. After comparing the gene numbers yielded by each annotation step in both works, we observed that the basic difference in gene number derives from the Augustus annotation step. About 34,061 genes were predicted by Augustus in the BGI version, far fewer than in the Coco version (46,090). In our previous work on over 20 genome sequences of aquatic animals, we found that transcriptome sequences contributed very little to the predicted gene number. As we know, most genes are identified based on the available data from other reported species. Usually, the unknown genes account for less than 10% of the total gene number in a newly sequenced fish.

In summary, the assembly produced in the current research is not the first high-quality channel catfish genome assembly, and most parameters demonstrate that it is not as complete as the Coco assembly published in Nature Communications. The Coco assembly will soon be available in GenBank, which will permit the authors to perform a head-to-head comparison of the two assemblies to identify unique features of their assembly that justify publication.

Answer: Sorry for the inappropriate statement about "the first high-quality channel catfish genome assembly". In fact, we submitted our manuscript before the recent publication of your Coco genome paper. We have now removed the sentence in our revised manuscript.

Best regards,
Qiong Shi, PhD, Professor
BGI
Shenzhen
China
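For readers comparing the contig statistics quoted by Reviewer #3 (the number and minimum length of the contigs that make up the upper 50% of an assembly), the two values can be computed as sketched below. This is a generic illustration with a hypothetical function name (`n50_l50`) and made-up contig lengths; it is not taken from either the BGI or the Coco analysis.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig or scaffold lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the assembly;
    L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    # Unreachable for positive lengths, kept so the function always returns.
    return ordered[-1], len(ordered)


# Hypothetical contig lengths (bp) for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_contigs))
```

A more contiguous assembly of the same total size has a larger N50 and a smaller L50, which is the pattern the reviewer describes for the contig comparison.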
More informationTaking Advantage of Long RNA-Seq Reads. Vince Magrini Pacific Biosciences User Group Meeting September 18, 2013
Taking Advantage of Long RNA-Seq Reads Vince Magrini Pacific Biosciences User Group Meeting September 18, 2013 Overview Proof-of-Principle SMART-cDNA Synthesis PB-SBL size distributions Gene Annotation
More informationComparison and Evaluation of Cotton SNPs Developed by Transcriptome, Genome Reduction on Restriction Site Conservation and RAD-based Sequencing
Comparison and Evaluation of Cotton SNPs Developed by Transcriptome, Genome Reduction on Restriction Site Conservation and RAD-based Sequencing Hamid Ashrafi Amanda M. Hulse, Kevin Hoegenauer, Fei Wang,
More informationuser s guide Question 1
Question 1 How does one find a gene of interest and determine that gene s structure? Once the gene has been located on the map, how does one easily examine other genes in that same region? doi:10.1038/ng966
More informationOutline. Array platform considerations: Comparison between the technologies available in microarrays
Microarray overview Outline Array platform considerations: Comparison between the technologies available in microarrays Differences in array fabrication Differences in array organization Applications of
More informationSequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es
Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio
More informationGenome Assembly. J Fass UCD Genome Center Bioinformatics Core Friday September, 2015
Genome Assembly J Fass UCD Genome Center Bioinformatics Core Friday September, 2015 From reads to molecules What s the Problem? How to get the best assemblies for the smallest expense (sequencing) and
More informationThe tomato genome re-seq project
The tomato genome re-seq project http://www.tomatogenome.net 5 February 2013, Richard Finkers & Sjaak van Heusden Rationale Genetic diversity in commercial tomato germplasm relatively narrow Unexploited
More informationA Short Sequence Splicing Method for Genome Assembly Using a Three- Dimensional Mixing-Pool of BAC Clones and High-throughput Technology
Send Orders for Reprints to reprints@benthamscience.ae 210 The Open Biotechnology Journal, 2015, 9, 210-215 Open Access A Short Sequence Splicing Method for Genome Assembly Using a Three- Dimensional Mixing-Pool
More informationTruSPAdes: analysis of variations using TruSeq Synthetic Long Reads (TSLR)
tru TruSPAdes: analysis of variations using TruSeq Synthetic Long Reads (TSLR) Anton Bankevich Center for Algorithmic Biotechnology, SPbSU Sequencing costs 1. Sequencing costs do not follow Moore s law
More informationApplications of Next Generation Sequencing in Metagenomics Studies
Applications of Next Generation Sequencing in Metagenomics Studies Francesca Rizzo, PhD Genomix4life Laboratory of Molecular Medicine and Genomics Department of Medicine and Surgery University of Salerno
More informationSequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es
Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio
More informationWhy can GBS be complicated? Tools for filtering, error correction and imputation.
Why can GBS be complicated? Tools for filtering, error correction and imputation. Edward Buckler USDA-ARS Cornell University http://www.maizegenetics.net Many Organisms Are Diverse Humans are at the lower
More informationGenomics. Data Analysis & Visualization. Camilo Valdes
Genomics Data Analysis & Visualization Camilo Valdes cvaldes3@miami.edu https://github.com/camilo-v Center for Computational Science, University of Miami ccs.miami.edu Today Sequencing Technologies Background
More informationProceedings of the World Congress on Genetics Applied to Livestock Production,
Genomics using the Assembly of the Mink Genome B. Guldbrandtsen, Z. Cai, G. Sahana, T.M. Villumsen, T. Asp, B. Thomsen, M.S. Lund Dept. of Molecular Biology and Genetics, Research Center Foulum, Aarhus
More informationuser s guide Question 3
Question 3 During a positional cloning project aimed at finding a human disease gene, linkage data have been obtained suggesting that the gene of interest lies between two sequence-tagged site markers.
More informationDeep Sequencing technologies
Deep Sequencing technologies Gabriela Salinas 30 October 2017 Transcriptome and Genome Analysis Laboratory http://www.uni-bc.gwdg.de/index.php?id=709 Microarray and Deep-Sequencing Core Facility University
More informationIntroduction to 'Omics and Bioinformatics
Introduction to 'Omics and Bioinformatics Chris Overall Department of Bioinformatics and Genomics University of North Carolina Charlotte Acquire Store Analyze Visualize Bioinformatics makes many current
More informationEstimating the rates and modes of creation of new genetic variation in plants using NGS technologies
Estimating the rates and modes of creation of new genetic variation in plants using NGS technologies 14/06/2016 Supervisor: Prof. Michele Morgante Co-supervisor: Fabio Marroni PhD Student: Ettore Zapparoli
More informationSimultaneous profiling of transcriptome and DNA methylome from a single cell
Additional file 1: Supplementary materials Simultaneous profiling of transcriptome and DNA methylome from a single cell Youjin Hu 1, 2, Kevin Huang 1, 3, Qin An 1, Guizhen Du 1, Ganlu Hu 2, Jinfeng Xue
More informationWorkflow of de novo assembly
Workflow of de novo assembly Experimental Design Clean sequencing data (trim adapter and low quality sequences) Run assembly software for contiging and scaffolding Evaluation of assembly Several iterations:
More informationChromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Supplementary Material
Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions Joshua N. Burton 1, Andrew Adey 1, Rupali P. Patwardhan 1, Ruolan Qiu 1, Jacob O. Kitzman 1, Jay Shendure 1 1 Department
More informationBioinformatic analysis of Illumina sequencing data for comparative genomics Part I
Bioinformatic analysis of Illumina sequencing data for comparative genomics Part I Dr David Studholme. 18 th February 2014. BIO1033 theme lecture. 1 28 February 2014 @davidjstudholme 28 February 2014 @davidjstudholme
More informationRapid Transcriptome Characterization for a nonmodel organism using 454 pyrosequencing
Rapid Transcriptome Characterization for a nonmodel organism using 454 pyrosequencing "#$%&'()*+,"(-*."#$%&/.,"*01*0.,(%-*.&0("2*01*3,$,45,"-*4#66&*71** 3"#)(82,"-*2&9:)($*)1*"(03&"2-*#)66(*.(8$6#*;
More informationBioinformatics Advice on Experimental Design
Bioinformatics Advice on Experimental Design Where do I start? Please refer to the following guide to better plan your experiments for good statistical analysis, best suited for your research needs. Statistics
More informationThe Genome Analysis Centre. Building Excellence in Genomics and Computa5onal Bioscience
Building Excellence in Genomics and Computa5onal Bioscience Resequencing approaches Sarah Ayling Crop Genomics and Diversity sarah.ayling@tgac.ac.uk Why re- sequence plants? To iden
More informationWhat the Genome of Raffaelea lauricola Can Tell Us About Laurel Wilt
What the Genome of Raffaelea lauricola Can Tell Us About Laurel Wilt Laurel Wilt Summit November 3-4, 2016 Dr. Jeffrey Rollins Associate Professor Plant Pathology Department University of Florida Gainesville,
More informationReviewers' Comments: Reviewer #1 (Remarks to the Author)
Reviewers' Comments: Reviewer #1 (Remarks to the Author) In this study, Rosenbluh et al reported direct comparison of two screening approaches: one is genome editing-based method using CRISPR-Cas9 (cutting,
More informationMapping strategies for sequence reads
Mapping strategies for sequence reads Ernest Turro University of Cambridge 21 Oct 2013 Quantification A basic aim in genomics is working out the contents of a biological sample. 1. What distinct elements
More informationNature Biotechnology: doi: /nbt Supplementary Figure 1. Number and length distributions of the inferred fosmids.
Supplementary Figure 1 Number and length distributions of the inferred fosmids. Fosmid were inferred by mapping each pool s sequence reads to hg19. We retained only those reads that mapped to within a
More informationFunctional genomics to improve wheat disease resistance. Dina Raats Postdoctoral Scientist, Krasileva Group
Functional genomics to improve wheat disease resistance Dina Raats Postdoctoral Scientist, Krasileva Group Talk plan Goal: to contribute to the crop improvement by isolating YR resistance genes from cultivated
More informationde novo Transcriptome Assembly Nicole Cloonan 1 st July 2013, Winter School, UQ
de novo Transcriptome Assembly Nicole Cloonan 1 st July 2013, Winter School, UQ de novo transcriptome assembly de novo from the Latin expression meaning from the beginning In bioinformatics, we often use
More informationNature Methods: doi: /nmeth Supplementary Figure 1. Ideograms showing scaffold boundaries and segmental duplication locations.
Supplementary Figure 1 Ideograms showing scaffold boundaries and segmental duplication locations. Blue lines mark the boundaries of scaffolds. Black marks show the locations of segmental duplications.
More informationHIGH-QUALITY ASSEMBLY OF THE DURUM WHEAT GENOME CV. SVEVO
HIGH-QUALITY ASSEMBLY OF THE DURUM WHEAT GENOME CV. SVEVO Luigi Cattivelli, The International Durum Wheat Genome Sequencing Consortium 31/01/2017 1 Durum wheat Durum wheat with a total production of about
More information