Ensembl: A Genomic Toolset for pigs, poultry, plants, pests, pathogens and pollinators. Paul Kersey
Transcription
1 Ensembl: A Genomic Toolset for pigs, poultry, plants, pests, pathogens and pollinators Paul Kersey
2 A brief history of genome sequencing
- 1995: Haemophilus influenzae, 1.8 Mb
- 1996: Saccharomyces cerevisiae, 12 Mb
- 1999: Drosophila melanogaster, 140 Mb
- 2001: Homo sapiens, 3.1 Gb
Sequencing technology is continuously improving, but (massively parallel) next-generation techniques really were game-changers.
pkersey@ebi.ac.uk
3 Cost of Sequencing a Human Genome
4 A brief history of genome sequencing
- 1000 Genomes Project (2,500 human genomes)
- 1001 Genomes Project (1,001 Arabidopsis genomes)
- Genomics England (100,000 human genomes)
5 What can we do with thousands of genome sequences?
- Statistical association of traits with markers
- Increased marker resolution to find causative variants
- Understand population structure and evolutionary processes
- Track epidemics
- Assay for known variation
- Environmental distribution
- Tool for managing crosses
More genomes mean more statistical power to find rarer causative alleles.
6 Thousands of genomes: a tool for breeding
- Characterize germplasm of land races and wild relatives
- Understand what's actually present in an existing line
- Find alleles associated with traits
- Combine genotyping with various (laboratory, greenhouse, field) phenotyping mechanisms, themselves increasingly automated and high-throughput
- Manage crosses
7 Everyone can do their own experiments, but EMBL-EBI would like to maintain a catalogue of reference genomes and variants for all widely studied species
- Selected lines can be re-phenotyped and analysed against the same reference data
One major challenge: organising the pan-genome
- No single genome is enough to serve as a reference for many species
- Variants and functional elements are present in some strains but not in the reference
- The reference is still a useful concept, but it needs to be extended: choose your own reference according to need
8 Phenotyping data
- Immensely varied
- Dependent on an environment (GxPxE)
- Anything you can measure, from molecular assays to in-field imaging
- Increasing use of structured controlled vocabularies for human-readable, interoperable data summaries
- Metadata is critical: What has been assayed? Where was it assayed? How has it been assayed?
9 The EBI mission
EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry. We also coordinated the ELIXIR pilot phase and are hosting the ELIXIR hub.
10 EBI provides
- Structured archives (and associated submission services) for most major types of molecular biological data, e.g.
  - European Nucleotide Archive (part of the ENA-GenBank-DDBJ International Nucleotide Sequence Database Collaboration)
  - European Variation Archive, now accepting submissions in VCF format
  - ArrayExpress, PRIDE, MetaboLights
- Integrative, interpreted services providing access to that data in a biologically meaningful context, e.g. Ensembl
11 Ensembl
- A modular suite of software for genome analysis and visualisation, developed jointly by the Wellcome Trust Sanger Institute and the European Bioinformatics Institute
- Now used for genomes from across the taxonomic space
- Offers a standard set of interfaces to a wide range of genome-scale data, including:
  - Web-based GUI
  - Public MySQL server
  - Perl and RESTful APIs
  - FTP
  - Data mining tool (constructed using the BioMart framework) with its own set of interfaces: web GUI, web services, command line and local client
ELIXIR Innovation and SME Forum, Wageningen, 18th-19th March 2015
12 vertebrates, bacteria, fungi, protists, plants, metazoa
13 Agriculturally relevant species in Ensembl
- Farm animals
- Crop plants
- Pests
- Vectors
- Pollinators
- Pathogens
- Symbionts and commensals
18 Gene tree pipeline
- Take the canonical protein for each gene belonging to one Ensembl Genomes clade
- Cluster: WU-BLASTP + Smith-Waterman all-versus-all, hcluster_sg
- Align: multiple aligners, consensified by M-Coffee
- Build trees: PhyML-WAG + PhyML-HKY + NJ-p + NJ-dN + NJ-dS + species tree, TreeBeST-merge
- Infer orthologues and paralogues
19 Orthologues and paralogues
- Paralogues: any pairwise gene relationship where the ancestor node is a duplication event
- Orthologues: any pairwise gene relationship where the ancestor node is a speciation event
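The two definitions above can be sketched in a few lines of Python. This is only an illustration, not the Ensembl Compara implementation: the `Node` class, the event labels and the gene names are all hypothetical, and the tree is assumed to be already reconciled so that each internal node is marked as a speciation or a duplication.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Node:
    """Hypothetical node of a reconciled gene tree."""
    event: str = "leaf"                 # "speciation", "duplication" or "leaf"
    gene: Optional[str] = None          # gene name (leaves only)
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Return the gene names found below (or at) this node."""
    if not node.children:
        return [node.gene]
    names: List[str] = []
    for child in node.children:
        names.extend(leaves(child))
    return names


def classify_pairs(root: Node) -> Dict[FrozenSet[str], str]:
    """Label each gene pair by the event at its last common ancestor."""
    relations: Dict[FrozenSet[str], str] = {}

    def visit(node: Node) -> None:
        if not node.children:
            return
        kind = "orthologue" if node.event == "speciation" else "paralogue"
        # Genes taken from two different child subtrees have this node
        # as their last common ancestor.
        for left, right in combinations(node.children, 2):
            for a in leaves(left):
                for b in leaves(right):
                    relations[frozenset((a, b))] = kind
        for child in node.children:
            visit(child)

    visit(root)
    return relations


# Toy tree: a duplication inside wheat, below a wheat/rice speciation.
tree = Node("speciation", children=[
    Node("duplication", children=[Node(gene="wheat_g1"), Node(gene="wheat_g2")]),
    Node(gene="rice_g1"),
])
relations = classify_pairs(tree)
```

In this toy tree the two wheat genes come out as paralogues of each other, and each of them as an orthologue of the rice gene.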
20 Orthology / paralogy types
- ortholog_one2one
- ortholog_one2many
- ortholog_many2many
- apparent_ortholog_one2one
- possible_ortholog (weakly supported duplication node)
- within_species_paralog
- other_paralog (too distant to be in the same tree)
- contiguous_gene_split (artefact)
- putative_gene_split (artefact)
22 Pairwise whole genome alignments & synteny
Pairwise whole genome alignments
- Only for certain combinations of species
- Generated using (B)LASTz-net
Synteny
- Organisms of relatively recent divergence show similar blocks of genes in the same relative positions in the genome
- Shows how the genome is cut and pasted in the course of evolution
- Calculated using pairwise whole genome alignments
- Only for certain combinations of species
23 Ensembl
- Ensembl supports many livestock species
- Ensembl provides automatic gene annotation for these species
- Ensembl works with Havana to support manual annotation in pigs
- Ensembl provides variation databases and functional annotation where the data exist
- Ensembl is playing an active role in FAANG and will integrate the functional data generated as it becomes available
24 FAANG: Functional Annotation of Animal Genomes
- High-quality transcriptomic and regulatory annotation of animal genomes
- Open data, released pre-publication
- Common data and analysis standards
- EBI leading establishment of infrastructure for data sharing and standards
27 The bread wheat genome
- Large: the haploid genome size is > 5 Gb
- But in fact, the genome is an allohexaploid (triploid genome size ~ 16 Gb)
- Each diploid genome is quite homozygous
28 Evolution of hexaploid bread wheat
29 The bread wheat genome
- The genome has been sequenced by Illumina after chromosome sorting
- The assembly is fragmented, but gene models are broadly comparable to other grasses
- Chromosome 3B has been sequenced BAC-by-BAC
30 Wheat in Ensembl Plants
We represent the IWGSC chromosome survey sequence with the addition of the finished 3B sequence. We also use PopSeq data (from IPK, Gatersleben) to group scaffolds into bins based on genetic locations.
31 1:1 orthology calls over 19 cereals including the three sub-genomes of bread wheat
34 Polymorphism data for bread wheat
~900,000 SNPs provided by CerealsDB, as follows:
- The Axiom 820K SNP Array contains 820,000 SNPs, of which ~684,000 have been mapped.
- The iSelect 80K Array contains over 80,000 SNP loci, of which ~58,000 have been mapped.
- The KASP probeset contains ~3,900 SNP loci, of which ~3,100 have been loaded in Ensembl Plants.
38 Inter-homoeologous variants
Genome combination   Mismatch length (in reference genome), bp   Alignment length, bp   % mismatch
B on A               2,881,969                                   41,739,
D on A               2,665,562                                   43,228,
A on B               2,892,005                                   41,749,
D on B               2,739,967                                   44,238,
A on D               2,689,840                                   43,252,
B on D               2,745,993                                   43,244,
Mismatch defined as length on reference not matched in non-reference
40 Bread wheat whole genome alignment
DNA-DNA pairwise alignments with lastz
- Brachypodium distachyon: 617,996,145 bp (14% of bread wheat) in 1,310,922 blocks
- Hordeum vulgare: 423,284,874 bp (9% of bread wheat) in 2,902,234 blocks
- Oryza sativa Japonica: 312,857,683 bp out of 4,460,951,632 bp (7% of bread wheat) in 718,036 blocks
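The percentages on this slide can be checked with a short calculation. The sketch below makes two assumptions: the aligned lengths are base pairs, and the total of 4,460,951,632 quoted for rice is also the denominator for the other two species. The names `WHEAT_TOTAL`, `ALIGNED` and `percent_covered` are illustrative only.

```python
# Total wheat sequence length quoted on the slide (assumed to be in bp and
# assumed to be the denominator for all three comparisons).
WHEAT_TOTAL = 4_460_951_632

# Aligned lengths per species, as quoted on the slide.
ALIGNED = {
    "Brachypodium distachyon": 617_996_145,
    "Hordeum vulgare": 423_284_874,
    "Oryza sativa Japonica": 312_857_683,
}


def percent_covered(aligned: int, total: int = WHEAT_TOTAL) -> float:
    """Share of the wheat sequence covered by alignments, in percent."""
    return 100.0 * aligned / total


# Rounded to whole percent, as on the slide.
coverage = {species: round(percent_covered(n)) for species, n in ALIGNED.items()}
```

Under those assumptions the rounded values are 14, 9 and 7, matching the slide.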
41 Additional alignment data for bread wheat
- Repbase repeats
- Triticeae repeats from TREP
- Wheat RNA-Seq, ESTs, and UniGene datasets have been aligned to the Triticum aestivum genome:
  - 454 RNA-seq data for the following INSDC studies: SRP02455 (Akhunova et al.), ERP (Brenchley et al.), SRP
  - Sequences from TriFLDB
  - Transcriptome assembly from diploid einkorn wheat Triticum monococcum (Fox et al.)
42 Diploid progenitors of bread wheat
- Aegilops tauschii (DD) and Triticum urartu (AA) are also included in Ensembl Plants
- In addition, we have RNA-seq data from Triticum monococcum (AA)
- These genomes have been aligned to rice and barley
- Relevant RNA-seq reads have also been aligned
44 Accessing Ensembl Data Programmatically: 6 easy methods
45 Access method 1: FTP downloads
ftp://ftp.ensemblgenomes.org/pub/
- Genomic, cDNA and protein sequence (FASTA)
- Annotated sequence (EMBL / GenBank)
- Gene sets (GTF)
- Resequencing alignments, individuals / strains (EMF)
- Whole-genome multiple alignments (EMF)
- Gene-based multiple alignments (EMF)
- Constrained elements (BED)
- Database dumps (MySQL)
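Once a FASTA file has been downloaded, reading it needs very little code. The following is a minimal sketch using only the Python standard library; the function name `read_fasta` and the example records are made up for illustration, and the identifier is assumed to be the first word of each header line.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from an open FASTA text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            fields = line[1:].split()
            header = fields[0] if fields else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Illustrative input; a real file would be opened with open(path) or,
# if it is gzip-compressed, with gzip.open(path, "rt").
example = io.StringIO(">seq1 first example\nACGT\nACGT\n>seq2\nTTGA\n")
records = dict(read_fasta(example))
```

Here `records` maps `seq1` to the two sequence lines joined together and `seq2` to its single line.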
46 Access method 2: MySQL
- MySQL: an open-source relational database management system (RDBMS)
- Used as the back end to support most Ensembl pipelines and applications
- You get the database and install it locally
- On the Ensembl Genomes FTP site, you can download the Ensembl schema as a .sql file. You can also download the data files.
    /data/mysql/bin/mysql -u mysqldba
    create database zea_mays_core_24_77_6;
    exit;
    /data/mysql/bin/mysql -u mysqldba zea_mays_core_24_77_6 < zea_mays_core_24_77_6
    /data/mysql/bin/mysqlimport -u mysqldba --fields_escaped_by=\\ zea_mays_core_24_77_6 -L *.txt
Gramene Workshop, Plant and Animal Genomes XIII
47 Access method 3: Ensembl Perl API
- Mature, fully featured Perl API (Application Programming Interface) for Ensembl resources
- Perl: a commonly used programming language in bioinformatics, designed to make easy things easy and hard things possible
- Provides access to:
  - Genomic sequence
  - Genome features, e.g. genes, translations
  - Annotation, e.g. cross-references
48 Access method 4: REST API
"REpresentational State Transfer is an abstraction of the architecture of the World Wide Web; more precisely, REST is an architectural style consisting of a coordinated set of architectural constraints applied to components, connectors, and data elements, within a distributed hypermedia system. REST ignores the details of component implementation and protocol syntax in order to focus on the roles of components, the constraints upon their interaction with other components, and their interpretation of significant data elements" (Wikipedia)
- A style for structuring URLs (i.e. web addresses) according to the content they contain
- RESTful web service or RESTful web API
- Allows users to access data simply by invoking the URL
- Often returns a data structure defined in a simple grammar (e.g. JSON) which can be imported into an object in any programming language
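As a rough illustration of "invoke a URL, get JSON back", here is a Python sketch using only the standard library. Several things are assumptions rather than part of the slides: the server address `https://rest.ensembl.org`, the `lookup/symbol/:species/:symbol` URL pattern, and the species and symbol in the example. The JSON string at the end is invented to show the decoding step and is not a real server response.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: address of the public Ensembl REST server.
BASE_URL = "https://rest.ensembl.org"


def lookup_url(species: str, symbol: str, base: str = BASE_URL) -> str:
    """Build a URL whose path describes the content requested."""
    return f"{base}/lookup/symbol/{quote(species)}/{quote(symbol)}"


def fetch_json(url: str, timeout: float = 30.0):
    """Invoke the URL and decode the JSON reply into Python objects."""
    request = Request(url, headers={"Content-Type": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access (species and symbol are examples).
url = lookup_url("zea_mays", "tb1")

# Invented payload, only to show that JSON maps directly onto a dict.
record = json.loads('{"id": "GENE0001", "species": "zea_mays"}')
```

Calling `fetch_json(url)` would perform the actual request; it is left out here so that the sketch runs without a network connection.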
49 Access method 5: BioMart
- A generic tool to facilitate the design and query of data warehouses
- Data warehouses are databases designed to optimise the performance of certain commonly performed queries
  - May be less flexible than a normalised schema
  - Less suitable for maintaining primary data (harder to automatically define constraints due to the form of the data model)
  - Nonetheless, can still be implemented within an RDBMS: BioMart uses MySQL
- We have gene-centric and variant-centric BioMarts for all Ensembl divisions
- BioMart comes with its own web interface
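The trade-off between a normalised schema and a warehouse table can be shown on a toy example. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for MySQL, and the table and column names (`gene`, `transcript`, `gene_mart`) are hypothetical, not the real Ensembl or BioMart schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalised schema: one table per entity, linked by a foreign key.
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE transcript (
    transcript_id INTEGER PRIMARY KEY,
    gene_id INTEGER NOT NULL REFERENCES gene(gene_id),
    name TEXT NOT NULL
);
INSERT INTO gene VALUES (1, 'geneA'), (2, 'geneB');
INSERT INTO transcript VALUES
    (10, 1, 'geneA-1'), (11, 1, 'geneA-2'), (20, 2, 'geneB-1');

-- Warehouse-style table: the join is computed once and stored flat.
-- Note that the copy carries no key or foreign-key constraints.
CREATE TABLE gene_mart AS
SELECT g.gene_id, g.name AS gene_name,
       t.transcript_id, t.name AS transcript_name
FROM gene g JOIN transcript t ON t.gene_id = g.gene_id;
""")

# The same question asked of both layouts.
normalised = conn.execute(
    "SELECT t.name FROM gene g JOIN transcript t ON t.gene_id = g.gene_id "
    "WHERE g.name = ? ORDER BY t.name", ("geneA",)).fetchall()
warehouse = conn.execute(
    "SELECT transcript_name FROM gene_mart "
    "WHERE gene_name = ? ORDER BY transcript_name", ("geneA",)).fetchall()
```

Both queries return the same rows, but the warehouse query needs no join, which is the kind of commonly performed query such a table is built for. The price is visible in the schema: the flat table repeats gene data on every row and has lost the constraints of the normalised tables.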
50 BioMart Web UI
51 Access method 6: Virtual machines
- Download an environment containing all of Ensembl to run on your machine
- In effect, you are downloading/running a model of a computer
- As long as your computer can support running the VM, there should be no problem with library incompatibilities etc.: all the resources Ensembl needs are within the VM
- Increasingly, a model of choice for running web-based services (e.g. in cloud environments): you don't deploy into a platform, you deploy a whole platform
- We use VirtualBox, an open source virtualisation platform
52 Funding
Ensembl Genomes is funded by:
- EMBL
- EU (INFRAVEC, Microme, transPLANT, AllBio)
- BBSRC (PhytoPath, wheat/barley/midge sequencing, UK-US collaboration, RNAcentral)
- Wellcome Trust (PomBase)
- NIH/NIAID (VectorBase)
- NSF (Gramene collaboration)
- Bill and Melinda Gates Foundation (wheat rust)
53 People
James Allen, Irina Armean, Dan Bolser, Bruce Bolt, Mikkel Christensen, Paul Davis, Thomas Down, Christoph Grabmueller, Kevin Howe, Arnaud Kerhornou, Julia Khobdova, Eugene Kulesha, Nick Langridge, Dan Lawson, Mark McDowall, Uma Maheswari, Gareth Maslen, Michael Nuhn, Chuang Kee Ong, Michael Paulini, Helder Pedro, Anton Petrov, Dan Staines, Brandon Walts, Gary Williams
The vertebrate genomics team at EBI (Paul Flicek)
54
Adventures in Cereal Genomics And Future Directions for Genomic Infrastructure. Paul Kersey
Adventures in Cereal Genomics And Future Directions for Genomic Infrastructure Paul Kersey A brief history of genome sequencing 1995 Haemophilus influenzae 1.8 Mb 1996 Saccharomyces cerevisiae 12 Mb 1999
More information1 st transplant user training workshop Versailles, 12th-13th November 2012
trans-national Infrastructure for Plant Genomic Science 1 st transplant user training workshop Versailles, 12th-13th November 2012 Paul Kersey, EMBL-EBI More people, less land Plant genome sequences, 2005
More informationBrowsing Genomes with Ensembl Genomes
Browsing Genomes with Ensembl Genomes www.ensemblgenomes.org Coursebook http://www.ebi.ac.uk/~blaise/beca BECA- ILRI 16 th October 2013 Chat room: http://tinyurl.com/ensembl-nairobi TABLE OF CONTENTS Introduction
More informationEnsembl and ENA. High level overview and use cases. Denise Carvalho-Silva. Ensembl Outreach Team
Ensembl and ENA High level overview and use cases Denise Carvalho-Silva Ensembl Outreach Team On behalf of Ensembl and ENA teams European Molecular Biology Laboratories Euroepan Bioinformatics Institute
More informationTriticeae genome MIPS PlantsDB. Manuel Spannagl transplant workshop Versailles Nov 2012
Triticeae genome resources @ MIPS PlantsDB Manuel Spannagl transplant workshop Versailles Nov 2012 What is MIPS PlantsDB? Generic database schema+system for the integration, management and (comparative)
More informationTriticeae Gene Nomenclature
A Wheat Initiative workshop Triticeae Gene Nomenclature 11-13 October 2016, Munich (Germany) Report 1 Background A number of high quality genome sequences have been completed in recent years for species
More informationTraining materials.
Training materials - Ensembl training materials are protected by a CC BY license - http://creativecommons.org/licenses/by/4.0/ - If you wish to re-use these materials, please credit Ensembl for their creation
More informationOverview of the next two hours...
Overview of the next two hours... Before tea Session 1, Browser: Introduction Ensembl Plants and plant variation data Hands-on Variation in the Ensembl browser Displaying your data in Ensembl After tea
More informationCustomer Case Study: University of Bristol Wheat Genomics
Customer Case Study: University of Bristol Wheat Genomics Why wheat? Wheat is one of the three most important crops for human and livestock feed, and with food supply an increasing global concern, the
More informationEuropean Genome phenome Archive at the European Bioinformatics Institute. Helen Parkinson Head of Molecular Archives
European Genome phenome Archive at the European Bioinformatics Institute Helen Parkinson Head of Molecular Archives What is EMBL-EBI? International, non-profit research institute Part of the European Molecular
More informationGREG GIBSON SPENCER V. MUSE
A Primer of Genome Science ience THIRD EDITION TAGCACCTAGAATCATGGAGAGATAATTCGGTGAGAATTAAATGGAGAGTTGCATAGAGAACTGCGAACTG GREG GIBSON SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc.
More informationEnsembl workshop. Thomas Randall, PhD bioinformatics.unc.edu. handouts, papers, datasets
Ensembl workshop Thomas Randall, PhD tarandal@email.unc.edu bioinformatics.unc.edu www.unc.edu/~tarandal/ensembl handouts, papers, datasets Ensembl is a joint project between EMBL - EBI and the Sanger
More informationIdentifying the functional bases of trait variation in Brassica napus using Associative Transcriptomics
Brassica genome structure and evolution Genome framework for association genetics Establishing marker-trait associations 31 st March 2014 GENOME RELATIONSHIPS BETWEEN SPECIES U s TRIANGLE 31 st March 2014
More informationExcerpts from a Seminar at Bayer CropScience February 2015
Excerpts from a Seminar at Bayer CropScience February 2015 Kellye Eversole IWGSC Executive Director Seminar Ghent, Belgium 12 February 2015 Update on the International Wheat Genome Sequencing Consortium
More informationChapter 22 Next Generation Sequencing Enabled Genetics in Hexaploid Wheat
Chapter 22 Next Generation Sequencing Enabled Genetics in Hexaploid Wheat Ricardo H. Ramirez-Gonzalez, Vanesa Segovia, Nicholas Bird, Mario Caccamo, and Cristobal Uauy Abstract Next Generation Sequencing
More informationWheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline
Wheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline Xi Wang Bioinformatics Scientist Computational Life Science Page 1 Bayer 4:3 Template 2010 March 2016 17/01/2017
More informationGuided tour to Ensembl
Guided tour to Ensembl Introduction Introduction to the Ensembl project Walk-through of the browser Variations and Functional Genomics Comparative Genomics BioMart Ensembl Genome browser http://www.ensembl.org
More informationThe Ensembl Database. Dott.ssa Inga Prokopenko. Corso di Genomica
The Ensembl Database Dott.ssa Inga Prokopenko Corso di Genomica 1 www.ensembl.org Lecture 7.1 2 What is Ensembl? Public annotation of mammalian and other genomes Open source software Relational database
More informationBIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP
Jasper Decuyper BIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP MB&C2017 Workshop Bioinformatics for dummies 2 INTRODUCTION Imagine your workspace without the computers Both in research laboratories and in
More informationElixir: European Bioinformatics Research Infrastructure. Rolf Apweiler
Elixir: European Bioinformatics Research Infrastructure Rolf Apweiler EMBL-EBI Service Mission To enable life science research and its translation to medicine, agriculture, the bioindustries and society
More informationEnsembl Tools. EBI is an Outstation of the European Molecular Biology Laboratory.
Ensembl Tools EBI is an Outstation of the European Molecular Biology Laboratory. Questions? We ve muted all the mics Ask questions in the Chat box in the webinar interface I will check the Chat box periodically
More informationChapter 2: Access to Information
Chapter 2: Access to Information Outline Introduction to biological databases Centralized databases store DNA sequences Contents of DNA, RNA, and protein databases Central bioinformatics resources: NCBI
More informationThe international effort to sequence the 17Gb wheat genome: Yes, Wheat can!
ACTTGTGCATAGCATGCAATGCCAT ATATAGCAGTCTGCTAAGTCTATAG CAGACCCTCAACGTGGATCATCCGT AGCTAGCCATGACATTGATCCTGAT TTACACCATGTACTATCGAGAGCAG TACTACCATGTTACGATCAAAGCCG TTACGATAGCATGAACTTGTGCATA GCATGCAATGCCATATATAGCAGTC
More informationThe University of California, Santa Cruz (UCSC) Genome Browser
The University of California, Santa Cruz (UCSC) Genome Browser There are hundreds of available userselected tracks in categories such as mapping and sequencing, phenotype and disease associations, genes,
More informationA tutorial introduction into the MIPS PlantsDB barley&wheat database instances
transplant 2 nd user training workshop Poznan, Poland, June, 27 th, 2013 A tutorial introduction into the MIPS PlantsDB barley&wheat database instances TUTORIAL ANSWERS Please direct any questions related
More informationGenetics and Bioinformatics
Genetics and Bioinformatics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Lecture 1: Setting the pace 1 Bioinformatics what s
More informationIntegrating and Leveraging Cereal and Grass Genome Resources. David Marshall Information and Computational Sciences Group
Integrating and Leveraging Cereal and Grass Genome Resources David Marshall Information and Computational Sciences Group David.Marshall@hutton.ac.uk What I m going to cover Brief overview of the new barley
More informationTwo Mark question and Answers
1. Define Bioinformatics Two Mark question and Answers Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. There are three
More informationAnchoring and ordering NGS contig assemblies by population sequencing (POPSEQ)
Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ) Martin Mascher IPK Gatersleben PAG XXII January 14, 2012 slide 1 Proof-of-principle in barley Diploid model for wheat 5 Gb
More informationThis place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.
G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY Methods or systems for genetic
More informationFrom Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow
From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow Technical Overview Import VCF Introduction Next-generation sequencing (NGS) studies have created unanticipated challenges with
More informationThis software/database/presentation is a "United States Government Work" under the terms of the United States Copyright Act. It was written as part
This software/database/presentation is a "United States Government Work" under the terms of the United States Copyright Act. It was written as part of the author's official duties as a United States Government
More informationIntroduction to Plant Genomics and Online Resources. Manish Raizada University of Guelph
Introduction to Plant Genomics and Online Resources Manish Raizada University of Guelph Genomics Glossary http://www.genomenewsnetwork.org/articles/06_00/sequence_primer.shtml Annotation Adding pertinent
More informationBasics of RNA-Seq. (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly, PhD Team Lead, NCI Single Cell Analysis Facility
2018 ABRF Meeting Satellite Workshop 4 Bridging the Gap: Isolation to Translation (Single Cell RNA-Seq) Sunday, April 22 Basics of RNA-Seq (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly,
More informationGenomic resources. for non-model systems
Genomic resources for non-model systems 1 Genomic resources Whole genome sequencing reference genome sequence comparisons across species identify signatures of natural selection population-level resequencing
More informationELIXIR: data for molecular biology and points of entry for marine scientists
ELIXIR: data for molecular biology and points of entry for marine scientists Guy Cochrane, EMBL-EBI EuroMarine 2018 General Assembly meeting 17-18 January 2018 www.elixir-europe.org Scales of molecular
More informationlatestdevelopments relevant for the Ag sector André Eggen Agriculture Segment Manager, Europe
Overviewof Illumina s latestdevelopments relevant for the Ag sector André Eggen Agriculture Segment Manager, Europe Seminar der Studienrichtung Tierwissenschaften, TÜM, July 1, 2009 Overviewof Illumina
More information9/19/13. cdna libraries, EST clusters, gene prediction and functional annotation. Biosciences 741: Genomics Fall, 2013 Week 3
cdna libraries, EST clusters, gene prediction and functional annotation Biosciences 741: Genomics Fall, 2013 Week 3 1 2 3 4 5 6 Figure 2.14 Relationship between gene structure, cdna, and EST sequences
More informationAdvanced Technology in Phytoplasma Research
Advanced Technology in Phytoplasma Research Sequencing and Phylogenetics Wednesday July 8 Pauline Wang pauline.wang@utoronto.ca Lethal Yellowing Disease Phytoplasma Healthy palm Lethal yellowing of palm
More informationHigh peformance computing infrastructure for bioinformatics
High peformance computing infrastructure for bioinformatics Scott Hazelhurst University of the Witwatersrand December 2009 What we need Skills, time What we need Skills, time Fast network Lots of storage
More informationDeakin Research Online
Deakin Research Online This is the published version: Church, Philip, Goscinski, Andrzej, Wong, Adam and Lefevre, Christophe 2011, Simplifying gene expression microarray comparative analysis., in BIOCOM
More informationLeonardo Mariño-Ramírez, PhD NCBI / NLM / NIH. BIOL 7210 A Computational Genomics 2/18/2015
Leonardo Mariño-Ramírez, PhD NCBI / NLM / NIH BIOL 7210 A Computational Genomics 2/18/2015 The $1,000 genome is here! http://www.illumina.com/systems/hiseq-x-sequencing-system.ilmn Bioinformatics bottleneck
More informationTranscriptome Assembly, Functional Annotation (and a few other related thoughts)
Transcriptome Assembly, Functional Annotation (and a few other related thoughts) Monica Britton, Ph.D. Sr. Bioinformatics Analyst June 23, 2017 Differential Gene Expression Generalized Workflow File Types
More informationUC Davis UC Davis Previously Published Works
UC Davis UC Davis Previously Published Works Title Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome
More informationBasic Bioinformatics: Homology, Sequence Alignment,
Basic Bioinformatics: Homology, Sequence Alignment, and BLAST William S. Sanders Institute for Genomics, Biocomputing, and Biotechnology (IGBB) High Performance Computing Collaboratory (HPC 2 ) Mississippi
More informationGene-centered resources at NCBI
COURSE OF BIOINFORMATICS a.a. 2014-2015 Gene-centered resources at NCBI We searched Accession Number: M60495 AT NCBI Nucleotide Gene has been implemented at NCBI to organize information about genes, serving
More informationPharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001
Pharmacogenetics: A SNPshot of the Future Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001 1 I. What is pharmacogenetics? It is the study of how genetic variation affects drug response
More informationBrowsing Genomes with Ensembl
April Feb 2006 2007 Browsing Genomes with Ensembl Joint project Ensembl - Project EMBL European Bioinformatics Institute (EBI) Wellcome Trust Sanger Institute Produce accurate, automatic genome annotation
More informationIntroduction to Bioinformatics CPSC 265. What is bioinformatics? Textbooks
Introduction to Bioinformatics CPSC 265 Thanks to Jonathan Pevsner, Ph.D. Textbooks Johnathan Pevsner, who I stole most of these slides from (thanks!) has written a textbook, Bioinformatics and Functional
More informationIntroduction to BIOINFORMATICS
Introduction to BIOINFORMATICS Antonella Lisa CABGen Centro di Analisi Bioinformatica per la Genomica Tel. 0382-546361 E-mail: lisa@igm.cnr.it http://www.igm.cnr.it/pagine-personali/lisa-antonella/ What
More informationGENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.
!! www.clutchprep.com CONCEPT: OVERVIEW OF GENOMICS Genomics is the study of genomes in their entirety Bioinformatics is the analysis of the information content of genomes - Genes, regulatory sequences,
More informationWhat is Bioinformatics?
What is Bioinformatics? Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline. - NCBI The ultimate goal of the field is
More informationNature Biotechnology: doi: /nbt.3943
Supplementary Figure 1. Distribution of sequence depth across the bacterial artificial chromosomes (BACs). The x-axis denotes the sequencing depth (X) of each BAC and y-axis denotes the number of BACs
More informationTraining materials.
Training materials Ensembl training materials are protected by a CC BY license http://creativecommons.org/licenses/by/4.0/ If you wish to re-use these materials, please credit Ensembl for their creation
More informationDiscovery and development of exome-based, co-dominant single nucleotide polymorphism markers in hexaploid wheat (Triticum aestivum L.
Application note GAPP-0001 Discovery and development of exome-based, co-dominant single nucleotide polymorphism markers in hexaploid wheat (Triticum aestivum L.) Keith J. Edwards, University of Bristol,
More information1000 Insect Transcriptomes Evolution - 1KITE
1KITE 1K Insect Transcriptome Evolution 1000 Insect Transcriptomes Evolution - 1KITE An Example of Handling "Big Data" Karen Meusemann, on behalf of the 1KITE Consortium CSIRO Ecosystem Sciences, Australian