all official ACS meetings and events without the express written consent from the ACS.


Transcription

1 248th American Chemical Society National Meeting & Exposition. NO RECORDING PLEASE. The use of any device to capture images (e.g., cameras and camera phones) or sound (e.g., tape and digital recorders) or stream, upload or rebroadcast speakers or presentations is strictly prohibited at all official ACS meetings and events without the express written consent from the ACS. Services Research Training Industry

2 ChEMBL - Linking chemistry and biology to enable mapping onto molecular pathways. Louisa Bellis, Chemical Content Curator, ChEMBL Database, EMBL-EBI, UK. ACS, 10th-14th August 2014.

3 Drug discovery From discovering a target to a drug reaching the market: 12 years. Bioinformatics shortens time to target discovery. EMBL-EBI services support all stages of drug discovery.

4 Data resources at EMBL-EBI. Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes, European Genome-phenome Archive, Metagenomics portal. Gene, protein & metabolite expression: ArrayExpress, Expression Atlas, MetaboLights, PRIDE. Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology. Protein sequences, families & motifs: InterPro, Pfam, UniProt. Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank. Chemical biology: ChEMBL, ChEBI. Reactions, interactions & pathways: IntAct, Reactome, MetaboLights. Systems: BioModels, Enzyme Portal, BioSamples.

5 What is the ChEMBL database? Open access data for drug discovery. Core of database is the primary Med Chem literature, covering over 33 years of research, inc.: J. Med. Chem., Bioorg. Med. Chem. Lett., Science. Data-sharing arrangement with PubChem. Contributed datasets: mainly NTD, GSK kinase set, TG-GATEs, DNDi etc. Marketed drugs: mostly FDA Orange Book, DailyMed etc.

6

7 Data Statistics. Compound content: small molecules, peptides, biotherapeutics. Abstracted from 57,156 papers across 47 journals. 1,411,786 compounds (~450,000 from PubChem). 12,843,338 activities (>6.0 million from PubChem): binding measurements, functional assays and ADMET. 10,579 targets, inc. >5,400 protein targets and over 2,900 human targets.

8 ChEMBL Target Types. Molecular: Nucleic acid (e.g., DNA); Protein, subdivided into Single Protein (e.g., PDE5), Protein Complex (e.g., Nicotinic acetylcholine receptor) and Protein Family (e.g., Muscarinic receptors). Non-molecular: Cell-line (e.g., HEK293 cells); Tissue (e.g., Nervous); Subcellular-fraction (e.g., Mitochondria); Organism (e.g., Drosophila).

9 Bioactivity measurements. >5000 different endpoint types (e.g., Ki, LD50, heart rate change, sleep time, growth inhibition). Standardisation of endpoints and units: standardisation of endpoints and mapping to BAO (Half-life, Elimination half-life, t1/2 -> T1/2 -> BAO_); conversion of concentrations to nM; anti-logging of log data (pIC50 = 9 -> IC50 = 1 nM). pChEMBL values for simple comparison: -log10(molar IC50, XC50, EC50, AC50, Ki, Kd or Potency).
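The standardisation steps on the bioactivity measurements slide can be sketched in a few lines of Python. This is an illustrative sketch only, not ChEMBL's actual curation pipeline: the function names, the synonym table and the unit table are assumptions made for the example, and only the T1/2 synonyms, the pIC50 = 9 -> 1 nM conversion and the list of pChEMBL-eligible endpoint types come from the slide.

```python
import math

# Assumed synonym table: only these three T1/2 spellings appear on the slide.
ENDPOINT_SYNONYMS = {
    "half-life": "T1/2",
    "elimination half-life": "T1/2",
    "t1/2": "T1/2",
}

# Assumed multipliers for converting common concentration units to nanomolar.
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}

# Endpoint types the slide lists as eligible for a pChEMBL value.
PCHEMBL_TYPES = {"IC50", "XC50", "EC50", "AC50", "Ki", "Kd", "Potency"}


def standardise_endpoint(name):
    """Map a reported endpoint name to its standard form, if one is known."""
    cleaned = name.strip()
    return ENDPOINT_SYNONYMS.get(cleaned.lower(), cleaned)


def to_nanomolar(value, unit):
    """Convert a concentration in the given unit to nM."""
    return value * TO_NM[unit]


def antilog_to_nanomolar(p_value):
    """Anti-log a -log10(molar) value such as pIC50 and express it in nM."""
    # 10**(-p) molar equals 10**(9 - p) nanomolar.
    return 10 ** (9 - p_value)


def pchembl(endpoint, value_nm):
    """Return -log10(molar value) for eligible endpoints, else None."""
    if endpoint not in PCHEMBL_TYPES or value_nm <= 0:
        return None
    # log10(value in M) = log10(value in nM) - 9
    return round(9 - math.log10(value_nm), 2)
```

For example, `antilog_to_nanomolar(9)` reproduces the slide's pIC50 = 9 -> IC50 = 1 nM case, and `pchembl("IC50", 1.0)` maps that value back to 9.0, which is what makes values reported as IC50, Ki or Kd directly comparable on one scale.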

10 pChEMBL

11 MAPPING TO PATHWAYS

12 Compound Report Card

13 Compound Report Card

14 Target Browsing & Searching Target Name

15 Target Browsing & Searching

16 Target Report Card

17 Target Report Card

18

19 For performing a more general search (e.g., for a disease process, animal model or cell type of interest)

20

21 For specific pathway searching: OPRM1

22 Search for OPRM1 in ChEMBL

23

24 Compounds linked to activity for OPRM1

25 Overlay ChEMBL Compounds that interact with Pathway

26 EBI RDF Platform

27 ChEMBL RDF Compound Bioactivity Assay Target Ref ftp://ftp.ebi.ac.uk/pub/databases/chembl/chembl-rdf/

28 Federated (cross-database) queries Reactome proteins + drug-like molecules

29 Proteins in pathway interacting with drug-like cmpds Caspase-8 34 TNF-alpha 12 TNF-R1 7 RIPK1
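A per-protein tally like the one on this slide can be derived from the rows a federated query returns. The sketch below assumes the query result has already been fetched as (protein, compound identifier) pairs; the function name and the compound identifiers are hypothetical placeholders, not real ChEMBL IDs or query output.

```python
from collections import defaultdict


def compounds_per_protein(rows):
    """Count distinct compounds per protein, highest count first."""
    seen = defaultdict(set)
    for protein, compound_id in rows:
        # A set ignores repeated (protein, compound) pairs, e.g. one
        # compound measured against the same protein in several assays.
        seen[protein].add(compound_id)
    counts = {protein: len(ids) for protein, ids in seen.items()}
    # Sort by descending count, then by name to break ties deterministically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rows standing in for federated query results.
rows = [
    ("Caspase-8", "CPD-1"),
    ("Caspase-8", "CPD-2"),
    ("Caspase-8", "CPD-1"),  # duplicate pair, counted once
    ("TNF-alpha", "CPD-3"),
]
summary = compounds_per_protein(rows)
```

Counting distinct identifiers rather than rows matters here, because a single compound usually has several activity records against the same target.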

30 ADME SARFARI

31 Homepage. Components: Keyword Search; Molecule Search (submit molecule to ADME Target Model Prediction; submit molecule to substructure/similarity search); Protein BLAST Search (Run BLAST Search; FASTA Lookup).

32 Target Model Search. Aim: identify ADME protein targets which may bind and possibly metabolise a user-submitted structure. User can draw, upload or drag and drop a structure. Click this button to submit structure to model.

33 Model Search Workflow. Default Results Page: Orthologues*. Related Target Bioactivities. Protein Tissue Expression. Query Related Compounds. Pharmacokinetics Overview. *Orthologue page default colouring changes when running a model search: targets are coloured green when predicted to bind the user-submitted compound structure.

34 Orthologues: All Data. Rows = Orthologue Groups; Columns = Organism. Filter table by organism, source, class (Phase I, Phase II, Transporter, ...). Export Data. Link to Target Report Card. Link to Orthologue Group overview page: alignments, variation data, external links.

35 Bioactivities: All Data. Hide/Show Extra Columns. Export Data. All ChEMBL ADME-related bioactivity data.

36 CROP PROTECTION DATA

37 Crop Protection Data. Collaboration with Syngenta in adding crop protection data. Project to extract and incorporate additional insecticide, fungicide and herbicide bioactivity data. 2,444 articles of interest identified through a combination of text-mining and prioritisation by Syngenta. 245,370 activity data points. 34,940 distinct compound structures, of which 28K are new to ChEMBL.

38 Data Content Target Organisms

39 Data Content: Target Organisms. Additional crop protection data set contains: 441 plant targets (organism/tissue/protein targets), of which 71 are protein targets (all species); 326 fungal targets (organism/tissue/protein targets), of which 17 are protein targets (all species); 281 insect targets (organism/tissue/protein targets), of which 50 are protein targets (all species).

40 ChEMBL Integration + Future Plans. Crop protection data incorporated into ChEMBL_19 release. Look into feasibility of adding new data in future releases. Enhanced mechanisms for searching/filtering crop protection data. Improved organism classification and standardisation of endpoints.

41 Help and Feedback addresses for support queries and feedback General questions and feedback on ChEMBL interface: Reporting of data errors:

42 Acknowledgements ChEMBL John Overington Francis Atkinson Patricia Bento Jon Chambers Mark Davies Nathan Dedman Anna Gaulton Anne Hersey Michal Nowotka George Papadatos José Bach Hardie Samuel Croset Felix Krueger Grace Mugumbate Rita Santos Crop Protection Data Namrata Kale (assay curation) Gerard van Westen (identification of relevant papers)

43 Thank you. Any questions?

Elixir: European Bioinformatics Research Infrastructure. Rolf Apweiler

Elixir: European Bioinformatics Research Infrastructure. Rolf Apweiler Elixir: European Bioinformatics Research Infrastructure Rolf Apweiler EMBL-EBI Service Mission To enable life science research and its translation to medicine, agriculture, the bioindustries and society

More information

European Genome phenome Archive at the European Bioinformatics Institute. Helen Parkinson Head of Molecular Archives

European Genome phenome Archive at the European Bioinformatics Institute. Helen Parkinson Head of Molecular Archives European Genome phenome Archive at the European Bioinformatics Institute Helen Parkinson Head of Molecular Archives What is EMBL-EBI? International, non-profit research institute Part of the European Molecular

More information

The ChEMBL Database. ICIC 2012 Berlin, Germany October John P. Overington EMBL- EBI.

The ChEMBL Database. ICIC 2012 Berlin, Germany October John P. Overington EMBL- EBI. The ChEMBL Database ICIC 2012 Berlin, Germany October 2012 John P. Overington EMBL- EBI jpo@ebi.ac.uk Chemical Space All compounds Drug-like compounds Available compounds Only certain molecules have features

More information

Introduction to EMBL-EBI.

Introduction to EMBL-EBI. Introduction to EMBL-EBI www.ebi.ac.uk What is EMBL-EBI? Part of EMBL Austria, Belgium, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands,

More information

Leveraging Open Chemogenomics Data and Tools with KNIME

Leveraging Open Chemogenomics Data and Tools with KNIME Leveraging Open Chemogenomics Data and Tools with KNIME George Papadatos ChEMBL Group georgep@ebi.ac.uk What is EMBL-EBI? Europe s home for biological data, services, research and training A trusted data

More information

Drug Targets - an overview of historical success and protein kinase inhibitors - successes and attrition. John P. Overington

Drug Targets - an overview of historical success and protein kinase inhibitors - successes and attrition. John P. Overington Drug Targets - an overview of historical success and protein kinase inhibitors - successes and attrition John P. Overington jpo@ebi.ac.uk Assay/Target ChEMBL The Organisation of Drug Discovery 1. Scientific

More information

Towards standard, accessible and reproducible Metabolomics

Towards standard, accessible and reproducible Metabolomics Towards standard, accessible and reproducible Metabolomics Reza Salek PhD Metabolism and Molecular Informatics The European Bioinformatics Institute (EMBL-EBI) Email: Reza.salek@ebi.ac.uk The 1st International

More information

Global Biomolecular Information Infrastructure and Australia. Graham Cameron Director The EMBL Australia Bioinformatics Resource

Global Biomolecular Information Infrastructure and Australia. Graham Cameron Director The EMBL Australia Bioinformatics Resource Global Biomolecular Information Infrastructure and Australia Graham Cameron Director The EMBL Australia Bioinformatics Resource What is bioinformatics? Methods, data, IT to exploit biomolecular information

More information

ELIXIR: data for molecular biology and points of entry for marine scientists

ELIXIR: data for molecular biology and points of entry for marine scientists ELIXIR: data for molecular biology and points of entry for marine scientists Guy Cochrane, EMBL-EBI EuroMarine 2018 General Assembly meeting 17-18 January 2018 www.elixir-europe.org Scales of molecular

More information

Professor Ewan Birney FRS Director, EMBL-EBI

Professor Ewan Birney FRS Director, EMBL-EBI Infrastructures for research and innovation Professor Ewan Birney FRS Director, EMBL-EBI www.ebi.ac.uk Outline of talk Who Am I, What is EMBL? The change in genomics The needs for stratified patients in

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introduction to Bioinformatics Sequence Alignment Luke Huan Electrical Engineering and Computer Science http://people.eecs.ku.edu/~jhuan/ Database What is database An organized set of data Can

More information

EMBL-EBI Overview EMBL-EBI Overview

EMBL-EBI Overview EMBL-EBI Overview EMBL-EBI Overview Welcome Welcome to the European Bioinformatics Institute (EMBL-EBI), a global hub for big data in biology. We promote scientific progress by providing freely available data to the life-science

More information

CollecTF Documentation

CollecTF Documentation CollecTF Documentation Release 1.0.0 Sefa Kilic August 15, 2016 Contents 1 Curation submission guide 3 1.1 Data.................................................... 3 1.2 Before you start.............................................

More information

DRAGON DATABASE OF GENES ASSOCIATED WITH PROSTATE CANCER (DDPC) Monique Maqungo

DRAGON DATABASE OF GENES ASSOCIATED WITH PROSTATE CANCER (DDPC) Monique Maqungo DRAGON DATABASE OF GENES ASSOCIATED WITH PROSTATE CANCER (DDPC) Monique Maqungo South African National Bioinformatics Institute University of the Western Cape RELEVEANCE OF DATA SHARING! Fragmented data

More information

The Open Pharmacological Concepts Triple Store

The Open Pharmacological Concepts Triple Store The Open Pharmacological Concepts Triple Store Gerhard F. Ecker Dept Medicinal Chemistry, Univ Vienna Gerhard.f.ecker@univie.ac.at; www.openphacts.org The Innovative Medicines Initiative EC funded public-private

More information

Bioinformatics for Proteomics. Ann Loraine

Bioinformatics for Proteomics. Ann Loraine Bioinformatics for Proteomics Ann Loraine aloraine@uab.edu What is bioinformatics? The science of collecting, processing, organizing, storing, analyzing, and mining biological information, especially data

More information

The Gene Ontology Annotation (GOA) project application of GO in SWISS-PROT, TrEMBL and InterPro

The Gene Ontology Annotation (GOA) project application of GO in SWISS-PROT, TrEMBL and InterPro Comparative and Functional Genomics Comp Funct Genom 2003; 4: 71 74. Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cfg.235 Conference Review The Gene Ontology Annotation

More information

The Open PHArmacological Concepts Triple Store. Egon Willighagen Dept of Bioinformatics BiGCaT, Maastricht #oteu13

The Open PHArmacological Concepts Triple Store. Egon Willighagen Dept of Bioinformatics BiGCaT, Maastricht #oteu13 The Open PHArmacological Concepts Triple Store Egon Willighagen Dept of Bioinformatics BiGCaT, Maastricht University @egonwillighagen, #oteu13 The Innovative Medicines Initiative EC funded public-private

More information

Object Groups. SRI International Bioinformatics

Object Groups. SRI International Bioinformatics Object Groups 1 SRI International Bioinformatics Object Groups Collect and save lists of genes, metabolites, pathways Transform, filter, and analyze them Share groups with colleagues Use groups in conjunction

More information

The Open PHACTS Discovery Platform. Semantic Data Integration for Life Sciences

The Open PHACTS Discovery Platform. Semantic Data Integration for Life Sciences The Open PHACTS Discovery Platform Semantic Data Integration for Life Sciences Patent Expiry Generic Competition http://www.rsc.org/chemistryworld/issues/2009/january/pharmarefocusesonthepatent Cliff.asp

More information

IPA Advanced Training Course

IPA Advanced Training Course IPA Advanced Training Course Academia Sinica 2015 Oct Gene( 陳冠文 ) Supervisor and IPA certified analyst 1 Review for Introductory Training course Searching Building a Pathway Editing a Pathway for Publication

More information

Nordic Register and Biobank Data

Nordic Register and Biobank Data Nordic Register and Biobank Data A basis for innovative research on health and welfare Juni Palmgren Karolinska Institutet, Stockholm Nordic conference on Real World Data Helsinki November 2016 Nordic

More information

AN UNPRECEDENTLY LARGE-SCALE KINASE INHIBITOR SET ENABLING THE ACCURATE PREDICTION OF COMPOUND- KINASE ACTIVITIES: A WAY TOWARDS

AN UNPRECEDENTLY LARGE-SCALE KINASE INHIBITOR SET ENABLING THE ACCURATE PREDICTION OF COMPOUND- KINASE ACTIVITIES: A WAY TOWARDS Supporting Information AN UNPRECEDENTLY LARGE-SCALE KINASE INHIBITOR SET ENABLING THE ACCURATE PREDICTION OF COMPOUND- KINASE ACTIVITIES: A WAY TOWARDS SELECTIVE PROMISCUITY BY DESIGN? Serge Christmann-Franck

More information

PRIDE and ProteomeXchange

PRIDE and ProteomeXchange PRIDE and ProteomeXchange Henning Hermjakob Head of Molecular Systems European Bioinformatics Institute hhe@ebi.ac.uk Director of Bioinformatics National Center for Protein Sciences, Beijing Data resources

More information

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow Technical Overview Import VCF Introduction Next-generation sequencing (NGS) studies have created unanticipated challenges with

More information

Exercise1 ArrayExpress Archive - High-throughput sequencing example

Exercise1 ArrayExpress Archive - High-throughput sequencing example ArrayExpress and Atlas practical: querying and exporting gene expression data at the EBI Gabriella Rustici gabry@ebi.ac.uk This practical will introduce you to the data content and query functionality

More information

A White Paper on SCan- MarK Explorer The Sophic Cancer Biomarker Knowledge Environment

A White Paper on SCan- MarK Explorer The Sophic Cancer Biomarker Knowledge Environment A White Paper on SCan- MarK Explorer The Sophic Cancer Biomarker Knowledge Environment I. Abstract: The three- year SCan- MarK Explorer Phase I and II NCI Small Business Innovation Research (SBIR) Project

More information

ELE4120 Bioinformatics. Tutorial 5

ELE4120 Bioinformatics. Tutorial 5 ELE4120 Bioinformatics Tutorial 5 1 1. Database Content GenBank RefSeq TPA UniProt 2. Database Searches 2 Databases A common situation for alignment is to search through a database to retrieve the similar

More information

Applied Bioinformatics

Applied Bioinformatics Applied Bioinformatics Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Course overview What is bioinformatics Data driven science: the creation and advancement

More information

The University of California, Santa Cruz (UCSC) Genome Browser

The University of California, Santa Cruz (UCSC) Genome Browser The University of California, Santa Cruz (UCSC) Genome Browser There are hundreds of available userselected tracks in categories such as mapping and sequencing, phenotype and disease associations, genes,

More information

The Open Pharmacological Concepts Triple Store.

The Open Pharmacological Concepts Triple Store. The Open Pharmacological Concepts Triple Store www.openphacts.org Public Domain Drug Discovery Data: Pharma are accessing, processing, storing & re-processing Literature Patents PubChem Genbank Databases

More information

What has Open PHACTS done so far? Lee Harland On behalf of the Open PHACTS Consortium

What has Open PHACTS done so far? Lee Harland On behalf of the Open PHACTS Consortium What has Open PHACTS done so far? Lee Harland On behalf of the Open PHACTS Consortium Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data Literature

More information

Functional analysis using EBI Metagenomics

Functional analysis using EBI Metagenomics Functional analysis using EBI Metagenomics Contents Tutorial information... 2 Tutorial learning objectives... 2 An introduction to functional analysis using EMG... 3 What are protein signatures?... 3 Assigning

More information

ELIXIR connects national centres and EMBL EBI to build a sustainable European infrastructure for biological research data. medicine.

ELIXIR connects national centres and EMBL EBI to build a sustainable European infrastructure for biological research data. medicine. ELIXIR connects national centres and EMBL EBI to build a sustainable European infrastructure for biological research data. medicine agriculture 1 environment bioindustries ELIXIR underpins life science

More information

Overview of the next two hours...

Overview of the next two hours... Overview of the next two hours... Before tea Session 1, Browser: Introduction Ensembl Plants and plant variation data Hands-on Variation in the Ensembl browser Displaying your data in Ensembl After tea

More information

AP BIOLOGY. Investigation #3 Comparing DNA Sequences to Understand Evolutionary Relationships with BLAST. Slide 1 / 32. Slide 2 / 32.

AP BIOLOGY. Investigation #3 Comparing DNA Sequences to Understand Evolutionary Relationships with BLAST. Slide 1 / 32. Slide 2 / 32. New Jersey Center for Teaching and Learning Slide 1 / 32 Progressive Science Initiative This material is made freely available at www.njctl.org and is intended for the non-commercial use of students and

More information

ArrayExpress: Quick tour

ArrayExpress: Quick tour Melissa Burke [1] Gene Expression Beginner 0.5 hour This quick tour provides an overview of EMBL-EBI s functional genomics database ArrayExpress. This course was updated in December 2015. An undergraduate-level

More information

Interpreting Genome Data for Personalised Medicine. Professor Dame Janet Thornton EMBL-EBI

Interpreting Genome Data for Personalised Medicine. Professor Dame Janet Thornton EMBL-EBI Interpreting Genome Data for Personalised Medicine Professor Dame Janet Thornton EMBL-EBI Deciphering a genome 3 billion bases 4 million variants 21,000 coding variants 10,000 non-synonymous variants 50-100

More information

Deakin Research Online

Deakin Research Online Deakin Research Online This is the published version: Church, Philip, Goscinski, Andrzej, Wong, Adam and Lefevre, Christophe 2011, Simplifying gene expression microarray comparative analysis., in BIOCOM

More information

Genetics and Bioinformatics

Genetics and Bioinformatics Genetics and Bioinformatics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Lecture 1: Setting the pace 1 Bioinformatics what s

More information

Elixir: Overview, Progress and Futures

Elixir: Overview, Progress and Futures Elixir: Overview, Progress and Futures 20th Meeting of the EC-US Biotechnology Task Force Barcelona 3 June 2010 Andrew Lyall, ELIXIR Project Manager www.elixir-europe.org ELIXIR: a sustainable infrastructure

More information

WP4: Data services and reporting standards

WP4: Data services and reporting standards WP4: Data services and reporting standards WP leader Guy Cochrane EMBL-EBI WP co-leader Nils-Peder Willassen UiT Task 4.1 leader Jeena Rajan EMBL-EBI Task 4.2 leader Jeena Rajan EMBL-EBI Task 4.3 leader

More information

Targeting of the disease related proteome by small molecules

Targeting of the disease related proteome by small molecules Targeting of the disease related proteome by small molecules Workshop on chemical information 2018 EPFL Modest v. Korff utline Introduction Data sources Diseases Proteins Compounds Data relations Diseases

More information

Using semantic web technology to accelerate plant breeding.

Using semantic web technology to accelerate plant breeding. Using semantic web technology to accelerate plant breeding. Pierre-Yves Chibon 1,2,3, Benoît Carrères 1, Heleena de Weerd 1, Richard G. F. Visser 1,2,3, and Richard Finkers 1,3 1 Wageningen UR Plant Breeding,

More information

Following text taken from Suresh Kumar. Bioinformatics Web - Comprehensive educational resource on Bioinformatics. 6th May.2005

Following text taken from Suresh Kumar. Bioinformatics Web - Comprehensive educational resource on Bioinformatics. 6th May.2005 Bioinformatics is the recording, annotation, storage, analysis, and searching/retrieval of nucleic acid sequence (genes and RNAs), protein sequence and structural information. This includes databases of

More information

Knowledge-Guided Analysis with KnowEnG Lab

Knowledge-Guided Analysis with KnowEnG Lab Han Sinha Song Weinshilboum Knowledge-Guided Analysis with KnowEnG Lab KnowEnG Center Powerpoint by Charles Blatti Knowledge-Guided Analysis KnowEnG Center 2017 1 Exercise In this exercise we will be doing

More information

TARGET VALIDATION. Maaike Everts, PhD (with slides from Dr. Suto)

TARGET VALIDATION. Maaike Everts, PhD (with slides from Dr. Suto) TARGET VALIDATION Maaike Everts, PhD (with slides from Dr. Suto) Drug Discovery & Development Source: http://dlab.cl/molecular-design/drug-discovery-phases/ How do you identify a target? Target: the naturally

More information

EBI web resources I: databases and tools. Yanbin Yin Spring 2013

EBI web resources I: databases and tools. Yanbin Yin Spring 2013 EBI web resources I: databases and tools Yanbin Yin Spring 2013 1 Outline Intro to EBI Databases and web tools UniProt Gene Ontology Hands on PracBce MOST MATERIALS ARE FROM: hkp://www.ebi.ac.uk/training/online/course-

More information

Motif Discovery in Drosophila

Motif Discovery in Drosophila Motif Discovery in Drosophila Wilson Leung Prerequisites Annotation of Transcription Start Sites in Drosophila Resources Web Site FlyBase The MEME Suite FlyFactorSurvey Web Address http://flybase.org http://meme-suite.org/

More information

Capabilities & Services

Capabilities & Services Capabilities & Services Accelerating Research & Development Table of Contents Introduction to DHMRI 3 Services and Capabilites: Genomics 4 Proteomics & Protein Characterization 5 Metabolomics 6 In Vitro

More information

dixa-a data infrastructure for chemical safety Jos Kleinjans Dept of Toxicogenomics Maastricht University

dixa-a data infrastructure for chemical safety Jos Kleinjans Dept of Toxicogenomics Maastricht University dixa-a data infrastructure for chemical safety Jos Kleinjans Dept of Toxicogenomics Maastricht University Big data in toxicogenomics The amountof data in ourworldhas been exploding. And the abilityto store,

More information

Product Applications for the Sequence Analysis Collection

Product Applications for the Sequence Analysis Collection Product Applications for the Sequence Analysis Collection Pipeline Pilot Contents Introduction... 1 Pipeline Pilot and Bioinformatics... 2 Sequence Searching with Profile HMM...2 Integrating Data in a

More information

ArrayExpress and Gene Expression Atlas: Mining Functional Genomics data

ArrayExpress and Gene Expression Atlas: Mining Functional Genomics data ArrayExpress and Gene Expression Atlas: Mining Functional Genomics data Functional Genomics Team EBI-EMBL emma@ebi.ac.uk http://www.ebi.ac.uk/~emma/bcn_2012/ Talk structure Why do we need a database for

More information

Bioinformatics: Present & Future from an EBI Perspective. Janet Thornton. UCL February 14 th 2012

Bioinformatics: Present & Future from an EBI Perspective. Janet Thornton. UCL February 14 th 2012 Bioinformatics: Present & Future from an EBI Perspective Janet Thornton UCL February 14 th 2012 What is bioinformatics? The science of storing, retrieving and analysing large amounts of biological information

More information

Sequence Databases and database scanning

Sequence Databases and database scanning Sequence Databases and database scanning Marjolein Thunnissen Lund, 2012 Types of databases: Primary sequence databases (proteins and nucleic acids). Composite protein sequence databases. Secondary databases.

More information

Browsing Genes and Genomes with Ensembl

Browsing Genes and Genomes with Ensembl Browsing Genes and Genomes with Ensembl Emily Perry Ensembl Outreach Project Leader EMBL-EBI Objectives What is Ensembl? What type of data can you get in Ensembl? How to navigate the Ensembl browser website.

More information

The Open PHACTS Project: Progress and Future Sustainability

The Open PHACTS Project: Progress and Future Sustainability The Open PHACTS Project: Progress and Future Sustainability Lee Harland & Bryn Williams-Jones Open PHACTS / ConnectedDiscovery Tom Plasterer AstraZeneca/Open PHACTS Rep Fundamental issue: There is a *lot*

More information

Large Scale Enzyme Func1on Discovery: Sequence Similarity Networks for the Protein Universe

Large Scale Enzyme Func1on Discovery: Sequence Similarity Networks for the Protein Universe Large Scale Enzyme Func1on Discovery: Sequence Similarity Networks for the Protein Universe Boris Sadkhin University of Illinois, Urbana-Champaign Blue Waters Symposium May 2015 Overview The Protein Sequence

More information

Sequence Based Function Annotation

Sequence Based Function Annotation Sequence Based Function Annotation Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Sequence Based Function Annotation 1. Given a sequence, how to predict its biological

More information

Bioinformatics for Cell Biologists

Bioinformatics for Cell Biologists Bioinformatics for Cell Biologists 15 19 March 2010 Developmental Biology and Regnerative Medicine (DBRM) Schedule Monday, March 15 09.00 11.00 Introduction to course and Bioinformatics (L1) D224 Helena

More information

From AP investigative Laboratory Manual 1

From AP investigative Laboratory Manual 1 Comparing DNA Sequences to Understand Evolutionary Relationships. How can bioinformatics be used as a tool to determine evolutionary relationships and to better understand genetic diseases? BACKGROUND

More information

The European Bioinformatics Institute - Cambridge. Overview

The European Bioinformatics Institute - Cambridge. Overview The European Bioinformatics Institute - Cambridge Overview European Bioinformatics Institute (EMBL-EBI) Wellcome Genome Campus Hinxton, Cambridge CB10 1SD United Kingdom y www.ebi.ac.uk C +44 (0)1223 494

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS Introduction to BIOINFORMATICS Antonella Lisa CABGen Centro di Analisi Bioinformatica per la Genomica Tel. 0382-546361 E-mail: lisa@igm.cnr.it http://www.igm.cnr.it/pagine-personali/lisa-antonella/ What

More information

Minipig genome data aid species selection in pharmaceutical discovery and development MRF Rome, th November 2013

Minipig genome data aid species selection in pharmaceutical discovery and development MRF Rome, th November 2013 Minipig genome data aid species selection in pharmaceutical discovery and development MRF Rome, 18-19 th November 2013 Peter Woollard Computational Biology, GlaxoSmithKline Property of GlaxoSmithKline

More information

Patent PHACTS* George Papadatos. ChEMBL Group *patens (adj.): Latin for open, accessible

Patent PHACTS* George Papadatos. ChEMBL Group *patens (adj.): Latin for open, accessible Patent PHACTS* George Papadatos ChEMBL Group georgep@ebi.ac.uk *patens (adj.): Latin for open, accessible Patent annotations in Open PHACTS Huge amount of knowledge hidden in patent corpus Most of which

More information

Introduc)on to Databases and Resources Biological Databases and Resources

Introduc)on to Databases and Resources Biological Databases and Resources Introduc)on to Bioinforma)cs Online Course : IBT Introduc)on to Databases and Resources Biological Databases and Resources Learning Objec)ves Introduc)on to Databases and Resources - Understand how bioinforma)cs

More information

Beyond Text Mining: BRAIN. August 11 th, 2014/ACS

Beyond Text Mining: BRAIN. August 11 th, 2014/ACS Beyond Text Mining: BRAIN August 11 th, 2014/ACS Topics Market Drivers Euretos BRAIN Use cases Close 2 Key drivers The Data Tsunami Datarrhoeia Standards? Needle Transport DIY Data 3 Data: The new oil

More information

Protein Sequence Analysis. BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl)

Protein Sequence Analysis. BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl) Protein Sequence Analysis BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl) Linear Sequence Analysis What can you learn from a (single) protein sequence? Calculate it s physical

More information

Basic Bioinformatics: Homology, Sequence Alignment,

Basic Bioinformatics: Homology, Sequence Alignment, Basic Bioinformatics: Homology, Sequence Alignment, and BLAST William S. Sanders Institute for Genomics, Biocomputing, and Biotechnology (IGBB) High Performance Computing Collaboratory (HPC 2 ) Mississippi

More information

Polo Bibliotecario di Scienze, Farmacologia e Scienze Farmaceutiche. Scuola di Dottorato di Ricerca in Scienze Molecolari

Polo Bibliotecario di Scienze, Farmacologia e Scienze Farmaceutiche. Scuola di Dottorato di Ricerca in Scienze Molecolari Polo Bibliotecario di Scienze, Farmacologia e Scienze Farmaceutiche Scuola di Dottorato di Ricerca in Scienze Molecolari Open research data Scuola di Dottorato di Ricerca in Scienze Molecolari Information

More information

Data Retrieval from GenBank

Data Retrieval from GenBank Data Retrieval from GenBank Peter J. Myler Bioinformatics of Intracellular Pathogens JNU, Feb 7-0, 2009 http://www.ncbi.nlm.nih.gov (January, 2007) http://ncbi.nlm.nih.gov/sitemap/resourceguide.html Accessing

Information Extraction from Biomedical Text

Information Extraction from Biomedical Text Information Extraction from Biomedical Text BMI/CS 776 www.biostat.wisc.edu/bmi776/ Mark Craven craven@biostat.wisc.edu Spring 2009 The Information Extraction Task: Named Entity Recognition Analysis of

Quo vadis Drug Discovery

Quo vadis Drug Discovery Quo vadis Drug Discovery Donatella Verbanac, Dubravko Jelić, Sanja Koštrun, Višnja Stepanić, Dinko Žiher PLIVA - Research Institute, Ltd. Prilaz baruna Filipovića 29, 10000 Zagreb, CROATIA The Pharmaceutical

Kyoto Encyclopedia of Genes and Genomes (KEGG)

Kyoto Encyclopedia of Genes and Genomes (KEGG) NPTEL Biotechnology -Systems Biology Kyoto Encyclopedia of Genes and Genomes (KEGG) Dr. M. Vijayalakshmi School of Chemical and Biotechnology SASTRA University Joint Initiative of IITs and IISc Funded

Genome Informatics. Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11th, Kiyoko F. Aoki-Kinoshita

Genome Informatics. Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11th, Kiyoko F. Aoki-Kinoshita Genome Informatics Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11th, 2008 Kiyoko F. Aoki-Kinoshita Introduction Genome informatics covers the computer-based modeling and data processing

Henning Hermjakob Proteomics Services Team European Bioinformatics Institute. From Integrating Standards to Integrating Data: BioMAP and MIBBI

Henning Hermjakob Proteomics Services Team European Bioinformatics Institute. From Integrating Standards to Integrating Data: BioMAP and MIBBI Henning Hermjakob Proteomics Services Team European Bioinformatics Institute From Integrating Standards to Integrating Data: BioMAP and MIBBI BioMAP Background CarcinoGenomics (EU IP) project EU directives

Introduction to Bioinformatics CPSC 265. What is bioinformatics? Textbooks

Introduction to Bioinformatics CPSC 265. What is bioinformatics? Textbooks Introduction to Bioinformatics CPSC 265 Thanks to Jonathan Pevsner, Ph.D. Textbooks Jonathan Pevsner, who I stole most of these slides from (thanks!), has written a textbook, Bioinformatics and Functional

Minimum Information About a Microarray Experiment (MIAME) Successes, Failures, Challenges

Minimum Information About a Microarray Experiment (MIAME) Successes, Failures, Challenges Opinion TheScientificWorldJOURNAL (2009) 9, 420-423 TSW Development & Embryology ISSN 1537-744X; DOI 10.1100/tsw.2009.57 Minimum Information About a Microarray Experiment (MIAME) Successes, Failures, Challenges

ONLINE BIOINFORMATICS RESOURCES

ONLINE BIOINFORMATICS RESOURCES Dedan Githae Email: d.githae@cgiar.org BecA-ILRI Hub; Nairobi, Kenya 16 May, 2014 ONLINE BIOINFORMATICS RESOURCES Introduction to Molecular Biology and Bioinformatics (IMBB) 2014 The larger picture.. Lower

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu. handouts, papers, datasets

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu. handouts, papers, datasets Ensembl workshop Thomas Randall, PhD tarandal@email.unc.edu bioinformatics.unc.edu www.unc.edu/~tarandal/ensembl handouts, papers, datasets Ensembl is a joint project between EMBL-EBI and the Sanger

The human gene encoding Glucose-6-phosphate dehydrogenase (G6PD) is located on chromosome X in cytogenetic band q28.

The human gene encoding Glucose-6-phosphate dehydrogenase (G6PD) is located on chromosome X in cytogenetic band q28. Data mining in Ensembl with BioMart Worked Example The human gene encoding Glucose-6-phosphate dehydrogenase (G6PD) is located on chromosome X in cytogenetic band q28. Which other genes related to human

Clustering and scoring molecular interactions

Clustering and scoring molecular interactions Clustering and scoring molecular interactions relying on community standards Rafael C. Jimenez rafael@ebi.ac.uk EBI is an Outstation of the European Molecular Biology Laboratory. Sharing infrastructures

MARINE BIOINFORMATICS & NANOBIOTECHNOLOGY - PBBT305

MARINE BIOINFORMATICS & NANOBIOTECHNOLOGY - PBBT305 MARINE BIOINFORMATICS & NANOBIOTECHNOLOGY - PBBT305 UNIT-1 MARINE GENOMICS AND PROTEOMICS 1. Define genomics? 2. Scope and functional genomics? 3. What is Genetics? 4. Define functional genomics? 5. What

Function Prediction of Proteins from their Sequences with BAR 3.0

Function Prediction of Proteins from their Sequences with BAR 3.0 Open Access Annals of Proteomics and Bioinformatics Short Communication Function Prediction of Proteins from their Sequences with BAR 3.0 Giuseppe Profiti 1,2, Pier Luigi Martelli 2 and Rita Casadio 2

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology. G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY Methods or systems for genetic

The Need for Scientific. Data Annotation. Alick K Law, Ph.D., M.B.A. Marketing Manager IBM Life Sciences.

The Need for Scientific. Data Annotation. Alick K Law, Ph.D., M.B.A. Marketing Manager IBM Life Sciences. The Need for Scientific Data Annotation Alick K Law, Ph.D., M.B.A. Marketing Manager IBM Life Sciences alaw@us.ibm.com Cross disciplinary research approach requires organizations to address diverse needs

What You NEED to Know

What You NEED to Know What You NEED to Know Major DNA Databases NCBI RefSeq EBI DDBJ Protein Structural Databases PDB SCOP CCDC Major Protein Sequence Databases UniprotKB Swissprot PIR TrEMBL Genpept Other Major Databases MIM

Protein Bioinformatics Part I: Access to information

Protein Bioinformatics Part I: Access to information Protein Bioinformatics Part I: Access to information 260.655 April 6, 2006 Jonathan Pevsner, Ph.D. pevsner@kennedykrieger.org Outline [1] Proteins at NCBI RefSeq accession numbers Cn3D to visualize structures

GS Analysis of Microarray Data

GS Analysis of Microarray Data GS01 0163 Analysis of Microarray Data Keith Baggerly and Brad Broom Department of Bioinformatics and Computational Biology UT M. D. Anderson Cancer Center kabagg@mdanderson.org bmbroom@mdanderson.org 8

GS Analysis of Microarray Data

GS Analysis of Microarray Data GS01 0163 Analysis of Microarray Data Keith Baggerly and Brad Broom Department of Bioinformatics and Computational Biology UT M. D. Anderson Cancer Center kabagg@mdanderson.org bmbroom@mdanderson.org 7

Bioinformatics to chemistry to therapy: Some case studies deriving information from the literature

Bioinformatics to chemistry to therapy: Some case studies deriving information from the literature Bioinformatics to chemistry to therapy: Some case studies deriving information from the literature. Donald Walter August 22, 2007 The Typical Drug Development Paradigm Gary Thomas, Medicinal Chemistry:

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics IMBB 2017 RAB, Kigali - Rwanda May 02 13, 2017 Joyce Nzioki Plan for the Week Introduction to Bioinformatics Raw sanger sequence data Introduction to CLC Bio Quality Control

The RNA tools registry

The RNA tools registry University of Copenhagen, Faculty of Health and Medical Sciences The RNA tools registry A community effort to catalog RNA bioinformatics resources and their relationships Anne Wenzel

Web-based Bioinformatics Applications in Proteomics

Web-based Bioinformatics Applications in Proteomics Web-based Bioinformatics Applications in Proteomics Chiquito Crasto ccrasto@genetics.uab.edu January 30, 2009 NCBI (National Center for Biotechnology Information) http://www.ncbi.nlm.nih.gov/ 1 Pubmed

GS Analysis of Microarray Data

GS Analysis of Microarray Data GS01 0163 Analysis of Microarray Data Keith Baggerly and Kevin Coombes Department of Bioinformatics and Computational Biology UT M. D. Anderson Cancer Center kabagg@mdanderson.org kcoombes@mdanderson.org

NCBI web resources I: databases and Entrez

NCBI web resources I: databases and Entrez NCBI web resources I: databases and Entrez Yanbin Yin Most materials are downloaded from ftp://ftp.ncbi.nih.gov/pub/education/ 1 Homework assignment 1 Two parts: Extract the gene IDs reported in table

Biological Interpretation of Metabolomics Data. Martina Kutmon Maastricht University

Biological Interpretation of Metabolomics Data. Martina Kutmon Maastricht University Biological Interpretation of Metabolomics Data Martina Kutmon Maastricht University Contents Background on pathway analysis WikiPathways Building Research Communities on Biological Pathways Data Analysis

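Drug Discovery Research through the Fusion of Data Integration and Computation. Kenji Mizuguchi.

Drug Discovery Research through the Fusion of Data Integration and Computation. Kenji Mizuguchi. Drug Discovery Research through the Fusion of Data Integration and Computation Kenji Mizuguchi http://mizuguchilab.org kenji@nibiohn.go.jp Outline - Phenotypic and target-based screening with ligand-based and structure-based drug design - Systems biology based approach -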

EMBL-EBI Industry Programme

EMBL-EBI Industry Programme EMBL-EBI Industry Programme Jennifer McDowall Dominic Clark, Industry Programme Manager, clark@ebi.ac.uk. 4th November 2010, Madrid Overview In this presentation, I will start off with a short overview
