Genomics: Genome Browsing & Annotation

1 Genomics: Genome Browsing & Annotation
Lecture 4 of 4: Introduction to BioMart
Dr Colleen J. Saunders, PhD
South African National Bioinformatics Institute/MRC Unit for Bioinformatics Capacity Development, University of the Western Cape, South Africa
Introduction to Bioinformatics online course: IBT_2016

2 Learning Outcomes
At the end of this lecture and the practical assignment, you will be able to:
Use the Ensembl project's BioMart to extract and retrieve sequence and annotation data using different filtering strategies

3 Retrieving data sets
During research you need simple ways to select and retrieve specific data.
The volume of data makes manual retrieval impractical!
NCBI - Entrez query system
UCSC - Table Browser
Ensembl project - BioMart: allows you to retrieve annotated gene and variant data and sequences

4 Retrieving Data using Ensembl's BioMart
Ensembl: developed by EMBL-EBI & the Sanger Institute. Produces and maintains annotation of eukaryotic genomes; the information is freely available online.
BioMart: a user-friendly, web-based tool that allows extraction of data from Ensembl; no programming knowledge is required.
Navigate through the BioMart interface using filters.

5 Retrieving Data using Ensembl's BioMart http://

6 BioMart Databases: Type of Data
Ensembl Genes - retrieving Ensembl genes, transcripts & proteins; external references, protein domains, gene structure, sequences, variants, homology data
Ensembl Variation - retrieving germline & somatic variants, & structural variants; phenotypes, variant consequences, flanking sequences; genes, transcripts & regulatory features mapped to variants
Ensembl Regulation - retrieving regulatory features, miRNA targets, binding motifs
Vega - contains the Ensembl-Vega gene set

7 BioMart Datasets: Species data
Select the dataset specific to your species of interest.

8 BioMart Filters: Restricting your query
Restrict your query using input data, for example:
a list of HGNC gene symbols or dbSNP rsIDs
a chromosomal region
Multiple filters can be combined, but not all filters work together.
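
The filters described above can also be written down as a BioMart XML query, the form the web interface can export. The following is a minimal sketch, not part of the original slides; the dataset name `hsapiens_gene_ensembl`, the filter names `hgnc_symbol` and `chromosome_name`, the attribute names, and the helper `build_biomart_query` are assumptions made for illustration and should be checked against the current BioMart interface. The code only builds the query text and makes no network request.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Return a BioMart-style XML query string (hypothetical helper)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",   # tab-separated output
        "header": "0",        # no header row
        "uniqueRows": "1",    # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Each filter restricts the query; list values are joined with commas.
    for name, value in filters.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Each attribute becomes one output column.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Assumed dataset, filter, and attribute names; verify them in BioMart.
query_xml = build_biomart_query(
    dataset="hsapiens_gene_ensembl",
    filters={"hgnc_symbol": ["BRCA1", "TP53"], "chromosome_name": "17"},
    attributes=["ensembl_gene_id", "ensembl_transcript_id"],
)
print(query_xml)
```

Combining a gene-symbol list with a chromosome filter mirrors the "multiple filters" point on the slide; a combination that BioMart does not support would have to be detected in the interface itself, since this sketch performs no validation.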

9 Retrieving Data using Ensembl's BioMart

10 Retrieving Data using Ensembl's BioMart
Example: How many genes are on the long arm (q) of chr 9?
Count button: returns the number of Ensembl genes that match the query filters.
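
The Count example can be approximated outside the browser by sending a query with the `count` flag set. The sketch below is an illustration under several assumptions that are not in the slides: the endpoint `https://www.ensembl.org/biomart/martservice`, the filter names `band_start` and `band_end`, and the band values `q11` and `q34.3` for the ends of the long arm of chromosome 9. It only constructs the request URL and does not contact the server.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed BioMart web-service endpoint; confirm before use.
MARTSERVICE = "https://www.ensembl.org/biomart/martservice"


def count_query_xml(chromosome, band_start, band_end):
    """Build an XML query asking for a count of matching genes."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="TSV" header="0" '
        'uniqueRows="0" count="1" datasetConfigVersion="0.6">'
        '<Dataset name="hsapiens_gene_ensembl" interface="default">'
        f'<Filter name="chromosome_name" value="{chromosome}"/>'
        f'<Filter name="band_start" value="{band_start}"/>'  # assumed filter name
        f'<Filter name="band_end" value="{band_end}"/>'      # assumed filter name
        '<Attribute name="ensembl_gene_id"/>'
        "</Dataset></Query>"
    )


def count_query_url(chromosome, band_start, band_end):
    """URL-encode the XML query as the 'query' parameter."""
    xml = count_query_xml(chromosome, band_start, band_end)
    return MARTSERVICE + "?" + urlencode({"query": xml})


url = count_query_url("9", "q11", "q34.3")
# Decode the URL again to confirm the query survives encoding unchanged.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(url)
print(decoded == count_query_xml("9", "q11", "q34.3"))
```

The resulting URL could be opened with `urllib.request.urlopen`; the number returned depends on the Ensembl release, so no expected count is given here.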

11 BioMart Attributes: Specifying the output
Select the attributes of your desired output.
The default output is "Ensembl Gene ID" & "Ensembl Transcript ID".
Attributes are organised in categories, which differ between datasets.
Only one category of attributes can be selected per query.
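
Each selected attribute becomes one column of the exported table. As a rough sketch of working with the default two-column output, the code below reads a small tab-separated sample and groups transcript IDs by gene ID. The sample text is a hand-written illustration, not a real BioMart download, and the helper `transcripts_by_gene` is hypothetical; the column headers follow the names used on the slide and may differ in current Ensembl releases.

```python
import csv
import io
from collections import defaultdict

# Hand-written illustration of a TSV export with the two default attributes.
SAMPLE_TSV = (
    "Ensembl Gene ID\tEnsembl Transcript ID\n"
    "ENSG00000141510\tENST00000269305\n"
    "ENSG00000012048\tENST00000357654\n"
)


def transcripts_by_gene(tsv_text):
    """Group transcript IDs under their gene ID (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["Ensembl Gene ID"]].append(row["Ensembl Transcript ID"])
    return dict(grouped)


result = transcripts_by_gene(SAMPLE_TSV)
print(result)
```

Selecting further attributes in BioMart would add columns to the file, and the same `csv.DictReader` approach would expose them by header name.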

12 BioMart Attributes: Specifying the output

13 BioMart Attributes: Specifying the output

14 BioMart: Getting the results

15 BioMart: Help & tutorials
How to use BioMart - tutorial: http://
If you are comfortable programming in Perl, you can use the BioMart Perl API to integrate BioMart code into your scripts: http://

16 Acknowledgements: These slides were produced by Dr Colleen J. Saunders for the 2016 H3ABioNet Introduction to Bioinformatics Online Course and are distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International licence. Dr Saunders is supported by a research fellowship funded by the South African Department of Science and Technology and the National Research Foundation. This fellowship is held at the South African National Bioinformatics Institute, which houses the South African MRC Unit for Bioinformatics Capacity Development.