Tutorial section. VEGA, the genome browser with a difference

Keywords: vertebrate, annotation, database, manual, curation

Abstract

The Vertebrate Genome Annotation (Vega) database is a community resource for browsing manual annotation of finished sequence from a variety of vertebrate genomes (vega.sanger.ac.uk). Vega differs from other genome browsers in having a standardised classification of genes which encompasses pseudogenes and non-coding transcripts. The data are manually curated; manual curation is currently more accurate than automated methods at identifying splice variants, pseudogenes, poly(A) features, non-coding genes and complex gene structures and arrangements. The database also contains annotation from regions, not just whole genomes, and displays annotation from multiple species (human, mouse, dog and zebrafish) for comparative analysis. Vega encourages community feedback, which results in annotation updates, and the submission of manual annotation of finished vertebrate sequence.

Since completion of the draft human genome sequence [2] and the subsequent finishing of this sequence, many different genome browsers have been developed to enable scientists to access genome data. The initial interpretation of the human genome was through automated annotation, as provided by the Ensembl [4] and UCSC [5] genome browsers. An automated approach to the analysis of genomes currently has limits, for example in identifying unprocessed pseudogenes in duplicated regions, and so there is still a need for manual intervention. As the genome sequence became finished, quality curated browsers such as MapView [6,7] and H-InvDB [8,9] were developed. The Vertebrate Genome Annotation (Vega) database [10] is a community resource for browsing manual annotation of finished sequence from a variety of vertebrate genomes [11]. Vega is based on the Ensembl schema, with gene objects shown in shades of blue, and also incorporates curation-specific data.
The database allows users to view the manual annotation provided by the Havana group at the Wellcome Trust Sanger Institute (WTSI) [12], IMB-Jena, the Joint Genome Institute, Genoscope and Washington University. It currently contains the manual annotation of ten human chromosomes (6, 7, 9, 10, 13, 14, 20, 22, X and Y). As the genome sequencing centres publish the annotation and analysis of their chromosomes, the data will become accessible in Vega.

Why is Vega different from other browsers?

It has a standardised classification of genes which encompasses pseudogenes and non-coding transcripts.
Poly(A) sites/signals are annotated.
The data are manually curated.
The data are periodically updated.
It contains annotation of haplotypes.
Single nucleotide polymorphisms (SNPs) are mapped to manual curation.
It is multispecies, and small regions of finished sequence can be submitted and annotated as well as whole genomes.
It encourages community feedback, which results in annotation updates.

© HENRY STEWART PUBLICATIONS BRIEFINGS IN BIOINFORMATICS. VOL 6. NO JUNE 2005

Table 1: Vega annotation definitions

Known: Identical to human cDNA or protein sequences in the Entrez Gene database ( query.fcgi?db=gene/)
Novel: Have an open reading frame and are identical or homologous to known vertebrate cDNAs and/or proteins from all species
Novel transcript: Similar to novel gene but no open reading frame or open reading frame ambiguous
Putative: Homologous to spliced vertebrate expressed sequence tags (ESTs) with no significant open reading frame
Pseudogene: Homologous to protein sequences with a disrupted CDS, and an active gene can be found at another locus
Predicted: Based on ab initio prediction for which at least one exon is supported by biological data (unspliced ESTs, protein sequence similarity with mouse or tetraodon genomes). Only used in chromosome 14
Ig segment: Immunoglobulin gene segments
Ig pseudogene segment: Inactivated immunoglobulin segment

GENE CLASSIFICATION

A standardised set of definitions has been used to categorise the annotation of the different gene features (Table 1). Irrespective of the category to which a gene object has been assigned, all annotated gene structures are supported by homology to cDNAs, expressed sequence tags (ESTs) or protein sequences.

GENE NAMING

It is important to use the correct gene nomenclature to maintain consistency in the annotation database, especially when comparing haplotypic or syntenic regions. The Vega annotators interact closely with the nomenclature committees from the Human Genome Organisation (HUGO, HGNC) [13], Zebrafish Information Network (ZFIN) [14] and Mouse Genome Database (MGD) [15]. If an approved symbol is not available for a gene locus, an interim identifier is used in the format of the international clone identifier followed by a number, eg RP11-695B14.2. All loci and their associated transcripts and exons are given stable versioned database IDs (eg IDs with the prefix OTTHUMG) that are generated and tracked in the Otter database [16] that underlies Vega (see Figure 1). Whenever a locus is edited, the version number increases and the date of the change is saved.

Figure 1: Curated Locus Report giving information about the NFB1 locus on chromosome 9. Callouts in the figure: official HUGO ID; gene last modified date; stable Otter ID for gene locus; splice variants (7 coding, 1 non-coding), each with stable Otter transcript ID.

MAIN FEATURES OF VEGA

Manual annotation is currently more accurate than automated methods at identifying splice variants, pseudogenes, polyadenylation features, non-coding genes, complex gene arrangements and clusters.

Splice variants account for approximately 50 per cent of gene loci in finished chromosomes 9, 10 and X, with an average of 2.5 alternative transcripts per locus. Splice variants must be supported by splicing EST/cDNA evidence, but the presence of a coding sequence (CDS) is not essential. Hence the majority of variants are annotated without a CDS: they are non-coding but have canonical splice sites. ESTs and cDNAs from different species are also used as evidence to predict alternative transcripts, as genome comparison studies have shown that gene structures are generally conserved between human and mouse [17].

Pseudogenes are defined as non-functional copies of genes and are categorised in Vega into unprocessed and processed pseudogenes (displayed in two shades of grey). They are generated by either of two mechanisms: retrotransposition or duplication of genomic DNA. Those that arise from retrotransposition are called processed pseudogenes [18]; they have no 5′ promoter sequence or introns but generally have an integrated poly(A) tail at the 3′ end that often retains the poly(A) signal. Unprocessed pseudogenes have arisen from genomic duplication; they often have a structure that is very similar to the ancestral gene and may even splice correctly. The majority of pseudogenes of both types contain frameshifts and/or stop codons in the coding region. Pseudogenes are valuable in annotation as they have been implicated in human disease [19] and can be used to study evolution.

Poly(A) sites/signals are annotated and may be browsed in Vega. In ContigView, poly(A) signals are displayed in light red and poly(A) sites in dark red. Alternative polyadenylation appears to affect many higher eukaryotes, mainly in a tissue-dependent manner, and may be implicated in disease [20]. All poly(A) features are checked manually, using large numbers of ESTs marking out the 3′ ends of genes and the fact that signals (of which there are 10 variants in human [21]) are usually found within 60 bases of the poly(A) site.

SNPs can be viewed in ContigView and are mapped from the Glovar database [22] onto the clones within Vega. Glovar contains all the data from dbSNP together with SNPs found from comparisons of the trace repository [23] with the current genome build. Using Vega annotation, SNPs are classified as coding (red), untranslated region (pink), intronic (blue) or other (grey).
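The four-way SNP classification described above can be illustrated with a minimal Python sketch. The `Transcript` class, the `classify_snp` function and the coordinates are hypothetical and are not the Vega or Glovar implementation; the sketch simply assumes inclusive genomic coordinates, a single transcript, and that an exonic SNP in a transcript without a CDS falls into "other". Only the class names and display colours come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Display colours quoted in the text for each SNP class.
SNP_COLOURS = {"coding": "red", "utr": "pink", "intronic": "blue", "other": "grey"}


@dataclass
class Transcript:
    """Hypothetical, simplified transcript model (not the Vega/Ensembl schema)."""
    exons: List[Tuple[int, int]]            # inclusive (start, end) coordinates
    cds: Optional[Tuple[int, int]] = None   # inclusive CDS span, None if non-coding


def classify_snp(pos: int, transcript: Transcript) -> str:
    """Return 'coding', 'utr', 'intronic' or 'other' for a SNP position."""
    start = min(s for s, _ in transcript.exons)
    end = max(e for _, e in transcript.exons)
    if not start <= pos <= end:
        return "other"                      # outside the transcript altogether
    if not any(s <= pos <= e for s, e in transcript.exons):
        return "intronic"                   # inside the transcript, between exons
    if transcript.cds is None:
        return "other"                      # assumption: no CDS, so no coding/UTR split
    cds_start, cds_end = transcript.cds
    if cds_start <= pos <= cds_end:
        return "coding"                     # exonic and within the CDS span
    return "utr"                            # exonic but outside the CDS span


if __name__ == "__main__":
    demo = Transcript(exons=[(100, 200), (300, 400)], cds=(150, 350))
    for snp_pos in (120, 160, 250, 380, 500):
        snp_class = classify_snp(snp_pos, demo)
        print(snp_pos, snp_class, SNP_COLOURS[snp_class])
```

A real annotation set has many overlapping transcripts per locus, so a production classifier would need a rule for combining the per-transcript results; the sketch leaves that choice open.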
ACCESSING AND QUERYING DATA

As the Vega browser is based on Ensembl web code, it has similar standard entry points such as keyword search and similarity searching (BLAST, SSAHA). ExportView can be used to download data in formats such as FastA, General Feature Format (GFF) and flat files. There is also direct access to annotation via a Distributed Annotation System (DAS) server. If required, the Ensembl API [24] can be used to perform more comprehensive searches of the Vega data. Vega genes mapped to the current genome assembly can also be downloaded from Ensembl using EnsMart.

MHC HAPLOTYPE ANNOTATION

Unlike other browsers, Vega can also contain annotation from regions, not just whole chromosomes. Regions available include the COX haplotype of the major histocompatibility complex (MHC) on human chromosome 6, with more haplotypes to follow [25].

ACCESSING MULTISPECIES ANNOTATION IN VEGA

Vega can display annotation from multiple species for comparative analysis. The mouse annotation browser contains selected regions such as the Del36H deletion region on chromosome 13 and the insulin-dependent diabetes (IDD) susceptibility loci regions. The latter are annotated in both the reference mouse strain (C57BL/6) and the non-obese diabetic (NOD) strain [26].

The zebrafish genome is being sequenced in its entirety at the Sanger Institute, and Vega will be the main site for browsing the manually curated data. The reference is the Tuebingen strain, and Vega currently displays chromosomes/linkage groups 1-25 plus one artificial chromosome, U, that contains all clones with unknown chromosomal locations. The AB chromosome displays clones from the AB strain. Manual annotation is added on a monthly basis, and clones which have not yet been annotated (displayed in grey) are shown with features from automated computational analysis (repeat masking, BLAST searches, etc).

Recently the finished sequence of the MHC (DLA) class II region from the Doberman dog breed has been annotated and is available in Vega [27]. The sequence displays a high level of conservation with the human, cat and mouse class II regions.
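For readers who download annotation in GFF through ExportView (see ACCESSING AND QUERYING DATA), the following Python sketch shows one way such a file might be read and summarised. It assumes the conventional tab-separated GFF layout (seqname, source, feature, start, end, score, strand, frame, and an optional attributes column); the sample records, names such as `demo_source`, and the helper functions are invented for illustration and are not taken from Vega output.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class GffFeature(NamedTuple):
    """One GFF record; score and frame are kept as text because '.' is allowed."""
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: str
    strand: str
    frame: str
    attributes: str


def parse_gff(lines: Iterable[str]) -> Iterator[GffFeature]:
    """Yield a GffFeature per data line, skipping blank lines and '#' comments."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            raise ValueError(f"expected at least 8 tab-separated fields: {line!r}")
        attributes = fields[8] if len(fields) > 8 else ""
        yield GffFeature(fields[0], fields[1], fields[2], int(fields[3]),
                         int(fields[4]), fields[5], fields[6], fields[7], attributes)


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count how many records of each feature type the input contains."""
    return Counter(record.feature for record in parse_gff(lines))


# Invented sample records, for illustration only.
SAMPLE_GFF = "\n".join([
    "##gff-version 2",
    "\t".join(["chr_demo", "demo_source", "exon", "100", "200", ".", "+", ".", "demo_transcript_1"]),
    "\t".join(["chr_demo", "demo_source", "exon", "300", "400", ".", "+", ".", "demo_transcript_1"]),
    "\t".join(["chr_demo", "demo_source", "CDS", "150", "350", ".", "+", "0", "demo_transcript_1"]),
])

if __name__ == "__main__":
    print(count_feature_types(SAMPLE_GFF.splitlines()))
```

To use the sketch on a downloaded file, pass an open file handle to `count_feature_types` in place of `SAMPLE_GFF.splitlines()`. The attributes column is left unparsed because its layout differs between GFF versions.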
COMMUNITY FEEDBACK

Vega is a community annotation database, and feedback from researchers is therefore essential to keep the annotation up to date. A webform [28] is available through which users can contact the Vega team to improve or correct annotation where there is additional evidence. Manual annotation of finished vertebrate sequence may also be submitted if it has been peer reviewed and/or meets the annotation standards [29].

FUTURE DEVELOPMENTS IN VEGA

Currently available genome browsers often display different transcript structures for the same loci. In order to produce a single standard human gene set, the Consensus CDS (CCDS) project has been set up between NCBI, UCSC, Ensembl and the Havana group. The aim is to compare the human gene sets produced by RefSeq, Ensembl and Vega and then identify transcripts where the protein coding region is agreed on by all collaborators. These CDSs will be identified by stable CCDS identifiers in all the browsers.

In the near future, manual annotation of the regions for the ENCODE project [30,31] will be displayed in Vega. As the mouse and zebrafish genomes reach completion, it is hoped that the manually annotated orthologues may be browsed using MultiContigView, which is already available in Ensembl.

Acknowledgments

I gratefully acknowledge the help of Dr Jennifer Ashurst and Dr Laurens Wilming at the Wellcome Trust Sanger Institute.

Dr Jane Loveland
HAVANA Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK

References

1. Lander, E. S., Linton, L. M., Birren, B. et al. (2001), Initial sequencing and analysis of the human genome, Nature, Vol. 409(6822).
2. Venter, J. C., Adams, M. D., Myers, E. W. et al. (2001), The sequence of the human genome, Science, Vol. 291(5507).
3. International Human Genome Sequencing Consortium (2004), Finishing the euchromatic sequence of the human genome, Nature, Vol. 431(7011).
4. Hubbard, T., Andrews, D., Caccamo, M. et al. (2005), Ensembl 2005, Nucleic Acids Res., Vol. 33 (Database issue).
5. Kent, W. J., Sugnet, C. W., Furey, T. S. et al. (2002), The human genome browser at UCSC, Genome Res., Vol. 12(6).
6. Wheeler, D. L., Chappey, C., Lash, A. E. et al. (2002), Database resources of the National Center for Biotechnology Information: 2002 update, Nucleic Acids Res., Vol. 30(1).
7. URL: mapview/
8. Imanishi, T., Itoh, T., Suzuki, Y. et al. (2004), Integrative annotation of 21,037 human genes validated by full-length cDNA clones, PLoS Biol., Vol. 2(6).
9. URL:
10. Ashurst, J. L., Chen, C.-K., Gilbert, J. G. R. et al. (2005), The Vertebrate Genome Annotation (Vega) database, Nucleic Acids Res., Vol. 33 (Database issue).
11. URL:
12. URL:
13. Wain, H. M., Lush, M. J., Ducluzeau, F. et al. (2004), Genew: The Human Gene Nomenclature Database, 2004 updates, Nucleic Acids Res., Vol. 32 (Database issue).
14. Sprague, J., Clements, D., Conlin, T. et al. (2003), The Zebrafish Information Network (ZFIN): The zebrafish model organism database, Nucleic Acids Res., Vol. 31(1).
15. Eppig, J. T., Bult, C. J., Kadin, J. A. et al. (2005), The Mouse Genome Database (MGD): From genes to mice, a community resource for mouse biology, Nucleic Acids Res., Vol. 33 (Database issue).
16. Searle, S. M., Gilbert, J., Iyer, V. and Clamp, M. (2004), The otter annotation system, Genome Res., Vol. 14(5).
17. Batzoglou, S., Pachter, L., Mesirov, J. P. et al. (2000), Human and mouse gene structure: Comparative analysis and application to exon prediction, Genome Res., Vol. 10(7).
18. Vanin, E. F. (1985), Processed pseudogenes: Characteristics and evolution, Annu. Rev. Genet., Vol. 19.
19. Kenmochi, N., Yoshihama, M., Higa, S. and Tanaka, T. (2000), The human ribosomal protein L6 gene in a critical region for Noonan syndrome, J. Human Genet., Vol. 45(5).
20. Edwalds-Gilbert, G., Veraldi, K. L. and Milcarek, C. (1997), Alternative poly(A) site selection in complex transcription units: Means to an end?, Nucleic Acids Res., Vol. 25(13).
21. Beaudoing, E., Freier, S., Wyatt, J. R. et al. (2000), Patterns of variant polyadenylation signal usage in human genes, Genome Res., Vol. 10(7).
22. URL: Homo_sapiens/
23. URL:
24. URL:
25. Stewart, C. A., Horton, R., Allcock, R. J. N. et al. (2004), Complete MHC haplotype sequencing for common disease gene mapping, Genome Res., Vol. 14(6).
26. Hill, N. J., Lyons, P. A., Armitage, N. et al. (2000), NOD Idd5 locus controls insulitis and diabetes and overlaps the orthologous CTLA4/IDDM12 and NRAMP1 loci in humans, Diabetes, Vol. 49(10).
27. Debenham, S. L., Hart, E. A., Ashurst, J. L. et al. (2005), Genomic sequence of the class II region of the canine MHC: Comparison with the MHC of other mammalian species, Genomics, Vol. 85(1).
28. URL: index.html
29. URL: guidelines.pdf
30. ENCODE Project Consortium (2004), The ENCODE (ENCyclopedia Of DNA Elements) Project, Science, Vol. 306(5696).
31. URL:


Ensembl Tools. EBI is an Outstation of the European Molecular Biology Laboratory. Ensembl Tools EBI is an Outstation of the European Molecular Biology Laboratory. Questions? We ve muted all the mics Ask questions in the Chat box in the webinar interface I will check the Chat box periodically

More information
