Gene-centered databases and Genome Browsers

1 COURSE OF BIOINFORMATICS a.a Gene-centered databases and Genome Browsers

2 We searched for accession number M60495 at NCBI Nucleotide
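The same record can also be requested programmatically. The following is a minimal sketch, assuming the NCBI E-utilities `efetch` endpoint and only the Python standard library; the helper name `nuccore_fasta_url` is hypothetical and not part of the course material. It only builds the request URL, and the commented lines show how the record might then be downloaded.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def nuccore_fasta_url(accession: str) -> str:
    """Build an efetch URL that returns a Nucleotide record as FASTA text.

    Hypothetical helper name, used here for illustration only.
    """
    params = {
        "db": "nuccore",      # the Nucleotide database
        "id": accession,      # accession number, e.g. M60495
        "rettype": "fasta",   # ask for FASTA rather than the full flat file
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


url = nuccore_fasta_url("M60495")
print(url)

# To actually download the record (requires network access):
# from urllib.request import urlopen
# with urlopen(url) as response:
#     fasta = response.read().decode()
```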

3 Gene has been implemented at NCBI to organize information about genes, serving as a major node in the nexus of genomic map, sequence, expression, protein structure, function, and homology data.

4 Each Gene record is assigned a unique identifier, the GeneID. The content of a Gene record is organized in:
- Nomenclature (official symbol, full name, aliases, ...)
- Summary (quick synopsis of gene function)
- Map data (genomic context, genomic region, ...)
- Sequence-related data (gene, genome, transcript and protein RefSeq)
- Function (bibliography, GeneRIF, GO, pathways)
- Variation (dbSNP)
- Homology (related genes in other species, ...)
- Other sites (links to external databases)

5 2013 Part 1

6 2013 Part 2

7 2013 Part 3

8 2013 Part 3

9 2013 Part 4

10 2013 Part 5

11 2013 Part 6

12 2013 Part 7

13 2013 Part 8

14 Online Mendelian Inheritance in Man (OMIM)

15 2013

16 MIM number prefix translation
- * (asterisk): indicates a gene of known sequence.
- # (number symbol): indicates a descriptive entry, usually of a phenotype. The reason for the use of the # sign is given in the first paragraph of the entry. Discussion of any gene(s) related to the phenotype resides in another entry (or entries), as described in the first paragraph.
- + (plus sign): indicates that the entry contains the description of a gene of known sequence and a phenotype.
- % (percent sign): indicates that the entry describes a confirmed mendelian phenotype or phenotypic locus for which the underlying molecular basis is not known.
- No symbol: generally indicates a description of a phenotype for which the mendelian basis, although suspected, has not been clearly established, or that the separateness of this phenotype from that in another entry is unclear.
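The prefix table lends itself to a simple lookup. The sketch below is illustrative only: the names `MIM_PREFIXES`, `NO_PREFIX` and `describe_mim` are hypothetical, the descriptions are shortened from the slide text, and `000000` is a placeholder rather than a real MIM number.

```python
# Shortened meanings of the MIM number prefixes listed on the slide
MIM_PREFIXES = {
    "*": "gene of known sequence",
    "#": "descriptive entry, usually of a phenotype",
    "+": "gene of known sequence and a phenotype",
    "%": "confirmed mendelian phenotype or locus, molecular basis not known",
}
# Meaning of an entry written without any prefix
NO_PREFIX = "phenotype whose mendelian basis is suspected but not clearly established"


def describe_mim(entry: str) -> tuple[str, str]:
    """Split a prefixed MIM entry into (number, meaning of its prefix)."""
    entry = entry.strip()
    if entry and entry[0] in MIM_PREFIXES:
        return entry[1:], MIM_PREFIXES[entry[0]]
    return entry, NO_PREFIX


# "000000" is a placeholder, not a real MIM number
for example in ("*000000", "#000000", "000000"):
    number, meaning = describe_mim(example)
    print(f"{example!r}: number={number}, meaning={meaning}")
```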

17 HomoloGene

18 2013

19 How to query Gene
- By chromosomal region
- By RefSeq status
- By taxonomy
- etc.
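These query types can be combined into a single Entrez search term. The sketch below assumes the E-utilities `esearch` endpoint and the field tags `[orgn]`, `[chr]` and `[prop]` (with property values of the form `srcdb_refseq_reviewed`); the exact tags and values should be checked against the Gene help pages. The helper names `gene_term` and `gene_search_url` are hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch service
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_term(organism=None, chromosome=None, refseq_status=None) -> str:
    """Combine optional filters into one Entrez Gene query term.

    The field tags used here are assumptions to verify in the Gene help.
    """
    parts = []
    if organism:
        parts.append(f"{organism}[orgn]")                    # by taxonomy
    if chromosome:
        parts.append(f"{chromosome}[chr]")                   # by chromosome
    if refseq_status:
        parts.append(f"srcdb_refseq_{refseq_status}[prop]")  # by RefSeq status
    return " AND ".join(parts)


def gene_search_url(term: str, retmax: int = 20) -> str:
    """Build an esearch URL that returns matching GeneIDs as JSON."""
    params = {"db": "gene", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


term = gene_term(organism="human", chromosome="1", refseq_status="reviewed")
print(term)
print(gene_search_url(term))
```

The same term can be pasted directly into the Gene search box, which is a quick way to check that the field tags behave as expected before scripting the query.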

20 Click on GRAPHICS

23 Why do we need genome sequences?
- To identify genes
- To determine gene structure
- To identify regulatory regions
- To identify conserved regions
- To investigate genome structure
- To investigate genetic variability
- To understand genetic disorders

24 Human Genome Sequence: February 2001, first draft of the complete sequence of the human genome (almost completed in 2003)

25 Genomes of model organisms: Yeast (1996), C. elegans (1998), Drosophila (2000), Plasmodium (2002), Anopheles (2002), Mouse (2002), Rat (2004), Chicken (2004)

26 Automation of the Sanger method
Acrylamide gel: 36 reactions/gel; 12 h of electrophoresis; bp per read; about bp/day
96 capillaries/reactions; 1 h of electrophoresis; bp per read; about bp/day
from:

27 Shotgun genome sequencing: more than 10 years, $3 billion

28 How can we access genome sequences? Complete genome sequences are available as they are. Look at the complete sequence of human chromosome 1 at NCBI Nucleotide (Acc. No. NC_ ). Could you comment on it? But...

29 Human Chromosome 1 sequence at Nucleotide (1)

30 Human Chromosome 1 sequence at Nucleotide (2)

31 How can we access genome sequences? Complete genome sequences are available as they are, but... annotated genome sequences are available through different genome browsers.

32 Human Chromosome 1 sequence at Nucleotide (3)

33 Overview of genome browsers (1): The UCSC genome browser. Try to ask for human chromosome 1

34 Overview of genome Browsers (2): The Ensembl genome browser Try to ask for human chromosome 1

35 Overview of genome Browsers (3): The NCBI genome browser Try to ask for human chromosome 1

36 Overview of genome browsers (4): Many other genome browsers exist, but they are generally dedicated to specific aspects:
- SNP and linkage disequilibrium maps (HapMap)
- Copy number variation (DGV)
- Genetic variability (1000 Genomes)
- and many more

37 Start investigating the FLG locus on the human genome

38 UCSC: We will add many different UCSC tracks to characterize the FLG genomic region in depth
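Before adding tracks, the coordinates of the locus are needed. One possible way to obtain them, sketched below, is the Ensembl REST `lookup/symbol` endpoint, whose JSON response is assumed to carry `seq_region_name`, `start` and `end` fields. The helper names `lookup_url` and `browser_position` are hypothetical, and the coordinates in the sample record are made-up placeholders, not the real position of FLG.

```python
from urllib.parse import quote

# Base URL of the Ensembl REST service
ENSEMBL_REST = "https://rest.ensembl.org"


def lookup_url(symbol: str, species: str = "homo_sapiens") -> str:
    """Build an Ensembl REST URL that looks up a gene by its symbol."""
    return (
        f"{ENSEMBL_REST}/lookup/symbol/{quote(species)}/{quote(symbol)}"
        "?content-type=application/json"
    )


def browser_position(record: dict) -> str:
    """Format a lookup record as a 'chrN:start-end' position string,
    the form accepted by the position box of a genome browser."""
    return f"chr{record['seq_region_name']}:{record['start']}-{record['end']}"


print(lookup_url("FLG"))

# Placeholder record with made-up coordinates, only to show the formatting
sample = {"seq_region_name": "1", "start": 1000, "end": 2000}
print(browser_position(sample))

# To query the service for real (requires network access):
# import json
# from urllib.request import urlopen
# with urlopen(lookup_url("FLG")) as response:
#     print(browser_position(json.load(response)))
```

Coordinates depend on the genome assembly, so the assembly selected in the browser should match the one reported by the lookup.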

39 Where to study

40 Exercise: Find GeneID:546. Which gene does it correspond to? What about the gene annotation? What about the genomic region annotation?
