Computational Biology and Bioinformatics
- Samson Shepherd
- 6 years ago
Transcription
1 Computational Biology and Bioinformatics
- Computational biology: development of algorithms to solve problems in biology
- Bioinformatics: application of computational biology to the analysis and management of biological data
- Applied bioinformatics: intelligent use of tools to navigate the sequence space
- Better experiments can be designed by a careful bioinformatics analysis before the bench work
- Bioinformatics: 7,850,000 hits
2 DNA sequencing
- DNA sequences are the most abundant type of sequences (5 * )
- Generated by the chain termination method (Sanger sequencing)
- Based on the action of a DNA polymerase that adds nucleotides to the complementary strand
- Fluorescently labeled ddNTPs (dideoxynucleotides) stop synthesis by acting as chain terminators. They are included in amounts chosen so that termination occurs at every position where the base appears in the template
- Requires template DNA, a primer, and one ddNTP for each base: A, C, G, and T
- Products are separated by electrophoresis
[Figure: primer with extension products terminated at A, C, G, and T]
3 Summary of chain termination sequencing In the past, four different reactions, one for each ddNTP, were separated on a gel that could resolve one-base differences. The sequence was then read from the bottom of the gel to the top.
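The four-reaction readout can be illustrated with a toy simulation. This is a minimal sketch, not a model of real chemistry: the function names and the example strand are hypothetical, and it assumes every possible termination product is present. Each ddNTP reaction yields fragments whose lengths mark where that base occurs in the newly synthesized strand; sorting all bands from shortest (bottom of the gel) to longest (top) recovers the sequence.

```python
def terminated_fragments(strand):
    """Return, for each ddNTP reaction, the lengths of the terminated fragments.

    `strand` is the newly synthesized strand written 5'->3' (toy input).
    A fragment of length n in lane X means base X sits at position n.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(strand, start=1):
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the gel from bottom (shortest fragment) to top (longest)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = terminated_fragments("GATTACA")  # hypothetical example strand
print(lanes["A"])       # fragment lengths seen in the ddATP lane
print(read_gel(lanes))  # sequence recovered by reading bottom to top
```

Because each position terminates in exactly one lane, the sorted band lengths form an unbroken ladder, which is why a gel that resolves one-base differences is required.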
4 Sequence reading of fluorescently labeled reactions
- Fluorescently labeled reactions are scanned by a laser as they pass a particular point
- Color picked up by detector
- Output sent directly to computer
- In its simplest form, a sequence can be represented as a string of nucleotides with a basic tag or identifier placed after a greater-than character (>): FASTA format
- Definition line (commonly called the def line) followed by the sequence:
>U
CGGTTGCTTGGGTTTTATAACATCAGTCAGTGACAGGCATTTCCAGAGTTGCCCTGTTCAACAATCGATA
GCTGCCTTTGGCCACCAAAATCCCAAACTTAATTAAAGAATTAAATAATTCGAATAATAATTAAGCCCAG
TAACCTACGCAGCTTGAGTGCGTAACCGATATCTAGTATACATTTCGATACATCGAAATCATGGTAGTGT
TGGAGACGGAGAAGGTAAGACGATGATAGACGGCGAGCCGCATGGGTTCGATTTGCGCTGAGCCGTGGCA
GGGAACAACAAAAACAGGGTTGTTGCACAAGAGGGGAGGCGATAGTCGAGCGGAAAAGAGTGCAGTTGGC
GTGGCTACATCATCATTGTGTTCACCGATTATTTTTTGCACAATTGCTTAATATTAATTGTACTTGCACG
CTATTGTCTACGTCATAGCTATCGCTCATCTCTGTCTGTCTCTATCAAGCTATCTCTCTTTCGCGGTCAC
TCGTTCTCTTTTTTCTCTCCTTTCGCATTTGCATACGCATACCACACGTTTTCAGTGTTCTCGCTCTCTC
TCTCTTGTCAAGACATCGCGCGCGTGTGTGTGGGTGTGTCTCTAGCACATATACATAAATAGGAGAGCGG
- More information can be added to the FASTA definition line:
>gb U DMU54469 Drosophila melanogaster eukaryotic initiation factor 4E (eif4e) gene.
- Fields of the definition line: GenBank, Accession.version, Locus, Description
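Reading the FASTA layout described on this slide takes only a few lines of code. The sketch below is illustrative: `parse_fasta` is a hypothetical helper name and the two records in the example are made up. It treats every line starting with `>` as a definition line and joins the following lines into one sequence string.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (definition_line, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap; join them later
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """>seq1 hypothetical example record
CGGTTGCTTGGG
TTTTATAACATC
>seq2
ACGT
"""
for definition, sequence in parse_fasta(example):
    print(definition, len(sequence))
```

For real work, an established parser such as the one in Biopython is usually a safer choice than a hand-written loop.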
5 Editing Errors
6 Capillary electrophoresis
- Newer automated sequencers use very thin capillary tubes
- All four fluorescently tagged reactions run in the same capillary
- Can have 96 capillaries running at the same time
[Figure labels: robotic arm and syringe; 96 glass capillaries; 96-well plate; load bar]
7 New sequencing technologies
- 454 Life Sciences FLX Titanium: 1,000,000 reads × 400 bp = 400,000,000 high-quality bp in a 10 h run
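The throughput figure on this slide is simple arithmetic and can be checked directly. In the sketch below the variable names are illustrative and the per-hour rate is a derived value, not one stated on the slide.

```python
reads_per_run = 1_000_000  # reads per run, from the slide
read_length_bp = 400       # read length in base pairs, from the slide
run_hours = 10             # run duration in hours, from the slide

total_bp = reads_per_run * read_length_bp
bp_per_hour = total_bp // run_hours  # derived rate, not stated on the slide

print(f"{total_bp:,} bp per run")
print(f"{bp_per_hour:,} bp per hour")
```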
8 Present and future of sequencing
- Sequencing costs are dropping each year
- Opens the possibility of sequencing the genomes of individuals
- Greatly facilitates comparative genomics
9 DNA → RNA → protein → phenotype
- DNA: genomic DNA databases
- RNA: cDNA, ESTs, UniGene
- Protein: protein sequence databases
10 There are three major public DNA databases
- EMBL: housed at EBI (European Bioinformatics Institute)
- GenBank: housed at NCBI (National Center for Biotechnology Information)
- DDBJ: housed in Japan
11 Taxonomy at NCBI: >200,000 species are represented in GenBank
12 The most sequenced organisms in GenBank (GenBank release 168.0)
- Homo sapiens: 13.1 billion bases
- Mus musculus: 8.4 billion
- Rattus norvegicus: 6.1 billion
- Bos taurus: 5.2 billion
- Zea mays: 4.6 billion
- Sus scrofa: 3.6 billion
- Danio rerio: 3.0 billion
- Oryza sativa (japonica): 1.5 billion
- Strongylocentrotus purpuratus: 1.4 billion
- Nicotiana tabacum: 1.1 billion
13 Sequence databases
- What is a database?
  - An indexed set of records
  - Records are retrieved using a query language
- Examples of sequence databases
  - Primary databases (archival): GenBank, EMBL (European Molecular Biology Laboratory), DDBJ (DNA Data Bank of Japan)
  - Secondary databases (curated): RefSeq, EMBL Genome Reviews, protein databases, TPA (Third Party Annotations)
- The client-server model has made access to sequence databases fast and easy
[Figure: biologist's query, web browser, web server, BLAST search engine, database]
14 Data flow of submissions between primary databases (Chp. 1)
- International Nucleotide Sequence Database Collaboration: GenBank, EMBL, and DDBJ exchange data; updated every 24 h
- GenBank: NCBI (National Center for Biotechnology Information), NIH (National Institutes of Health, USA)
  - Submissions and updates (Sequin)
  - Entrez: integrated information retrieval system; it is an interface, not a database
- EMBL (European Molecular Biology Laboratory): EBI (European Bioinformatics Institute)
  - Submissions and updates
  - Ensembl
  - EMBL ndex.html
- DDBJ (DNA Data Bank of Japan): CIB (Center for Information Biology), NIG (National Institute of Genetics, Japan)
  - Submissions and updates (Sakura)
  - Getentry c.jp/getstart-e.html
15 Nucleotide sequence flatfiles
- Header: Locus (10 characters, 1st letter, arbitrary name, not useful), length, molecule type (DNA or RNA), division code (includes functional divisions such as expressed sequence tags, sequence-tagged sites, and whole-genome sequences), last release date (EMBL)
- Definition: summary of the biological information
- Accession number: primary key used to reference a record in the database. Used in publications. Format: 1 letter + 5 digits, or 2 letters + 6 digits
- Version: increased every time the sequence is updated. Each version has a different GI (GenInfo identifier), which is specific to GenBank and not very useful
- Keywords: abandoned because of the absence of a controlled vocabulary
- Source / Organism: taxonomic information, top-down
- Reference: submission credit and published paper
- Feature table: direct representation of the biological information in the record
  - Source: organism, chromosome, map
  - Gene: may include regulatory sequences
  - mRNA: from the start of transcription to the polyadenylation signal
  - CDS (coding sequence): from start codon to stop codon; provides exon coordinates. Every CDS is assigned a protein_id.version (3 letters + 5 digits)
- Sequence: the actual sequence, ending in //
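The accession formats named on this slide (1 letter + 5 digits or 2 letters + 6 digits for nucleotide records, 3 letters + 5 digits for protein_id, each optionally followed by `.version`) can be checked with regular expressions. This is a minimal sketch covering only those formats; the function name `classify_accession` is hypothetical, and other accession formats exist that this sketch does not recognize.

```python
import re

# Nucleotide accession: 1 letter + 5 digits, or 2 letters + 6 digits,
# optionally followed by ".version" (formats as listed on the slide).
NUCLEOTIDE = re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6})(?:\.\d+)?$")

# protein_id: 3 letters + 5 digits, optionally followed by ".version".
PROTEIN_ID = re.compile(r"^[A-Z]{3}\d{5}(?:\.\d+)?$")


def classify_accession(accession):
    """Return 'nucleotide', 'protein', or 'unknown' for an accession string."""
    if NUCLEOTIDE.match(accession):
        return "nucleotide"
    if PROTEIN_ID.match(accession):
        return "protein"
    return "unknown"


for acc in ["X02775", "AAC02945", "AAC02945.1", "not-an-accession"]:
    print(acc, classify_accession(acc))
```

Keeping the version suffix optional lets the same check accept both the bare accession used in publications and the versioned form.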
16 Secondary databases: RefSeq
- Comprehensive, integrated, and non-redundant set of sequences
- Genomic DNA, RNA, and protein are explicitly linked
- 2_6 format: two letters, an underscore, and six digits (the underscore is never present in GenBank accessions)
  - Experimental: NT_ (genomic contig), NM_ (mRNAs), NP_ (proteins)
  - Predicted: [XM_ (model mRNAs)], [XP_ (model proteins)]
- Undergoes continuous curation: most up-to-date sequence
- Each RefSeq is a synthesis of information, not a piece of primary research: equivalent to a review article
- Message for Removed Secondary
17 Secondary databases
- Third Party Annotation (TPA)
  - Includes reannotations, combinations of novel and existing primary entries, and annotations of trace archives and whole genome shotgun data
  - Provides GenBank accession.version numbers and nucleotide locations for all primary entries to which the TPA sequence relates
- EMBL Genome Reviews
  - Curated versions of entries representing complete genomes
  - Adds information from the UniProt Knowledgebase, Gene Ontology Annotation, InterPro, and others
  - Standardizes annotations
18 Protein databases
- GenPept: translations of all CDS; not curated
- UniProt (Swiss-Prot/TrEMBL/PIR-PSD)
- UniParc: most comprehensive public non-redundant protein database
  - Sources: Swiss-Prot (manual), TrEMBL (computer), PIR-PSD, GenBank, patents, International Protein Index (IPI), Protein Data Bank
- UniProt Knowledgebase: curated subset of UniParc
  - Function, posttranslational modifications, domains, catalytic sites, structures, associated diseases, pathways, etc.
- UniRef: UniProt non-redundant reference database; 95%, 90%, and 50% sets
- Functional groups: Pfam, Prosite, InterPro
[Figure labels: Merge 95%; Merge 90%; Merge 50%; All predicted coding regions]
19
20 PubMed is
- the National Library of Medicine's search service
- 19 million citations in MEDLINE
- links to participating online journals
- PubMed tutorial
21 Entrez integrates the scientific literature; DNA and protein sequence databases; 3D protein structure data; population study data sets; assemblies of complete genomes
22 Entrez is a search and retrieval system that integrates NCBI databases
23 BLAST is
- the Basic Local Alignment Search Tool
- NCBI's sequence similarity search tool
- supports analysis of DNA and protein databases
- 100,000 searches per day
24 OMIM is
- Online Mendelian Inheritance in Man
- a catalog of human genes and genetic disorders
- created by Dr. Victor McKusick; led by Dr. Ada Hamosh at JHMI
25 Bookshelf is a searchable resource of online books
26 Taxonomy Browser is
- a browser for the major divisions of living organisms (archaea, bacteria, eukaryota, viruses)
- taxonomy information such as genetic codes
- molecular data on extinct organisms
- practically useful for finding a protein or gene from a species
27 Structure site includes
- Molecular Modeling Database (MMDB)
- biopolymer structures obtained from the Protein Data Bank (PDB)
- Cn3D (a 3D-structure viewer)
- Vector Alignment Search Tool (VAST)
28 Accession numbers are labels for sequences NCBI includes databases (such as GenBank) that contain information on DNA, RNA, or protein sequences. You may want to acquire information beginning with a query such as the name of a protein of interest, or the raw nucleotides comprising a DNA sequence of interest. DNA sequences and other molecular data are tagged with accession numbers that are used to identify a sequence or other record relevant to molecular data. Page 26
29 What is an accession number? An accession number is a label used to identify a sequence. It is a string of letters and/or numbers that corresponds to a molecular sequence. Examples (all for retinol-binding protein, RBP4):
- DNA
  - X02775: GenBank genomic DNA sequence
  - NT_: genomic contig
  - Rs: dbSNP (single nucleotide polymorphism)
- RNA
  - N: an expressed sequence tag (1 of 170)
  - NM_: RefSeq DNA sequence (from a transcript)
- Protein
  - NP_: RefSeq protein
  - AAC02945: GenBank protein
  - Q: SwissProt protein
  - KT7: Protein Data Bank structure record
Page 27
30 NCBI's important RefSeq project: best representative sequences RefSeq (accessible via the main page of NCBI) provides an expertly curated accession number that corresponds to the most stable, agreed-upon reference version of a sequence. RefSeq identifiers include the following formats:
- Complete genome: NC_######
- Complete chromosome: NC_######
- Genomic contig: NT_######
- mRNA (DNA format): NM_###### e.g. NM_
- Protein: NP_###### e.g. NP_
Page 27
31 NCBI's RefSeq project: accessions for genomic, mRNA, and protein sequences
Accession | Molecule | Method | Note
AC_ | Genomic | Mixed | Alternate complete genomic
AP_ | Protein | Mixed | Protein products; alternate
NC_ | Genomic | Mixed | Complete genomic molecules
NG_ | Genomic | Mixed | Incomplete genomic regions
NM_ | mRNA | Mixed | Transcript products; mRNA
NM_ | mRNA | Mixed | Transcript products; 9-digit
NP_ | Protein | Mixed | Protein products
NP_ | Protein | Curation | Protein products; 9-digit
NR_ | RNA | Mixed | Non-coding transcripts
NT_ | Genomic | Automated | Genomic assemblies
NW_ | Genomic | Automated | Genomic assemblies
NZ_ABCD | Genomic | Automated | Whole genome shotgun data
XM_ | mRNA | Automated | Transcript products
XP_ | Protein | Automated | Protein products
XR_ | RNA | Automated | Transcript products
YP_ | Protein | Auto. & Curated | Protein products
ZP_ | Protein | Automated | Protein products
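The prefix table on this slide maps naturally onto a lookup keyed on the two letters before the underscore. The sketch below is illustrative only: `refseq_molecule` is a hypothetical helper name, and the dictionary copies just the Accession and Molecule columns from the slide.

```python
# Two-letter RefSeq prefix -> molecule type, copied from the slide's table.
REFSEQ_MOLECULE = {
    "AC": "Genomic", "NC": "Genomic", "NG": "Genomic",
    "NT": "Genomic", "NW": "Genomic", "NZ": "Genomic",
    "NM": "mRNA", "XM": "mRNA",
    "NR": "RNA", "XR": "RNA",
    "AP": "Protein", "NP": "Protein", "XP": "Protein",
    "YP": "Protein", "ZP": "Protein",
}


def refseq_molecule(accession):
    """Return the molecule type for a RefSeq accession, or None if not RefSeq.

    RefSeq accessions contain an underscore after the two-letter prefix;
    GenBank accessions never do, so a missing underscore returns None.
    """
    prefix, underscore, _ = accession.partition("_")
    if not underscore:
        return None
    return REFSEQ_MOLECULE.get(prefix)


for acc in ["NP_000509", "XM_010627.4", "X02775"]:
    print(acc, refseq_molecule(acc))
```

Checking for the underscore first mirrors the rule that it distinguishes RefSeq records from primary GenBank records.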
32 Access to sequences: Entrez Gene at NCBI Entrez Gene is a great starting point: it collects key information on each gene/protein from major databases. It covers all major organisms. RefSeq provides a curated, optimal accession number for each DNA (NM_ for beta globin DNA corresponding to mRNA) or protein (NP_000509). Page 29
33 From the NCBI home page, type beta globin and hit Search
34 Follow the link to Gene
35 Entrez Gene is in the header. Note the Official Symbol HBB for beta globin. Note the limits option.
36
37 By applying limits, there are now far fewer entries
38 Entrez Gene (top of page) Note that links to many other HBB database entries are available
39 Entrez Gene (middle of page): genomic region, bibliography
40 Entrez Gene (middle of page, continued): phenotypes, function
41 Entrez Gene (bottom of page): RefSeq accession numbers
42 Entrez Protein: accession, organism, literature
43 Entrez Protein: features of a protein, and its sequence in the one-letter amino acid code
44 FASTA format: versatile, compact with one header line followed by a string of nucleotides or amino acids in the single letter code
Databases, cont. Redundancy at GenBank => RefSeq http://www.ncbi.nlm.nih.gov/books/bv.fcg i?rid=handbook RefSeq vs GenBank Many sequences are represented more than once in GenBank 2003 RefSeq collection
More informationBIO4342 Lab Exercise: Detecting and Interpreting Genetic Homology
BIO4342 Lab Exercise: Detecting and Interpreting Genetic Homology Jeremy Buhler March 15, 2004 In this lab, we ll annotate an interesting piece of the D. melanogaster genome. Along the way, you ll get