Protein Sequence Analysis. BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl)
Transcription
1 Protein Sequence Analysis BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl)
2 Linear Sequence Analysis. What can you learn from a (single) protein sequence? Calculate its physical properties: molecular weight (MW), isoelectric point (pI), amino acid content, hydropathy (hydrophilic vs. hydrophobic regions). These calculations do not take post-translational modifications of the protein into account, so they are usually not 100% accurate. Identify sequence motifs and families: signal sequences, transmembrane domains, coiled-coils, post-translational modification sites, secondary structure (nonhomologous); domains, functional motifs (homologous).
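As a rough illustration of the "physical properties" idea, the sketch below computes amino acid content and a sliding-window hydropathy profile. It assumes the Kyte-Doolittle hydropathy scale, which the slide does not name; the function names and the window size are illustrative choices, not part of the lecture.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def composition(seq):
    """Fraction of each amino acid present in the sequence."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}

def mean_hydropathy(seq):
    """Average hydropathy over the whole sequence."""
    return sum(KD[aa] for aa in seq) / len(seq)

def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every window of the given length."""
    return [mean_hydropathy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]
```

Stretches of consistently high values in the profile are the kind of hydrophobic region that hints at a transmembrane segment; as the slide notes, none of this accounts for post-translational modifications.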
3 3-D Structure Analysis. Visualization: domain structure, global fold, active sites, point mutations, SNPs, splice sites. Evaluate structure quality. Calculate physical properties: surface areas, distances, side-chain conformations, contact maps. Structural alignment (i.e., similarity to other structures). Prediction: physical properties (binding affinity, pKa's, stability, specificity); 3D structure (homology modeling, fold recognition, de novo). Advanced: protein design, docking of two proteins, active site modeling.
4 Sequence Databases. SwissProt (ExPASy): highly curated, updated less frequently. TrEMBL (ExPASy): translated nucleotide sequences; automatic translation, fast but less information. UniProt (EBI): Universal Protein Resource; combines SwissProt, TrEMBL, and PIR sequences.
5 Sequence Analysis Sites. For protein sequences and tools to analyze them, the two major centers are: ExPASy, the Expert Protein Analysis System (many tools; databases: SwissProt, TrEMBL), and NCBI (Entrez Protein and Domains). PIR, the Protein Information Resource, has been folded into the UniProt consortium and is no longer a major resource site.
6 More Sequence Databases. Non-redundant: NR (NCBI), UniRef (PIR/EBI). Reference: RefSeq (NCBI), re-annotated by NCBI. Domains/Families: Pfam, protein families (Sanger Center + 4 mirror sites); SMART, Simple Modular Architecture Research Tool; CDD, Conserved protein Domain Database (NCBI), which combines the Pfam, SMART, and COGs databases; InterPro (based on UniProt, at EMBL-EBI). Many others.
7 Structure Databases. Experimental: PDB (Protein Data Bank). Families: SCOP, CATH, Dali database, Homstrad. Models/Predictions: ModBase, SwissModel. NOTE: All these databases (plus other kinds of databases) are described in the January Database issue of Nucleic Acids Research. Also, links to them
8 Protein Sequence Analysis Tools. ExPASy Proteomics Tools: calculate physical properties; predict sequence motifs (what ExPASy calls Topology: localization, TM domains, signal sequences, post-translational modifications); search pattern and profile collections. PredictProtein and Meta-PP: a meta-server providing access to many servers with one submission form.
9 Secondary Structure Prediction. Three good methods: Psipred, Sam-T02/T04/T06, PHD (PredictProtein). Compare a couple of methods. Use the three-state predictions.
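One simple way to "compare a couple of methods" is to line up their three-state strings and see where they agree. The sketch below assumes each predictor's output has already been reduced to one letter per residue (H = helix, E = strand, C = coil); the function name and the "?" marker for disagreement are hypothetical conventions for illustration only.

```python
def compare_predictions(pred_a, pred_b):
    """Return (fraction of agreeing positions, consensus string).

    Positions where the two three-state predictions differ are
    marked with '?' in the consensus.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must cover the same residues")
    consensus = "".join(a if a == b else "?" for a, b in zip(pred_a, pred_b))
    agreement = sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)
    return agreement, consensus
```

Regions where independent methods agree are usually the ones worth trusting; the '?' positions tend to cluster at the ends of helices and strands.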
10 SEQUENCE <--> STRUCTURE <--> FUNCTION. Evolutionary selection operates on function. Structure is more closely linked to function than is sequence, so structure tends to be more conserved than sequence. Need to search farther in sequence space to find proteins with related structures and functions.
11 Detecting Remote Similarities. Remote similarities can more easily be detected by comparing protein sequences. DNA sequences change faster than protein sequences (wobble position, redundant codons). A 4-letter DNA code vs. a 20-letter amino acid code means that matches by chance are more likely in DNA; the protein code has more information in it!
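The alphabet-size argument can be put into numbers. The sketch below assumes, purely for illustration, that all letters are equally likely (real nucleotide and amino acid frequencies are not uniform); the function names are hypothetical.

```python
import math

def chance_identity(alphabet_size):
    """Probability that two random letters match, assuming uniform usage."""
    return 1.0 / alphabet_size

def bits_per_position(alphabet_size):
    """Maximum information carried by one position, in bits."""
    return math.log2(alphabet_size)

dna_chance = chance_identity(4)        # 0.25
protein_chance = chance_identity(20)   # 0.05
```

Under this simplification, two unrelated DNA sequences are expected to be about 25% identical by chance, against about 5% for proteins, and each amino acid position carries roughly 4.3 bits compared with 2 bits for a nucleotide.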
12 Detecting Homology. [Figure: evolutionary distance runs from near to far. Similarity is detected at the level of DNA sequence, then protein sequence, then protein structure; the corresponding methods are BLASTn, BLASTp, PSIBLAST, and fold recognition.]
13 Similar Sequences Share Similar Structures. Compare all pairs of proteins in the same family (pairs for which homology is very probable). Homologs do not necessarily share much sequence similarity. Proteins with >30% sequence identity almost always share the same fold. [Figure labels: more structurally similar; all other immunoglobulins.] Sauder et al. (2000) Proteins 40:6
14 BLASTP Heuristics. Most sequences will be unrelated. Related sequences are likely to have short stretches of identities. Use identical (or closely related) short words as seeds for local alignment. Like BLASTn with shorter words. The local alignment can extend over the whole length if the proteins are similar throughout.
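The seeding step can be sketched as follows. This is a simplification: it finds only identical words, whereas BLASTp also accepts closely related words that score above a threshold, and it stops before the extension step. The word length of 3 and the function names are illustrative assumptions.

```python
def index_words(seq, w=3):
    """Map every word of length w in seq to the positions where it starts."""
    index = {}
    for i in range(len(seq) - w + 1):
        index.setdefault(seq[i:i + w], []).append(i)
    return index

def find_seeds(query, subject, w=3):
    """Return sorted (query_pos, subject_pos) pairs of identical words."""
    query_index = index_words(query, w)
    seeds = []
    for j in range(len(subject) - w + 1):
        for i in query_index.get(subject[j:j + w], []):
            seeds.append((i, j))
    return sorted(seeds)
```

For example, `find_seeds("MKWVTF", "AAKWVTA")` gives two seeds on the same diagonal, which is the kind of evidence that would trigger a local alignment around them.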
15 BLASTp Scoring Matrices. From less divergent to more divergent: BLOSUM80, BLOSUM62, BLOSUM45; PAM1, PAM120, PAM250. BLOSUM (BLOcks amino acid SUbstitution Matrices): calculated directly from substitution frequencies in local, ungapped alignments of biochemically related sequences. The number indicates the highest sequence similarity between the sequences used. PAM (Percent Accepted Mutations): derived from global alignments of closely related sequences (85% identity), using an evolutionary model to extrapolate to lower identities. The number indicates evolutionary distance. If in doubt, use BLOSUM: it is more suited to searching databases using local alignment and assumes no model of evolutionary divergence.
16 Other BLASTp Parameters. Gap penalties: the harder it is to open/extend a gap, the fewer will be made. If you're looking for close sequences, gap penalties should be higher. Databases: NR (non-redundant, translated gene sequences), SwissProt, PDB, phylogenetically specific (e.g., Archaea only).
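To see why higher open/extend penalties lead to fewer gaps, it helps to write out an affine gap cost. The sketch below assumes the convention that a gap of length n costs open + n × extend; the values 11 and 1 are illustrative numbers, and other tools count the first gapped residue differently.

```python
def gap_cost(length, gap_open=11, gap_extend=1):
    """Cost of one gap spanning `length` residues (affine penalty)."""
    if length <= 0:
        return 0
    return gap_open + gap_extend * length

def total_gap_cost(gap_lengths, gap_open=11, gap_extend=1):
    """Sum the cost of several separate gaps in one alignment."""
    return sum(gap_cost(n, gap_open, gap_extend) for n in gap_lengths)

one_long_gap = total_gap_cost([4])             # 11 + 4 = 15
four_short_gaps = total_gap_cost([1, 1, 1, 1])  # 4 * (11 + 1) = 48
```

With these numbers a single four-residue gap is much cheaper than four one-residue gaps, so raising the opening penalty mainly discourages scattered gaps, which suits searches for close relatives.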
17 PSIBLAST: Position-Specific Iterated BLAST. Creates a scoring matrix specialized for your sequence. Allows more distantly related sequences to be identified. Steps: 1. Use BLASTp and identify related sequences (E-value threshold). 2. Create a profile from related sequences. 3. Search for related sequences using this profile. 4. Repeat.
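The four steps can be mimicked with a toy loop. This is a heavily simplified sketch, not PSI-BLAST itself: it assumes ungapped sequences of equal length, a uniform background, a flat pseudocount, and a raw score cutoff in place of an E-value, and the first round scores against a profile built from the query alone rather than running BLASTp. All names and parameters are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(seqs, pseudocount=1.0):
    """Log-odds score for every amino acid at every column."""
    background = 1.0 / len(ALPHABET)
    total = len(seqs) + pseudocount * len(ALPHABET)
    profile = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        profile.append({a: math.log2((counts[a] + pseudocount) / total / background)
                        for a in ALPHABET})
    return profile

def score(profile, seq):
    """Sum of column scores for an ungapped, same-length sequence."""
    return sum(col[a] for col, a in zip(profile, seq))

def iterate_search(query, database, threshold, max_rounds=5):
    """Rebuild the profile from current hits until the hit list stops changing."""
    hits = [query]
    for _ in range(max_rounds):
        profile = build_profile(hits)                      # step 2
        new_hits = [query] + [s for s in database
                              if score(profile, s) >= threshold]  # step 3
        if set(new_hits) == set(hits):
            break
        hits = new_hits                                    # step 4: repeat
    return hits
```

With the query `"MKWV"`, the database `["MKWL", "MRWL", "AAAA"]`, and a threshold of 2.5, the first round picks up only `MKWL`; once it is in the profile, `MRWL` scores above the cutoff in the second round, which is the behavior the slide describes as identifying more distantly related sequences.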
18 BLASTing The Protein Universe BLOSUM45 BLOSUM80
19 Evolution And The Protein Universe
20 PSIBLASTing The Protein Universe Iteration 1 Iteration 2
21 Sequence Profiles. Align all sequences and count how often each amino acid occurs at every position. Combine with prior information about substitution frequencies using pseudocounts from BLOSUM62. Convert to log odds scores to give a Position-Specific Scoring Matrix (PSSM).
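One common way to write these three steps down, in the style used by PSI-BLAST, is given below. The symbols are assumptions introduced here, not notation from the slide: c_{i,a} is the count of amino acid a in column i, N_i the number of sequences in that column, q_{ab} the pair frequencies implied by BLOSUM62, p_a the background frequency of a, and α, β the relative weights given to observed data and pseudocounts. Sequence weighting is ignored.

```latex
f_{i,a} = \frac{c_{i,a}}{N_i}, \qquad
g_{i,a} = \sum_{b} f_{i,b}\,\frac{q_{ab}}{p_b}, \qquad
Q_{i,a} = \frac{\alpha f_{i,a} + \beta g_{i,a}}{\alpha + \beta}, \qquad
S_{i,a} = \log\frac{Q_{i,a}}{p_a}
```

As a sanity check, a column whose observed frequencies equal the background (f_{i,a} = p_a) gives g_{i,a} = p_a and therefore S_{i,a} = 0, so an uninformative column contributes nothing to the score.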
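22 A Sample PSSM. [Table: position-specific scores for the query residues (rows: 1 M, K, W, V, W, A, L, L, L, L, A, A, W, A, A, A, S, G, T, W, Y, A) against the 20 amino acids (columns: A R N D C Q E G H I L K M F P S T W Y V).] From Pevsner, Bioinformatics and Functional Genomics (ISBN ) John Wiley & Sons, Inc.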
23 PSSM Corruption. False positives can occur in a PSIBLAST search if the PSSM becomes corrupted. How do PSSMs become corrupted? One sequence that is not homologous to the query gets included in the alignment used to make the PSSM. The PSSM now looks a bit like this spurious sequence and will match well to other similar spurious sequences. These additional spurious sequences are then included in the new alignment, amplifying the corrupting signal. Once a bad sequence is included in the PSSM, the search veers off course and cannot be corrected.
24 Preventing PSSM Corruption. Apply filtering of biased composition regions (low complexity filter). Use better methods to estimate the E-value (composition-based statistics). Increase the threshold for judging two sequences to be similar: adjust E value from (default) to a lower value such as E = Manually inspect the output from each iteration and remove suspicious hits.
25 PHI-BLAST: Pattern-Hit-Initiated BLAST. Combines matching of regular expressions with local alignments surrounding the match. What other proteins contain a particular sequence pattern and are similar in the vicinity of this pattern? May filter out cases where the pattern matches randomly and doesn't indicate homology. Pattern matching uses ScanProsite syntax. Sequence similarity search is like PSIBLAST.
26 Syntax Rules for Patterns. [ ]: any one of the listed characters allowed. { }: any character except the listed ones allowed. x(n): n positions in which any residue is allowed. x(n,m): n to m positions in which any residue is allowed. Examples: GXW[YF][EA][IVLM] matches GTWFEL and GKWYAI; it does not match GGWYFEI or GWYEI. E[LIV]X(0,3)PP[STG] matches ELPPS, ELPPPSTG, and EVIPPG; it does not match ELIVPPPPG.
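These rules map almost one-to-one onto ordinary regular expressions, so the examples can be checked with a small converter. The sketch below is an illustration, not the PHI-BLAST or ScanProsite implementation: the function names are hypothetical, it covers only the four rules listed on the slide, and it treats a pattern as matching if it occurs anywhere in the sequence.

```python
import re

def pattern_to_regex(pattern):
    """Translate the slide's pattern syntax into a Python regex string."""
    regex = pattern.replace("-", "")                    # optional separators
    regex = re.sub(r"\{([A-Z]+)\}", r"[^\1]", regex)    # {ABC} -> [^ABC]
    regex = re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", regex)  # (n) or (n,m) -> {n} or {n,m}
    regex = re.sub(r"[Xx]", ".", regex)                 # x = any residue
    return regex

def pattern_matches(pattern, sequence):
    """True if the pattern occurs somewhere in the sequence."""
    return re.search(pattern_to_regex(pattern), sequence) is not None
```

The braces for excluded residues are converted before the repeat counts, so that the `{n,m}` quantifiers produced in the second step are not mistaken for an exclusion set.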
27 Gene Discovery with BLAST. Start with the sequence of a known protein. Use tblastn to search a DNA database (e.g. HTGS, dbEST, or genomic sequence from a specific organism). Inspect the results to find matches [1] to DNA encoding known proteins, [2] to DNA encoding related (novel!) proteins, [3] to false positives. Then search your DNA or protein against a protein database (nr) with blastx or blastp to confirm you have identified a novel gene.
Computer aided modeling of a fructose repressor Miia Helanto and Kristiina Kiviharju Abstract The bioconversion of fructose to mannitol is a commercially interesting bioprocess. Fructose can, however be
More informationIntroduction to Microarray Data Analysis and Gene Networks. Alvis Brazma European Bioinformatics Institute
Introduction to Microarray Data Analysis and Gene Networks Alvis Brazma European Bioinformatics Institute A brief outline of this course What is gene expression, why it s important Microarrays and how
More informationHINT-KB: The Human Interactome Knowledge Base
HINT-KB: The Human Interactome Knowledge Base Konstantinos Theofilatos 1, Christos Dimitrakopoulos 1, Dimitrios Kleftogiannis 2, Charalampos Moschopoulos 3, Stergios Papadimitriou 4, Spiros Likothanassis
More informationGlossary of Commonly used Annotation Terms
Glossary of Commonly used Annotation Terms Akela a general use server for the annotation group as well as other groups throughout TIGR. Annotation Notebook a link from the gene list page that is associated
More informationGerm-line vs somatic-variation theories
BME 128 Tuesday April 26 (1) Filling in the gaps Antibody diversity, how is it achieved? - by specialised (!) mechanisms Chp6 (Protein Diversity & Sequence Analysis) - more about the main concepts in this
More informationFrom Gene to Function: In Silico Warfare on the West Nile Virus
Pharmaceutical case study Page 1 of 5 From Gene to Function: In Silico Warfare on the West Nile Virus Anne Marie Quinn, Luke Fisher and Dana Haley-Vicente Accelrys Inc., San Diego, CA Can we produce reliable
More informationTIGR THE INSTITUTE FOR GENOMIC RESEARCH
Introduction to Genome Annotation: Overview of What You Will Learn This Week C. Robin Buell May 21, 2007 Types of Annotation Structural Annotation: Defining genes, boundaries, sequence motifs e.g. ORF,
More informationAnalysis of Microarray Data
Analysis of Microarray Data Lecture 3: Visualization and Functional Analysis George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Review
More informationSmall Genome Annotation and Data Management at TIGR
Small Genome Annotation and Data Management at TIGR Michelle Gwinn, William Nelson, Robert Dodson, Steven Salzberg, Owen White Abstract TIGR has developed, and continues to refine, a comprehensive, efficient
More informationBLAST. Subject: The result from another organism that your query was matched to.
BLAST (Basic Local Alignment Search Tool) Note: This is a complete transcript to the powerpoint. It is good to read through this once to understand everything. If you ever need help and just need a quick
More informationBiotechnology Explorer
Biotechnology Explorer C. elegans Behavior Kit Bioinformatics Supplement explorer.bio-rad.com Catalog #166-5120EDU This kit contains temperature-sensitive reagents. Open immediately and see individual
More information