1-D Predictions. Prediction of local features: Secondary structure & surface exposure


1 Programme: Last week's quiz results; Prediction of secondary structure & surface exposure; Protein disorder prediction; Break (get computers upstairs); Exercise: Secondary structure prediction; Break; Summary & discussion; Quiz

2 Feedback [chart, axis label: Persons]


4 1-D Predictions. Prediction of local features: Secondary structure & surface exposure

5 Learning Objectives. After today's session you should be able to: (1) explain the meaning and usage of the following local feature terms: secondary structure, surface accessibility/exposure, transmembrane helix, signal peptide, protein disorder; (2) use different 1-D prediction servers and interpret the results (the exercise).

6 Residue Patterns. Helices: helix capping; amphiphilic residue patterns. Sheets: amphiphilic residue patterns; residue preferences at edges vs. middle. Special residues: proline (helix breaker); glycine (in turns/loops/bends).

7 1-D predictions. Local structures: secondary structure, transmembrane helix. Features: surface accessibility, signal peptides.

8 Secondary Structure Elements: α-helix = H; 3₁₀-helix = G; π-helix = I; extended (β-)strand = E; isolated β-bridge = B; turn = T; bend = S; rest (coil) = C or "."

9 Assignment from Structure: DSSP, STRIDE, DSSPcont

10 Helices

11 Three-State Prediction of Classes. The eight assigned states are grouped into three predicted classes: α-helix (H), 3₁₀-helix (G) and π-helix (I) → H; extended β-strand (E) and isolated β-bridge (B) → E; turn (T), bend (S) and the rest (coil, "." or C) → C.
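The grouping on this slide can be written as a small lookup table. The sketch below assumes the reduction H, G, I → H; E, B → E; everything else → C. Other reductions exist in the literature (some methods send G or B to coil), so the mapping should be treated as one convention among several. The function name `reduce_to_three_states` is made up for this illustration.

```python
# Assumed 8-to-3 state reduction (one common convention, not the only one).
DSSP_TO_THREE_STATE = {
    "H": "H", "G": "H", "I": "H",   # helical states
    "E": "E", "B": "E",             # strand and isolated bridge
    "T": "C", "S": "C", "C": "C",   # turn, bend, coil
    ".": "C", " ": "C", "-": "C",   # different spellings of "unassigned"
}


def reduce_to_three_states(dssp: str) -> str:
    """Map a DSSP-style state string onto the three classes H, E and C."""
    # Any unexpected symbol is treated as coil.
    return "".join(DSSP_TO_THREE_STATE.get(state, "C") for state in dssp)


print(reduce_to_three_states("HGIEBTS."))  # prints HHHEECCC
```

Keeping the mapping in a dictionary makes it easy to swap in a stricter reduction when comparing against a predictor that was trained with a different convention.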

12 Prediction Servers: PSIPRED, PHDProf, Jpred

13 PSIPRED
PSIPRED PREDICTION RESULTS
Key
Conf: Confidence (0=low, 9=high)
Pred: Predicted secondary structure (H=helix, E=strand, C=coil)
  AA: Target sequence

# PSIPRED HFORMAT (PSIPRED V2.3 by David Jones)

Conf:
Pred: CCCHHHHHHHHHHHCCCCCCCHHHHHHHHHHHCCCCCCHHHHHHHHHCCCCCCHHHHHHH
  AA: MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGIL

Conf:
Pred: HHHHHHCCCCHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCHHHHHHHHHCCC
  AA: GFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEISLSYS

Conf:
Pred: HHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH
  AA: AGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHRSHRQMVTTTNPLIRHENRMV
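For the exercise it can help to turn output of this kind into residue ranges. The following is a minimal sketch, assuming the layout shown on this slide (rows starting with "Pred:" and "AA:"); it is not part of PSIPRED, and the function names `parse_horiz` and `segments` are invented for this example. The "Conf:" rows are ignored.

```python
import re
from typing import List, Tuple


def parse_horiz(text: str) -> Tuple[str, str]:
    """Join the 'AA:' and 'Pred:' rows of PSIPRED-style horizontal output."""
    seq_parts, pred_parts = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Pred:"):
            body = line[len("Pred:"):].strip()
            # Skip the key line ("Pred: Predicted secondary structure ...").
            if re.fullmatch(r"[HEC]+", body):
                pred_parts.append(body)
        elif line.startswith("AA:"):
            body = line[len("AA:"):].strip()
            # Skip the key line ("AA: Target sequence").
            if re.fullmatch(r"[A-Z]+", body):
                seq_parts.append(body)
    return "".join(seq_parts), "".join(pred_parts)


def segments(pred: str, state: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of consecutive `state`."""
    pattern = re.escape(state) + "+"
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, pred)]


# First block of the output shown on the slide.
example = """
Pred: Predicted secondary structure (H=helix, E=strand, C=coil)
  AA: Target sequence
Pred: CCCHHHHHHHHHHHCCCCCCCHHHHHHHHHHHCCCCCCHHHHHHHHHCCCCCCHHHHHHH
  AA: MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGIL
"""
sequence, prediction = parse_horiz(example)
for start, end in segments(prediction, "H"):
    print(f"helix {start}-{end}: {sequence[start - 1:end]}")
```

Filtering on the allowed alphabet is what keeps the two key lines out of the result; a stricter parser would also read the confidence digits and check that all three rows have equal length.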

14 PSIPRED

15 Trans-Membrane Helices

16 Transmembrane Helix Predictors: TMHMM, HMMTOP, DAS

17 Signal Peptide: SignalP, Phobius, Philius

18 Prediction Methods, Exemplified by Secondary Structure Predictions

19 Amino Acid Statistics
VKEFLAKAKEDFLKKWETPSQNTAQLDQFDRIKTLGTGSFGRVMLVKHKESGNHYAMKILDKQKVVKLKQIEHTLNEKRI
.HHHHHHHHHHHHHHHHS...GGGEEEEEEEEE.SS.EEEEEEETTTTEEEEEEEEEHHHHHHTT.HHHHHHHHHH
Helix: VKEFLAKAK, KEFLAKAKE, EFLAKAKED, ...
Strand: QLDQFDRIK, LDQFDRIKT, DQFDRIKTL, ...
Coil: KKWETPSQN, KWETPSQNT, WETPSQNTA, ...
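One way to turn such statistics into numbers is a propensity: how much more often a residue type occurs in a given state than in the sequence overall. The sketch below assumes the simple ratio P(aa | state) / P(aa) computed from residue counts, and it uses a toy sequence rather than the slide's data. The function name `propensities` is hypothetical.

```python
from collections import Counter
from typing import Dict


def propensities(seq: str, states: str, state: str) -> Dict[str, float]:
    """Propensity of each residue type for `state`: P(aa | state) / P(aa)."""
    if len(seq) != len(states):
        raise ValueError("sequence and state string must have equal length")
    overall = Counter(seq)
    in_state = Counter(aa for aa, s in zip(seq, states) if s == state)
    n_state = sum(in_state.values())
    if n_state == 0:
        return {}  # the state never occurs, so no propensity can be computed
    n_total = len(seq)
    return {
        aa: (in_state[aa] / n_state) / (overall[aa] / n_total)
        for aa in overall
    }


# Toy data: alanine is mostly helical here, glycine mostly not.
toy_seq = "AAAAGGGG"
toy_states = "HHHCCCCH"
print(propensities(toy_seq, toy_states, "H"))  # A: 1.5, G: 0.5
```

A value above 1 means the residue is over-represented in that state. Real propensity scales are derived from many structures with low sequence identity, not from a single chain.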

20 Propensities: Helix

21 BLOSUM Substitution Matrix (rows and columns: A R N D C Q E G H I L K M F P S T W Y V B Z X *)

22 Position Specific Substitution Matrices (PSSM)

23 PSSM (columns: A R N D C Q E G H I L K M F P S T W Y V; one row per sequence position, starting at position 1: I K E E H V I I Q A E F Y L N P D)

24 Neural Networks. Benefits: generally applicable; can capture higher-order correlations; can take inputs other than sequence information. Drawbacks: need a lot of data (different solved structures with low sequence identity); complex methods with several pitfalls.

25 Neural Networks [diagram: a window (I K E E H V I I Q A E) sliding over the sequence IKEEHVIIQAEFYLNPDQSGEF.. feeds the input layer; weights connect it to the hidden layer and to the output layer (H, E, C)]
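The window in this diagram has to be turned into numbers before a network can use it. A minimal sketch, assuming sparse (one-hot) encoding with one extra input that flags positions beyond the sequence ends; this is a common choice but not necessarily what any particular server does, and in practice each 20-value unit is often replaced by the matching PSSM row. The names `encode_window` and `UNIT` are made up for this example.

```python
from typing import List

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # same column order as the PSSM slide
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
UNIT = len(AMINO_ACIDS) + 1  # 20 residue types + 1 "empty position" flag


def encode_window(seq: str, center: int, half_width: int = 5) -> List[float]:
    """One-hot encode the window of 2 * half_width + 1 residues around `center`."""
    vector: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        unit = [0.0] * UNIT
        if 0 <= pos < len(seq) and seq[pos] in AA_INDEX:
            unit[AA_INDEX[seq[pos]]] = 1.0
        else:
            # Outside the chain (or an unknown residue): set the flag input.
            unit[-1] = 1.0
        vector.extend(unit)
    return vector


seq = "IKEEHVIIQAEFYLNPDQSGEF"
x = encode_window(seq, center=5)  # window IKEEHVIIQAE, centred on V
print(len(x))  # 11 positions x 21 inputs = 231
```

The network's three outputs (H, E, C) then refer to the central residue of the window, and the window is moved along the sequence one residue at a time.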

26 NetSurfP: Prediction of Real Value Solvent Accessibility. By Bent Petersen

27 Objective: (1) predict residues as being either buried or exposed (25 % threshold), i.e. two states/classes, Buried/Exposed; (2) predict the Relative Solvent Accessibility as a real value.

28 Why predict RSA? Residues exposed on the surface can be: involved in PTMs; potential antigenic regions; involved in protein-protein interactions. RSA also helps in the prediction of disease SNPs.

29 What is ASA? Accessible Surface Area (solvent-accessible), in Å². The surface area accessible to a rolling water molecule.

30 RSA = Relative Solvent Accessibility = ACC / ASA, where ACC = accessible area in the protein structure and ASA = accessible surface area of the residue in a Gly-X-Gly or Ala-X-Ala tripeptide. Classification networks: Buried = RSA < 25 %, Exposed = RSA > 25 %. Real-value networks: values 0-1, RSA > 1 set to 1.
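The two targets on this slide can be sketched in a few lines. The code assumes RSA is the ratio of the observed accessible area to the tripeptide reference value, capped at 1, and it places residues at exactly 25 % in the Buried class, which the slide leaves open. The reference areas in `EXAMPLE_MAX_ASA` are placeholder numbers for illustration only, not a published scale.

```python
def relative_solvent_accessibility(acc: float, max_asa: float) -> float:
    """RSA = ACC / ASA, with values above 1 set to 1."""
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return min(acc / max_asa, 1.0)


def classify(rsa: float, threshold: float = 0.25) -> str:
    """Two-state label; RSA equal to the threshold counts as Buried (assumption)."""
    return "Exposed" if rsa > threshold else "Buried"


# Hypothetical reference areas in square angstroms, for illustration only.
EXAMPLE_MAX_ASA = {"A": 110.0, "W": 250.0}

rsa = relative_solvent_accessibility(acc=22.0, max_asa=EXAMPLE_MAX_ASA["A"])
print(round(rsa, 2), classify(rsa))
```

The real-value network is trained on the first function's output, the classification network on the second's.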

31 Learning / Training dataset. Training set Cull_1764: max. sequence identity 25 %; resolution 2.0 Å; R-factor 0.2; sequence length (AA); non-X-ray entries excluded.

32 Learning / Training dataset. Homology reduced against the evaluation set CB513 (302 sequences removed). Final training set: 1764 sequences, with residues split into Buried and Exposed.

33 Neural Network - Input: (1) Position Specific Scoring Matrices, PSSM (columns A R N D C Q E G H I L K M F P S T W Y V; example rows from 2BEM.A), from iterative PSI-BLAST against nr70; (2) secondary structure predictions for the same residues (secondary structure predictor by Pernille Andersen).

34 Method

35 Results - Real Value Prediction (Training / Evaluation)
Reference; Train; Evaluated; Method
Ahmad et al. (2003); Not published; 0.48; ANN
Yuan and Huang (2004); Not published; 0.52; SVR
Nguyen and Rajapakse (2006); Not published; 0.66; Two-stage SVR
Dor and Zhou (2007); Not published; -; ANN
NetSurfP; -; -; ANN

36 NetSurfP: /usr/cbs/bio/src/netsurfp/netsurfp -h

37 NetSurfP Output


39 Protein Disorder: Introduction to DisEMBL, IUPred & FoldUnfold

40 Protein Folding. The initially formed structure is in a molten globule state (ensemble). The molten globule condenses to the native fold via a transition state. [Energy diagram, axes E and ΔG: U = unfolded state, ensemble; T = transition state(s), one or more narrow ensembles; F = native fold, one structure]

41 Degrees of Structure

42 Structures of Unstructured Regions. Estimate: 20% of all proteins contain unstructured regions. 1% of structures in PDB contain unstructured regions. Structural genomics: special structural genomics projects; selection and modification of targets; prediction of crystallisable domains. [Chart: protein disorder publications in PubMed; Iakoucheva & Dunker, Structure]

43 What's the Fuss About? Properties of disordered regions: flexible, i.e. adaptable; accessible; contain Eukaryotic Linear Motifs (ELM). Different behaviour in interaction interfaces: very adaptable; many hydrophobic interactions (close packing); no fixed structure without an interaction partner; folding upon binding.

44 DisEMBL. Basic notion: there is no consensus on the definition of protein disorder, so the method defines three types of disorder. The method: ANN-based. Disorder definitions: loops/coils (DSSP-assigned residues: T, S, B, I); hot loops (loops with high B-factor); missing residues (in X-ray structures, REMARK 465). Linding et al., Structure 2003
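The first two definitions can be expressed directly on a DSSP string. This is only a sketch of how such labels might be derived from a structure, not DisEMBL's own code: the loop states are taken from the slide, unassigned residues are left out because the slide does not mention them, and the B-factor cut-off is a free parameter because no value is given here. The function names are invented for this example.

```python
from typing import List, Sequence

LOOP_STATES = frozenset("TSBI")  # loop/coil states as listed on the slide


def loops_coils(dssp: str) -> List[bool]:
    """True for every residue whose DSSP state counts as loop/coil."""
    return [state in LOOP_STATES for state in dssp]


def hot_loops(dssp: str, b_factors: Sequence[float], b_cutoff: float) -> List[bool]:
    """True for loop/coil residues whose B-factor exceeds `b_cutoff`."""
    if len(dssp) != len(b_factors):
        raise ValueError("need one B-factor per residue")
    return [
        state in LOOP_STATES and b > b_cutoff
        for state, b in zip(dssp, b_factors)
    ]


print(loops_coils("HHTSEEB"))
print(hot_loops("HTTS", [90.0, 10.0, 80.0, 70.0], b_cutoff=50.0))
```

Note that a helix residue with a high B-factor is not a hot loop under this definition; the third class (missing residues) cannot be read from DSSP at all and has to come from the REMARK 465 records of the PDB file.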

45 DisEMBL: Derived propensity scale (implicit)

46 DisEMBL Output: Ero1-Lα

47 IUPred. Basic notion: globular proteins need to make a large number of inter-residue interactions to overcome the loss of entropy upon folding. The method: a 20 x 20 energy predictor matrix (pairwise interactions), derived from globular proteins; a quadratic expression in amino acid composition. Definitions: binary definition (order/disorder); two ranges: long ~ regions/domains, short ~ loops; domain prediction (inverse of long-range predictions). Dosztányi et al., Bioinformatics 2005
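"Quadratic expression in amino acid composition" can be read as E = Σᵢ Σⱼ nᵢ Pᵢⱼ nⱼ, where n is the vector of residue frequencies and P the energy predictor matrix. The sketch below only illustrates that form. It uses a two-letter toy alphabet and made-up matrix values, whereas the real method uses a fitted 20 x 20 matrix and evaluates the neighbourhood of each residue separately. All names here are hypothetical.

```python
from typing import Dict


def composition(seq: str, alphabet: str) -> Dict[str, float]:
    """Frequency of each letter of `alphabet` in `seq`."""
    if not seq:
        raise ValueError("empty sequence")
    return {aa: seq.count(aa) / len(seq) for aa in alphabet}


def estimated_energy(seq: str, alphabet: str,
                     matrix: Dict[str, Dict[str, float]]) -> float:
    """Quadratic form E = sum_i sum_j n_i * P_ij * n_j over the composition."""
    n = composition(seq, alphabet)
    return sum(n[a] * matrix[a][b] * n[b] for a in alphabet for b in alphabet)


# Toy alphabet and invented values: "A" pairs favourably, "B" unfavourably.
TOY_ALPHABET = "AB"
TOY_MATRIX = {
    "A": {"A": -1.0, "B": 0.0},
    "B": {"A": 0.0, "B": 1.0},
}

for toy_seq in ("AAAA", "AAAB", "AABB"):
    print(toy_seq, estimated_energy(toy_seq, TOY_ALPHABET, TOY_MATRIX))
```

A more negative estimate means the composition could form enough favourable contacts to fold; segments whose estimate stays high are the candidates for disorder.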

48 IUPred Output: Ero1-Lα (table columns: Position, Residue, Disorder Tendency)

49 FoldUnfold. Basic notion: globular proteins need to establish a high number of interactions to compensate for the loss of entropy during the folding process. The method: mean packing density, derived from globular proteins; ANN-based. Definitions: binary definition (order/disorder); two ranges: long ~ regions/domains, short ~ loops. Galzitskaya et al., Bioinformatics 2006 & Protein Science 2000

50 FoldUnfold Output: Ero1-Lα (list of segments reported as disordered)

51 Comparison of DisEMBL, IUPred and FoldUnfold: disordered residues

52 Ero1 example

53 Links: DisEMBL, IUPred, FoldUnfold


55 Exercise Step

JPred and Jnet: Protein Secondary Structure Prediction.

JPred and Jnet: Protein Secondary Structure Prediction. JPred and Jnet: Protein Secondary Structure Prediction www.compbio.dundee.ac.uk/jpred ...A I L E G D Y A S H M K... FUNCTION? Protein Sequence a-helix b-strand Secondary Structure Fold What is the difference

More information

BMC Structural Biology

BMC Structural Biology BMC Structural Biology BioMed Central Methodology article A generic method for assignment of reliability scores applied to solvent accessibility predictions Bent Petersen 1, Thomas Nordahl Petersen 1,

More information

Sequence Analysis '17 -- lecture Secondary structure 3. Sequence similarity and homology 2. Secondary structure prediction

Sequence Analysis '17 -- lecture Secondary structure 3. Sequence similarity and homology 2. Secondary structure prediction Sequence Analysis '17 -- lecture 16 1. Secondary structure 3. Sequence similarity and homology 2. Secondary structure prediction Alpha helix Right-handed helix. H-bond is from the oxygen at i to the nitrogen

More information

3D Structure Prediction with Fold Recognition/Threading. Michael Tress CNB-CSIC, Madrid

3D Structure Prediction with Fold Recognition/Threading. Michael Tress CNB-CSIC, Madrid 3D Structure Prediction with Fold Recognition/Threading Michael Tress CNB-CSIC, Madrid MREYKLVVLGSGGVGKSALTVQFVQGIFVDEYDPTIEDSY RKQVEVDCQQCMLEILDTAGTEQFTAMRDLYMKNGQGFAL VYSITAQSTFNDLQDLREQILRVKDTEDVPMILVGNKCDL

More information

Protein Folding Problem I400: Introduction to Bioinformatics

Protein Folding Problem I400: Introduction to Bioinformatics Protein Folding Problem I400: Introduction to Bioinformatics November 29, 2004 Protein biomolecule, macromolecule more than 50% of the dry weight of cells is proteins polymer of amino acids connected into

More information

CFSSP: Chou and Fasman Secondary Structure Prediction server

CFSSP: Chou and Fasman Secondary Structure Prediction server Wide Spectrum, Vol. 1, No. 9, (2013) pp 15-19 CFSSP: Chou and Fasman Secondary Structure Prediction server T. Ashok Kumar Department of Bioinformatics, Noorul Islam College of Arts and Science, Kumaracoil

More information

Introduction to Proteins

Introduction to Proteins Introduction to Proteins Lecture 4 Module I: Molecular Structure & Metabolism Molecular Cell Biology Core Course (GSND5200) Matthew Neiditch - Room E450U ICPH matthew.neiditch@umdnj.edu What is a protein?

More information

OPTIMIZING LONG INTRINSIC DISORDER PREDICTORS WITH PROTEIN EVOLUTIONARY INFORMATION

OPTIMIZING LONG INTRINSIC DISORDER PREDICTORS WITH PROTEIN EVOLUTIONARY INFORMATION Journal of Bioinformatics and Computational Biology Imperial College Press OPTIMIZING LONG INTRINSIC DISORDER PREDICTORS WITH PROTEIN EVOLUTIONARY INFORMATION KANG PENG 1, SLOBODAN VUCETIC 1, PREDRAG RADIVOJAC

More information

Structure formation and association of biomolecules. Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München

Structure formation and association of biomolecules. Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München Structure formation and association of biomolecules Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München Motivation Many biomolecules are chemically synthesized

More information

Protein Structure Prediction. christian studer , EPFL

Protein Structure Prediction. christian studer , EPFL Protein Structure Prediction christian studer 17.11.2004, EPFL Content Definition of the problem Possible approaches DSSP / PSI-BLAST Generalization Results Definition of the problem Massive amounts of

More information

Protein Structure Databases, cont. 11/09/05

Protein Structure Databases, cont. 11/09/05 11/9/05 Protein Structure Databases (continued) Prediction & Modeling Bioinformatics Seminars Nov 10 Thurs 3:40 Com S Seminar in 223 Atanasoff Computational Epidemiology Armin R. Mikler, Univ. North Texas

More information

ONLINE BIOINFORMATICS RESOURCES

ONLINE BIOINFORMATICS RESOURCES Dedan Githae Email: d.githae@cgiar.org BecA-ILRI Hub; Nairobi, Kenya 16 May, 2014 ONLINE BIOINFORMATICS RESOURCES Introduction to Molecular Biology and Bioinformatics (IMBB) 2014 The larger picture.. Lower

More information

Protein Sequence Analysis. BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl)

Protein Sequence Analysis. BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl) Protein Sequence Analysis BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl) Linear Sequence Analysis What can you learn from a (single) protein sequence? Calculate it s physical

More information

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Homology Modelling Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Why are Protein Structures so Interesting? They provide a detailed picture of interesting biological features,

More information

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Secondary Structure Prediction

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Secondary Structure Prediction CMPS 6630: Introduction to Computational Biology and Bioinformatics Secondary Structure Prediction Secondary Structure Annotation Given a macromolecular structure Identify the regions of secondary structure

More information

B CELL EPITOPES AND PREDICTIONS

B CELL EPITOPES AND PREDICTIONS B CELL EPITOPES AND PREDICTIONS OUTLINE What is a B-cell epitope? How can you predict B-cell epitopes? WHAT IS A B-CELL EPITOPE? B-cell epitopes: Accessible structural feature of a pathogen molecule. Antibodies

More information

SNVBox Basic User Documentation. Downloading the SNVBox database. The SNVBox database and SNVGet python script are available here:

SNVBox Basic User Documentation. Downloading the SNVBox database. The SNVBox database and SNVGet python script are available here: SNVBox Basic User Documentation Downloading the SNVBox database The SNVBox database and SNVGet python script are available here: http://wiki.chasmsoftware.org/index.php/download Installing SNVBox: Requirements:

More information

Protein 3D Structure Prediction

Protein 3D Structure Prediction Protein 3D Structure Prediction Michael Tress CNIO ?? MREYKLVVLGSGGVGKSALTVQFVQGIFVDE YDPTIEDSYRKQVEVDCQQCMLEILDTAGTE QFTAMRDLYMKNGQGFALVYSITAQSTFNDL QDLREQILRVKDTEDVPMILVGNKCDLEDER VVGKEQGQNLARQWCNCAFLESSAKSKINVN

More information

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Homology Modelling Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Why are Protein Structures so Interesting? They provide a detailed picture of interesting biological features,

More information

Proteins Higher Order Structures

Proteins Higher Order Structures Proteins Higher Order Structures Dr. Mohammad Alsenaidy Department of Pharmaceutics College of Pharmacy King Saud University Office: AA 101 msenaidy@ksu.edu.sa Previously on PHT 426!! Protein Structures

More information

NNvPDB: Neural Network based Protein Secondary Structure Prediction with PDB Validation

NNvPDB: Neural Network based Protein Secondary Structure Prediction with PDB Validation www.bioinformation.net Web server Volume 11(8) NNvPDB: Neural Network based Protein Secondary Structure Prediction with PDB Validation Seethalakshmi Sakthivel, Habeeb S.K.M* Department of Bioinformatics,

More information

Chapter 8. One-Dimensional Structural Properties of Proteins in the Coarse-Grained CABS Model. Sebastian Kmiecik and Andrzej Kolinski.

Chapter 8. One-Dimensional Structural Properties of Proteins in the Coarse-Grained CABS Model. Sebastian Kmiecik and Andrzej Kolinski. Chapter 8 One-Dimensional Structural Properties of Proteins in the Coarse-Grained CABS Model Abstract Despite the significant increase in computational power, molecular modeling of protein structure using

More information

Structural bioinformatics

Structural bioinformatics Structural bioinformatics Why structures? The representation of the molecules in 3D is more informative New properties of the molecules are revealed, which can not be detected by sequences Eran Eyal Plant

More information

Ab Initio SERVER PROTOTYPE FOR PREDICTION OF PHOSPHORYLATION SITES IN PROTEINS*

Ab Initio SERVER PROTOTYPE FOR PREDICTION OF PHOSPHORYLATION SITES IN PROTEINS* COMPUTATIONAL METHODS IN SCIENCE AND TECHNOLOGY 9(1-2) 93-100 (2003/2004) Ab Initio SERVER PROTOTYPE FOR PREDICTION OF PHOSPHORYLATION SITES IN PROTEINS* DARIUSZ PLEWCZYNSKI AND LESZEK RYCHLEWSKI BiolnfoBank

More information

Dynamic Programming Algorithms

Dynamic Programming Algorithms Dynamic Programming Algorithms Sequence alignments, scores, and significance Lucy Skrabanek ICB, WMC February 7, 212 Sequence alignment Compare two (or more) sequences to: Find regions of conservation

More information

Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction

Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction Seong-gon KIM Dept. of Computer & Information Science & Engineering, University of Florida Gainesville,

More information

MATH 5610, Computational Biology

MATH 5610, Computational Biology MATH 5610, Computational Biology Lecture 2 Intro to Molecular Biology (cont) Stephen Billups University of Colorado at Denver MATH 5610, Computational Biology p.1/24 Announcements Error on syllabus Class

More information

Protein-Protein Interactions I

Protein-Protein Interactions I Biochemistry 412 Protein-Protein Interactions I March 23, 2007 Macromolecular Recognition by Proteins Protein folding is a process governed by intramolecular recognition. Protein-protein association is

More information

NetTurnP Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

NetTurnP Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features Downloaded from orbit.dtu.dk on: Nov 26, 2017 NetTurnP Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features Petersen, Bent; Lundegaard, Claus;

More information

AB INITIO PROTEIN STRUCTURE PREDICTION ALGORITHMS

AB INITIO PROTEIN STRUCTURE PREDICTION ALGORITHMS San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Spring 2011 AB INITIO PROTEIN STRUCTURE PREDICTION ALGORITHMS Maciej Kicinski San Jose State University

More information

The Effect of Using Different Neural Networks Architectures on the Protein Secondary Structure Prediction

The Effect of Using Different Neural Networks Architectures on the Protein Secondary Structure Prediction The Effect of Using Different Neural Networks Architectures on the Protein Secondary Structure Prediction Hanan Hendy, Wael Khalifa, Mohamed Roushdy, Abdel Badeeh Salem Computer Science Department, Faculty

More information

Ensemble Prediction of Intrinsically Disordered Regions in Proteins

Ensemble Prediction of Intrinsically Disordered Regions in Proteins Ensemble Prediction of Intrinsically Disordered Regions in Proteins Ahmed Attia Stanford University Abstract Various methods exist for prediction of disorder in protein sequences. In this paper, the author

More information

Structure & Function. Ulf Leser

Structure & Function. Ulf Leser Proteins: Structure & Function Ulf Leser This Lecture Introduction Structure Function Databases Predicting Protein Secondary Structure Many figures from Zvelebil, M. and Baum, J. O. (2008). "Understanding

More information

Learning to Use PyMOL (includes instructions for PS #2)

Learning to Use PyMOL (includes instructions for PS #2) Learning to Use PyMOL (includes instructions for PS #2) To begin, download the saved PyMOL session file, 4kyz.pse from the Chem 391 Assignments web page: http://people.reed.edu/~glasfeld/chem391/assign.html

More information

Protein-Protein Interactions I

Protein-Protein Interactions I Biochemistry 412 Protein-Protein Interactions I March 11, 2008 Macromolecular Recognition by Proteins Protein folding is a process governed by intramolecular recognition. Protein-protein association is

More information

CSE : Computational Issues in Molecular Biology. Lecture 19. Spring 2004

CSE : Computational Issues in Molecular Biology. Lecture 19. Spring 2004 CSE 397-497: Computational Issues in Molecular Biology Lecture 19 Spring 2004-1- Protein structure Primary structure of protein is determined by number and order of amino acids within polypeptide chain.

More information

BIRKBECK COLLEGE (University of London)

BIRKBECK COLLEGE (University of London) BIRKBECK COLLEGE (University of London) SCHOOL OF BIOLOGICAL SCIENCES M.Sc. EXAMINATION FOR INTERNAL STUDENTS ON: Postgraduate Certificate in Principles of Protein Structure MSc Structural Molecular Biology

More information

FACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE

FACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE FACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE BIOMOLECULES COURSE: COMPUTER PRACTICAL 1 Author of the exercise: Prof. Lloyd Ruddock Edited by Dr. Leila Tajedin 2017-2018 Assistant: Leila Tajedin (leila.tajedin@oulu.fi)

More information

BLAST. compared with database sequences Sequences with many matches to high- scoring words are used for final alignments

BLAST. compared with database sequences Sequences with many matches to high- scoring words are used for final alignments BLAST 100 times faster than dynamic programming. Good for database searches. Derive a list of words of length w from query (e.g., 3 for protein, 11 for DNA) High-scoring words are compared with database

More information

Proteins the primary biological macromolecules of living organisms

Proteins the primary biological macromolecules of living organisms Proteins the primary biological macromolecules of living organisms Protein structure and folding Primary Secondary Tertiary Quaternary structure of proteins Structure of Proteins Protein molecules adopt

More information

BETA STRAND Prof. Alejandro Hochkoeppler Department of Pharmaceutical Sciences and Biotechnology University of Bologna

BETA STRAND Prof. Alejandro Hochkoeppler Department of Pharmaceutical Sciences and Biotechnology University of Bologna Prof. Alejandro Hochkoeppler Department of Pharmaceutical Sciences and Biotechnology University of Bologna E-mail: a.hochkoeppler@unibo.it C-ter NH and CO groups: right, left, right (plane of the slide)

More information

Protein annotation and modelling servers at University College London

Protein annotation and modelling servers at University College London Published online 27 May 2010 Nucleic Acids Research, 2010, Vol. 38, Web Server issue W563 W568 doi:10.1093/nar/gkq427 Protein annotation and modelling servers at University College London D. W. A. Buchan*,

More information

RNA does not adopt the classic B-DNA helix conformation when it forms a self-complementary double helix

RNA does not adopt the classic B-DNA helix conformation when it forms a self-complementary double helix Reason: RNA has ribose sugar ring, with a hydroxyl group (OH) If RNA in B-from conformation there would be unfavorable steric contact between the hydroxyl group, base, and phosphate backbone. RNA structure

More information

DNA & DNA : Protein Interactions BIBC 100

DNA & DNA : Protein Interactions BIBC 100 DNA & DNA : Protein Interactions BIBC 100 Sequence = Information Alphabet = language L,I,F,E LIFE DNA = DNA code A, T, C, G CAC=Histidine CAG=Glutamine GGG=Glycine Protein = Protein code 20 a.a. LIVE EVIL

More information

A Neural Network Method for Identification of RNA-Interacting Residues in Protein

A Neural Network Method for Identification of RNA-Interacting Residues in Protein Genome Informatics 15(1): 105 116 (2004) 105 A Neural Network Method for Identification of RNA-Interacting Residues in Protein Euna Jeong I-Fang Chung Satoru Miyano eajeong@ims.u-tokyo.ac.jp cif@ims.u-tokyo.ac.jp

More information

Sequence Based Function Annotation. Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University

Sequence Based Function Annotation. Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Sequence Based Function Annotation Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Usage scenarios for sequence based function annotation Function prediction of newly cloned

More information

Outline. Evolution. Adaptive convergence. Common similarity problems. Chapter 7: Similarity searches on sequence databases

Outline. Evolution. Adaptive convergence. Common similarity problems. Chapter 7: Similarity searches on sequence databases Chapter 7: Similarity searches on sequence databases All science is either physics or stamp collection. Ernest Rutherford Outline Why is similarity important BLAST Protein and DNA Interpreting BLAST Individualizing

More information

Introduction to Bioinformatics Online Course: IBT

Introduction to Bioinformatics Online Course: IBT Introduction to Bioinformatics Online Course: IBT Multiple Sequence Alignment Building Multiple Sequence Alignment Lec6:Interpreting Your Multiple Sequence Alignment Interpreting Your Multiple Sequence

More information

BCH222 - Greek Key β Barrels

BCH222 - Greek Key β Barrels BCH222 - Greek Key β Barrels Reading C.I. Branden and J. Tooze (1999) Introduction to Protein Structure, Second Edition, pp. 77-78 & 335-336 (look at the color figures) J.S. Richardson (1981) "The Anatomy

More information

Germ-line vs somatic-variation theories

Germ-line vs somatic-variation theories BME 128 Tuesday April 26 (1) Filling in the gaps Antibody diversity, how is it achieved? - by specialised (!) mechanisms Chp6 (Protein Diversity & Sequence Analysis) - more about the main concepts in this

More information

SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring

SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and Linear Motif Scoring University of South Carolina Scholar Commons Faculty Publications Computer Science and Engineering, Department of 2013 SeqNLS: Nuclear Localization Signal Prediction Based on Frequent Pattern Mining and

More information

Chapter 8 DNA Recognition in Prokaryotes by Helix-Turn-Helix Motifs

Chapter 8 DNA Recognition in Prokaryotes by Helix-Turn-Helix Motifs Chapter 8 DNA Recognition in Prokaryotes by Helix-Turn-Helix Motifs 1. Helix-turn-helix proteins 2. Zinc finger proteins 3. Leucine zipper proteins 4. Beta-scaffold factors 5. Others λ-repressor AND CRO

More information

Community-assisted genome annotation: The Pseudomonas example. Geoff Winsor, Simon Fraser University Burnaby (greater Vancouver), Canada

Community-assisted genome annotation: The Pseudomonas example. Geoff Winsor, Simon Fraser University Burnaby (greater Vancouver), Canada Community-assisted genome annotation: The Pseudomonas example Geoff Winsor, Simon Fraser University Burnaby (greater Vancouver), Canada Overview Pseudomonas Community Annotation Project (PseudoCAP) Past

More information

Ranking Beta Sheet Topologies of Proteins

Ranking Beta Sheet Topologies of Proteins , October 20-22, 2010, San Francisco, USA Ranking Beta Sheet Topologies of Proteins Rasmus Fonseca, Glennie Helles and Pawel Winter Abstract One of the challenges of protein structure prediction is to

More information

Alpha-helices, beta-sheets and U-turns within a protein are stabilized by (hint: two words).

Alpha-helices, beta-sheets and U-turns within a protein are stabilized by (hint: two words). 1 Quiz1 Q1 2011 Alpha-helices, beta-sheets and U-turns within a protein are stabilized by (hint: two words) Value Correct Answer 1 noncovalent interactions 100% Equals hydrogen bonds (100%) Equals H-bonds

More information

Creation of a PAM matrix

Creation of a PAM matrix Rationale for substitution matrices Substitution matrices are a way of keeping track of the structural, physical and chemical properties of the amino acids in proteins, in such a fashion that less detrimental

More information

Computational analysis and prediction of protein- RNA interactions

Computational analysis and prediction of protein- RNA interactions Graduate Theses and Dissertations Graduate College 2008 Computational analysis and prediction of protein- RNA interactions Michael Terribilini Iowa State University Follow this and additional works at:

More information

VL Algorithmische BioInformatik (19710) WS2013/2014 Woche 3 - Mittwoch

VL Algorithmische BioInformatik (19710) WS2013/2014 Woche 3 - Mittwoch VL Algorithmische BioInformatik (19710) WS2013/2014 Woche 3 - Mittwoch Tim Conrad AG Medical Bioinformatics Institut für Mathematik & Informatik, Freie Universität Berlin Vorlesungsthemen Part 1: Background

More information
