PROTEINS: Structure, Function, and Bioinformatics
PREDICTION REPORT

Template-based and free modeling by RAPTOR11 in CASP8

Jinbo Xu,* Jian Peng, and Feng Zhao
Toyota Technological Institute at Chicago, Illinois

ABSTRACT

We developed RAPTOR11 and tested it in CASP8 for protein structure prediction. RAPTOR11 contains four modules: threading, model quality assessment, multiple protein alignment, and template-free modeling. RAPTOR11 first threads a target protein to all the templates using three methods and then predicts the quality of the 3D model implied by each alignment using a model quality assessment method. Based on the predicted quality, RAPTOR11 chooses among three strategies. If multiple alignments have good quality, RAPTOR11 builds a multiple protein alignment between the target and the top templates and then generates a 3D model using MODELLER. If all the alignments have very low quality, RAPTOR11 uses template-free modeling. Otherwise, RAPTOR11 submits the threading-generated 3D model with the best predicted quality. RAPTOR11 was not ready for the first third of the targets and was under development during the whole CASP8 season. The template-based and template-free modeling modules in RAPTOR11 are not closely integrated. We are using our template-free modeling technique to refine template-based models.

Proteins 2009; 77(Suppl 9): © 2009 Wiley-Liss, Inc.

Key words: CASP; template-based modeling; template-free modeling; protein threading; model quality assessment.

Additional Supporting Information may be found in the online version of this article.
The authors state no conflict of interest.
*Correspondence to: Jinbo Xu, Toyota Technological Institute at Chicago, IL. E-mail: j3xu@tti-c.org
Received 10 March 2009; Revised 4 July 2009; Accepted 22 July 2009. Published online 5 August 2009 in Wiley InterScience. DOI: /prot

INTRODUCTION

Computational methods for protein structure prediction can be broadly classified into two categories: template-based modeling and template-free modeling. Although progress has been made in template-based modeling, several challenges remain, including the identification of correct templates and the generation of accurate alignments. Template-based modeling becomes unreliable when a target protein has a low sequence identity (<30%) with its best templates.[1] Pieper et al.[2] have shown that 76% of the models in MODBASE are from alignments in which the sequence and template share less than 30% sequence identity.

One of the major bottlenecks in template-free modeling is that the conformation space of even a small protein is too big to be explored efficiently. To overcome this, a number of methods have been proposed, including fragment assembly[3,4] and lattice models.[5,6] These methods reduce the search space by using a discrete representation of a protein conformation, which may lead to a loss of prediction accuracy regardless of the sampling algorithm and energy function. This discrete nature may exclude native-like conformations from the search space because even a small change in a single backbone angle could result in a totally different fold. Efficient sampling in a continuous space of protein-like conformations is still an important unsolved problem.

We have developed RAPTOR11, a new protein structure prediction method, to address the aforementioned issues. RAPTOR11 is much more powerful than our threading program RAPTOR.[7,8] In RAPTOR11, we generate sequence-template alignments using three different threading methods and rank them using a model quality assessment method. Then, we employ multiple templates to model an easy target. To deal with targets without identifiable templates, we developed a novel template-free modeling method that can efficiently sample protein conformations in a continuous space. In this article, we briefly describe RAPTOR11, summarize its predictions in CASP8, present some specific examples, and discuss its strengths and weaknesses.

METHODS

Threading

RAPTOR11 has three threading methods with different scoring functions and alignment algorithms. Two of the three methods are core-based, whereas the third one is not. As in the old RAPTOR, a core-based method does not allow gaps in core regions.[5] The two core-based methods differ in whether pairwise statistical potentials are used in their scoring functions. The noncore-based method does not use pairwise statistical potentials in its scoring function. In particular, the core-based pairwise threading method uses a scoring function consisting of gap penalty, mutation score, secondary structure score, singleton score, and pairwise score. The pairwise statistical potential is the one derived by McConkey et al.,[6] and the other scoring items are taken from RAPTOR.[5] The two nonpairwise methods use a similar scoring function without the pairwise potential. We trained three different sets of weight factors for these scoring functions using the method in Ref. 5.

The major reason for using the McConkey potential is to introduce diversity. The McConkey potential and the RAPTOR pairwise interaction potential differ mainly in how they define an inter-residue interaction. The McConkey potential also has its own parameters for the singleton score. Our original plan was to use these two potentials separately to generate alternative alignments. However, because of limited computing power, we used only the McConkey potential for CASP8. Some very preliminary studies indicate that the McConkey potential has an alignment accuracy similar to that of the RAPTOR potential, but the two can generate alternative alignments for a given protein pair. Tested on the ProSup benchmark,[7] the reference-dependent alignment accuracy of a single threading method is 61.0%.
This accuracy can be improved to 68.0% if the three threading methods are combined using our model quality assessment method described below.

Model quality assessment

Different from many methods that directly evaluate the quality of a 3D model,[9-17] our model assessment method evaluates the absolute and global quality, measured by GDT-TS or TM-score,[8] of the 3D model implied by an alignment without actually building that model using MODELLER. To the best of our knowledge, our method is the first to exploit only the evolutionary information in an alignment for model assessment. Because no 3D model has to be built for quality assessment, a lot of model-building time is saved. Trained on the RAPTOR-generated CASP6 data and tested on the CASP7 data, the MAE (mean absolute error) of the predicted GDT-TS is [...] and the Pearson correlation coefficient between the predicted and the real GDT-TS is [...]. This model assessment method is built upon our previous work,[18] which uses Support Vector Machines to predict the number of correctly aligned positions in an alignment. To assess model quality, our method uses a set of alignment-based features: the distributions of the per-position sequence similarity score, contact capacity score, and environmental fitness score; the distribution of gap lengths in the alignment; the secondary structure score; the solvent accessibility score; and the sequence identity.

Multiple-template method

If there are at least two very good templates for a target protein, we generate a multiple protein alignment and then build a 3D model from this alignment using MODELLER.[9] The multiple-template method has been exploited by several groups, such as Joo et al.[10] and Cheng,[11] in recent CASP events. The major challenge is to choose good templates and to generate multiple protein alignments. We always use the top two templates and then enumerate all the possible combinations of the remaining top templates.
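The enumeration rule just described can be sketched as follows. This is a minimal illustration under stated assumptions, not RAPTOR11's actual code: the function name `enumerate_template_sets`, the `max_templates` cap on set size, and the template identifiers are hypothetical, and the templates are assumed to arrive already ranked by predicted quality. Treating the top two templates on their own as one of the candidate sets is also an assumption.

```python
from itertools import combinations


def enumerate_template_sets(ranked_templates, max_templates=5):
    """Yield candidate template sets for multiple-template modeling.

    Every set contains the two top-ranked templates; the remaining slots
    are filled with each possible combination of the lower-ranked ones.
    `max_templates` is an assumed cap on the size of any set.
    """
    if len(ranked_templates) < 2:
        raise ValueError("need at least two templates")
    if max_templates < 2:
        raise ValueError("max_templates must be at least 2")

    core = tuple(ranked_templates[:2])   # always kept
    rest = ranked_templates[2:]          # optional additions

    # Add 0, 1, 2, ... extra templates until the cap is reached.
    for n_extra in range(max_templates - 2 + 1):
        for extra in combinations(rest, n_extra):
            yield core + extra


# Hypothetical ranked template identifiers, best first.
candidate_sets = list(enumerate_template_sets(["tplA", "tplB", "tplC", "tplD"]))
```

Each candidate set would then go through the alignment and model-building steps, so the cap directly bounds how many models have to be built per target.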
To save computing time, at most five templates are used in any combination. For a given set of multiple templates, TM-align[8] is used to generate a structure alignment between every pair of templates. Then, T-Coffee[12] is used to combine all the sequence-template alignments and structure alignments into a single multiple protein alignment.

We used a very conservative strategy to rank models built from multiple templates because using multiple templates sometimes generates worse models. A multiple-template-based model is assumed to be better than another multiple-template-based model or a single-template-based model if and only if it has both better ProQ[13] and DFIRE[14] values. However, this ranking method sometimes failed to identify the best models in CASP8. We chose TM-align, T-Coffee, ProQ, and DFIRE because they are easily accessible. In the future, we will systematically compare our method with other similar methods.

Template-free modeling

We have developed a template-free modeling method, as detailed in Refs. 15 and 16. Our method employs conditional (Markov) random fields (CRFs) and directional statistics to model the protein sequence-structure relationship. It models the backbone angle distribution at each residue using an FB5 distribution[17] and samples backbone angles from sequence information using the CRF. Unlike the widely used fragment assembly and lattice model methods, which explore protein conformations in a discrete space, our method explores protein conformations in a continuous space according to their probability. The probability of a protein conformation reflects its stability and is estimated from the PSI-BLAST sequence profile and the PSIPRED-predicted secondary structure. Our template-free modeling module drives conformation optimization with a simple energy function consisting of Sali's DOPE,[19,20] Baker's KMBhbond,[21] and, later, a simplified solvent accessibility potential.[22] Our experimental results in Ref. 16 indicate that, although it samples in a continuous space and uses a very simple energy function, our new method compares favorably with the fragment assembly method (e.g., Robetta) and the lattice model (i.e., TOUCHSTONE II).

Multidomain proteins

When a target protein is large and may contain multiple domains, we first parse it into several possible domains by searching the Pfam database[23] using HMMER.[24,25] If the whole target can be aligned to a single template, domain parsing is skipped. If a big chunk of the target is not aligned to any top template, we treat this unaligned chunk as a separate target and model it separately. Except for the last several CASP8 targets, the models for multiple domains were not assembled into a single coordinate system. This explains why Zhang's assessment† indicates that our models for multidomain targets may contain atomic clashes when our domain boundary differs from Zhang's.

RESULTS AND DISCUSSION

Summary

Table I summarizes the results of RAPTOR11 in CASP8. CASP8 defined 164 effective domains and classified them into three categories, whereas Shi et al.[26] defined 146 domains and classified them into five categories.‡
As shown in Columns 2-4, for TBM-HA targets, the difference between the first and the best models by RAPTOR11 is small. In contrast, the best models generated by RAPTOR11 for TBM and FM targets are much better than the first models. This indicates that we still need to improve our model selection method for TBM and FM targets. As shown in Columns 4-6, for TBM-HA targets, the best models generated by RAPTOR11 are not very far from the best models submitted by all the CASP8 servers. However, for TBM and FM targets, the best models submitted by all CASP8 servers are much better than the best models generated by RAPTOR11. This means that, in addition to improving model selection, we also need to further improve our model generation method for TBM and FM targets. Similar observations hold when Grishin's domain definition and classification are used.

Table I. Summarized Results of RAPTOR11 Predictions in CASP8

Columns: Category (#), R1, RB, RBAll, S1, SB

CASP8 official domain definition
  TBM-HA (50)
  TBM (104)
  FM (13)

Grishin's domain definition and classification
  CM easy (36)
  CM medium (45)
  CM hard (30)
  FR (30)
  FM (5)

The upper half of the table contains the results for the 164 CASP8 official domains and the lower half contains the results for the 146 domains by Grishin's definition (prodata.swmed.edu/casp8/evaluation/casp8home.htm). R1 represents the GDT-TS score sum of the first-ranked models by RAPTOR; RB represents the GDT-TS score sum of the best models submitted by RAPTOR; RBAll represents the GDT-TS score sum of the best models generated by RAPTOR; S1 represents the GDT-TS score sum of the best first models submitted by all servers; SB represents the GDT-TS score sum of the best models submitted by all servers.

What went right?

The model quality assessment method helps a lot in improving RAPTOR's performance on the TBM targets, as opposed to RAPTOR in CASP7, which did not perform well in this category.
In fact, Randall and Baldi demonstrated that the performance of RAPTOR in CASP7 could be greatly improved by simply re-ranking the top five models using SELECTpro.[27] A typical example is T0429. The third model of RAPTOR11 for this target is much better than other server models, but RAPTOR's old template selection method failed to rank the third model first, although RAPTOR's first model is still pretty good. Using our new model quality assessment method, we can rank the third model first. See Figure S1 in Supporting Information for these two models of T0429.

The multiple-template method sometimes helps improve the modeling of easy targets. This method is likely to improve model quality when two conditions are satisfied. One is that some gapped regions in the alignment to one template can be covered by the alignment to another template. The other is that the multiple templates are structurally very similar. If either of these two conditions is not satisfied, the multiple-template method may produce models of worse quality. For example, RAPTOR11 generated the best model for T0486 using four similar templates: 2ppyA, 1q52A, 2hw5A, and 2pbpA. The GDT-TS of this model is around [...] higher than that of the single-template (2ppyA) based model. By using these four templates we can cover more of T0486 than with any single template. See Figures S2-1, S2-2, and S2-3 in Supporting Information for the alignments and 3D models for T0486.

Our template-free modeling method samples protein conformations in a continuous space without using fragments from the PDB. Our method aims to overcome two major issues with the currently popular fragment assembly and lattice model methods. One issue is that, by sampling in a discrete space, a lattice method may exclude the native structure from the search space, since a small change in a backbone angle may result in a totally different fold. The other issue is that there is no guarantee that the local structure of a protein with a new fold can be covered by even medium-sized fragments, as a new fold may be composed of rarely occurring supersecondary structure motifs (Andras Fisher, CASP8 talk).

Compared with the Robetta server (see Table III in Ref. 16), our method performs very well on mainly-alpha proteins, e.g., T0460, T0496_D1, and T0496_D2, as shown in Figures S3, S4-1, S4-2, and S4-3 in Supporting Information, respectively. This is not surprising, as our CRF model captures the local sequence-structure relationship well. Our method also works well on small, mainly-beta proteins. For example, our method is better than Robetta on T0480 and T0510_D3, as shown in Figures S5 and S6 in Supporting Information, respectively. However, our method does not fare well on a relatively large protein (>100 residues) with a few beta strands, e.g., T0482 and T0513_D2. This is probably because our CRF method can model only the local sequence-structure relationship, whereas a beta sheet is stabilized by nonlocal hydrogen bonding. Although it samples in a continuous space, our method can still efficiently search the conformation space of a small beta protein. However, for a large protein with a few beta sheets, the search space is too big to be explored by our continuous conformation sampling algorithm.
It is also worth noting that, compared with Robetta, our method works well on T0397_D1 (see Fig. S7 in Supporting Information) and T0496_D1, which, according to Nick Grishin, are the only two CASP8 targets with really new folds.

What went wrong?

RAPTOR11 contains both template-based and template-free modeling modules, so it needs a rule to decide when to use template-free modeling and when to use template-based modeling. RAPTOR11 used the predicted GDT-TS for this decision, but the predicted GDT-TS is not accurate enough and sometimes misleads RAPTOR11. RAPTOR11 used template-free modeling if the best predicted GDT-TS was less than [...]. For some targets, such as T0496_D1 and T0510_D3, RAPTOR11 correctly submitted template-free models, which are much better than their template-based models. However, RAPTOR11 incorrectly submitted template-free models for some targets (e.g., T0480 and T0496_D2) although they have better template-based models.

When the multiple-template method was used, RAPTOR11 sometimes failed to identify the best 3D models using ProQ and DFIRE. A better model quality assessment method is urgently needed for this purpose.

Another issue is that RAPTOR11 did not update its template database during the whole CASP8 season, so RAPTOR11 missed the best template (2zf8A) for T0514, which was deposited to the PDB in July [...].

ACKNOWLEDGMENTS

The authors thank Xin Gao for his work in setting up the RAPTOR11 web server and running RAPTOR11 for the first 20 CASP8 targets, and Tobin Sosnick, Karl Freed, Joe DeBartolo, and Brendan McConkey for their help with the development of RAPTOR11. This work is supported by TTI-C internal research funding and NIH grant R01GM[...]. This work was made possible by the facilities of the Shared Hierarchical Academic Research Computing Network (SHARCNET: www.sharcnet.ca), the Open Science Grid Engagement VO, and the University of Chicago Computation Institute. The authors are also grateful to Dr. Ming Li, Dr. Ian Foster, Dr. John McGee, and Mats Rynge for their help with computational resources.

REFERENCES

1. Baker D, Sali A. Protein structure prediction and structural genomics. Science 2001;294.
2. Pieper U, Eswar N, Davis FP, Braberg H, Madhusudhan MS, Rossi A, Marti-Renom M, Karchin R, Webb BM, Eramian D, Shen MY, Kelly L, Melo F, Sali A. MODBASE: a database of annotated comparative protein structure models and associated resources. Nucleic Acids Res 2006;34:D291.
3. Kim DE, Chivian D, Baker D. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res 2004;32:W526.
4. Zhou HY, Skolnick J. Ab initio protein structure prediction using Chunk-TASSER. Biophys J 2007;93.
5. Xu J, Li M, Kim D, Xu Y. RAPTOR: optimal protein threading by linear programming. J Bioinform Comput Biol 2003;1.
6. McConkey BJ, Sobolev V, Edelman M. Discrimination of native protein structures using atom-atom contact scoring. Proc Natl Acad Sci USA 2003;100.
7. Lackner P, Koppensteiner WA, Sippl MJ, Domingues FS. ProSup: a refined tool for protein structure alignment. Protein Eng 2000;13.
8. Zhang Y, Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 2005;33.
9. Sali A. Comparative protein modeling by satisfaction of spatial restraints. Mol Med Today 1995;1.
10. Joo K, Lee J, Lee S, Seo JH, Lee SJ, Lee J. High accuracy template based modeling by global optimization. Proteins 2007;69.
11. Cheng J. A multi-template combination algorithm for protein comparative modeling. BMC Struct Biol 2008;8.
12. Poirot O, O'Toole E, Notredame C. Tcoffee@igs: a web server for computing, evaluating and combining multiple sequence alignments. Nucleic Acids Res 2003;31.
13. Wallner B, Elofsson A. Can correct protein models be identified? Protein Sci 2003;12.
14. Zhou H, Zhou Y. Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci 2002;11; 2003;12.
15. Zhao F, Li SC, Sterner BW, Xu J. Discriminative learning for protein conformation sampling. Proteins 2008;73.
16. Zhao F, Peng J, DeBartolo J, Freed KF, Sosnick TR, Xu J. A probabilistic graphical model for ab initio folding. Proc 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB), Lecture Notes in Computer Science, Vol. 5541. Springer.
17. Kent JT. The Fisher-Bingham distribution on the sphere. J Roy Stat Soc B 1982;44.
18. Xu J. Protein fold recognition by predicted alignment accuracy. IEEE/ACM Trans Comput Biol Bioinform 2005;2.
19. Fitzgerald JE, Jha AK, Colubri A, Sosnick TR, Freed KF. Reduced C-beta statistical potentials can outperform all-atom potentials in decoy identification. Protein Sci 2007;16.
20. Shen M, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci 2006;15.
21. Morozov AV, Kortemme T, Tsemekhman K, Baker D. Close agreement between the orientation dependence of hydrogen bonds observed in protein structures and quantum mechanical calculations. Proc Natl Acad Sci USA 2004;101.
22. Fernandez A, Sosnick TR, Colubri A. Dynamics of hydrogen bond desolvation in protein folding. J Mol Biol 2002;321.
23. Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G, Forslund K, Eddy SR, Sonnhammer EL, Bateman A. The Pfam protein families database. Nucleic Acids Res 2008;36:D281.
24. Eddy SR. Profile hidden Markov models. Bioinformatics 1998;14.
25. Krogh A, Brown M, Mian IS, Sjolander K, Haussler D. Hidden Markov models in computational biology: applications to protein modeling. J Mol Biol 1994;235.
26. Shi S, Pei J, Sadreyev RI, Kinch LN, Majumdar I, Tong J, Cheng H, Kim B-H, Grishin NV. Analysis of CASP8 targets, predictions and assessment methods. Database 2009;2009:bap003.
27. Randall A, Baldi P. SELECTpro: effective protein model selection using a structure-based energy function resistant to BLUNDERs. BMC Struct Biol 2008;8:52.


More information

Exploring Suboptimal Sequence Alignments and Scoring Functions in Comparative Protein Structural Modeling

Exploring Suboptimal Sequence Alignments and Scoring Functions in Comparative Protein Structural Modeling Exploring Suboptimal Sequence Alignments and Scoring Functions in Comparative Protein Structural Modeling Presented by Kate Stafford 1,2 Research Mentor: Troy Wymore 3 1 Bioengineering and Bioinformatics

More information

A SOFTWARE PIPELINE FOR PROTEIN STRUCTURE PREDICTION

A SOFTWARE PIPELINE FOR PROTEIN STRUCTURE PREDICTION A SOFTWARE PIPELINE FOR PROTEIN STRUCTURE PREDICTION Michael S. Lee Computational Sciences and Engineering Branch U. S. Army Research Laboratory Department of Cell Biology and Biochemistry U. S. Army Medical

More information

Protein Structure Prediction. christian studer , EPFL

Protein Structure Prediction. christian studer , EPFL Protein Structure Prediction christian studer 17.11.2004, EPFL Content Definition of the problem Possible approaches DSSP / PSI-BLAST Generalization Results Definition of the problem Massive amounts of

More information

SMISS: A protein function prediction server by integrating multiple sources

SMISS: A protein function prediction server by integrating multiple sources SMISS 1 SMISS: A protein function prediction server by integrating multiple sources Renzhi Cao 1, Zhaolong Zhong 1 1, 2, 3, *, and Jianlin Cheng 1 Department of Computer Science, University of Missouri,

More information

Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction

Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction Conformational Analysis 2 Conformational Analysis Properties of molecules depend on their three-dimensional

More information

Protein structure. Wednesday, October 4, 2006

Protein structure. Wednesday, October 4, 2006 Protein structure Wednesday, October 4, 2006 Introduction to Bioinformatics Johns Hopkins School of Public Health 260.602.01 J. Pevsner pevsner@jhmi.edu Copyright notice Many of the images in this powerpoint

More information

3D Structure Prediction with Fold Recognition/Threading. Michael Tress CNB-CSIC, Madrid

3D Structure Prediction with Fold Recognition/Threading. Michael Tress CNB-CSIC, Madrid 3D Structure Prediction with Fold Recognition/Threading Michael Tress CNB-CSIC, Madrid MREYKLVVLGSGGVGKSALTVQFVQGIFVDEYDPTIEDSY RKQVEVDCQQCMLEILDTAGTEQFTAMRDLYMKNGQGFAL VYSITAQSTFNDLQDLREQILRVKDTEDVPMILVGNKCDL

More information

Computational Methods for Protein Structure Prediction and Fold Recognition... 1 I. Cymerman, M. Feder, M. PawŁowski, M.A. Kurowski, J.M.

Computational Methods for Protein Structure Prediction and Fold Recognition... 1 I. Cymerman, M. Feder, M. PawŁowski, M.A. Kurowski, J.M. Contents Computational Methods for Protein Structure Prediction and Fold Recognition........................... 1 I. Cymerman, M. Feder, M. PawŁowski, M.A. Kurowski, J.M. Bujnicki 1 Primary Structure Analysis...................

More information

Large Scale Enzyme Func1on Discovery: Sequence Similarity Networks for the Protein Universe

Large Scale Enzyme Func1on Discovery: Sequence Similarity Networks for the Protein Universe Large Scale Enzyme Func1on Discovery: Sequence Similarity Networks for the Protein Universe Boris Sadkhin University of Illinois, Urbana-Champaign Blue Waters Symposium May 2015 Overview The Protein Sequence

More information

Protein function prediction using sequence motifs: A research proposal

Protein function prediction using sequence motifs: A research proposal Protein function prediction using sequence motifs: A research proposal Asa Ben-Hur Abstract Protein function prediction, i.e. classification of protein sequences according to their biological function

More information

Bioinformatics Practical Course. 80 Practical Hours

Bioinformatics Practical Course. 80 Practical Hours Bioinformatics Practical Course 80 Practical Hours Course Description: This course presents major ideas and techniques for auxiliary bioinformatics and the advanced applications. Points included incorporate

More information

Protein 3D Structure Prediction

Protein 3D Structure Prediction Protein 3D Structure Prediction Michael Tress CNIO ?? MREYKLVVLGSGGVGKSALTVQFVQGIFVDE YDPTIEDSYRKQVEVDCQQCMLEILDTAGTE QFTAMRDLYMKNGQGFALVYSITAQSTFNDL QDLREQILRVKDTEDVPMILVGNKCDLEDER VVGKEQGQNLARQWCNCAFLESSAKSKINVN

More information

CS273: Algorithms for Structure Handout # 5 and Motion in Biology Stanford University Tuesday, 13 April 2004

CS273: Algorithms for Structure Handout # 5 and Motion in Biology Stanford University Tuesday, 13 April 2004 CS273: Algorithms for Structure Handout # 5 and Motion in Biology Stanford University Tuesday, 13 April 2004 Lecture #5: 13 April 2004 Topics: Sequence motif identification Scribe: Samantha Chui 1 Introduction

More information

BIOINFORMATICS ORIGINAL PAPER doi: /bioinformatics/btn069

BIOINFORMATICS ORIGINAL PAPER doi: /bioinformatics/btn069 Vol. 24 no. 7 28, pages 924 93 BIOINFORMATICS ORIGINAL PAPER doi:.93/bioinformatics/btn69 Structural bioinformatics A comprehensive assessment of sequence-based and template-based methods for protein contact

More information

Textbook Reading Guidelines

Textbook Reading Guidelines Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum Last updated: January 16, 2013 Textbook Reading Guidelines Preface: Read the whole preface, and especially: For the students with Life Science

More information

Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction

Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction Seong-gon KIM Dept. of Computer & Information Science & Engineering, University of Florida Gainesville,

More information

Ph.D. in Information and Computer Science (Area: Bioinformatics), University of California, Irvine, August, (Advisor: Dr.

Ph.D. in Information and Computer Science (Area: Bioinformatics), University of California, Irvine, August, (Advisor: Dr. Jianlin Cheng Assistant Professor School of Electrical Engineering and Computer Science University of Central Florida Orlando, FL 32816 Phone: (407) 968-9746 Email: jianlin.cheng@gmail.com Web: http://www.eecs.ucf.edu/~jcheng

More information

Sequence Analysis '17 -- lecture Secondary structure 3. Sequence similarity and homology 2. Secondary structure prediction

Sequence Analysis '17 -- lecture Secondary structure 3. Sequence similarity and homology 2. Secondary structure prediction Sequence Analysis '17 -- lecture 16 1. Secondary structure 3. Sequence similarity and homology 2. Secondary structure prediction Alpha helix Right-handed helix. H-bond is from the oxygen at i to the nitrogen

More information

Computational Methods for Protein Structure Prediction

Computational Methods for Protein Structure Prediction Computational Methods for Protein Structure Prediction Ying Xu 2017/12/6 1 Outline introduction to protein structures the problem of protein structure prediction why it is possible to predict protein structures

More information

Protein Tertiary Model Assessment Using Granular Machine Learning Techniques

Protein Tertiary Model Assessment Using Granular Machine Learning Techniques Georgia State University ScholarWorks @ Georgia State University Computer Science Dissertations Department of Computer Science 3-21-2012 Protein Tertiary Model Assessment Using Granular Machine Learning

More information

STRUM: Structure-based stability change prediction on. single-point mutation

STRUM: Structure-based stability change prediction on. single-point mutation STRUM: Structure-based stability change prediction on single-point mutation Lijun Quan, Qiang Lv, Yang Zhang Supplemental Information Figure S1. Histogram distribution of TM-score of the I-TASSER models

More information

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Homology Modelling Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Why are Protein Structures so Interesting? They provide a detailed picture of interesting biological features,

More information

4/10/2011. Rosetta software package. Rosetta.. Conformational sampling and scoring of models in Rosetta.

4/10/2011. Rosetta software package. Rosetta.. Conformational sampling and scoring of models in Rosetta. Rosetta.. Ph.D. Thomas M. Frimurer Novo Nordisk Foundation Center for Potein Reseach Center for Basic Metabilic Research Breif introduction to Rosetta Rosetta docking example Rosetta software package Breif

More information

RosettainCASP4:ProgressinAbInitioProteinStructure Prediction

RosettainCASP4:ProgressinAbInitioProteinStructure Prediction PROTEINS: Structure, Function, and Genetics Suppl 5:119 126 (2001) DOI 10.1002/prot.1170 RosettainCASP4:ProgressinAbInitioProteinStructure Prediction RichardBonneau, 1 JerryTsai, 1 IngoRuczinski, 1 DylanChivian,

More information

APPLYING FEATURE-BASED RESAMPLING TO PROTEIN STRUCTURE PREDICTION

APPLYING FEATURE-BASED RESAMPLING TO PROTEIN STRUCTURE PREDICTION APPLYING FEATURE-BASED RESAMPLING TO PROTEIN STRUCTURE PREDICTION Trent Higgs 1, Bela Stantic 1, Md Tamjidul Hoque 2 and Abdul Sattar 13 1 Institute for Integrated and Intelligent Systems (IIIS), Grith

More information

proteins Massive integration of diverse protein quality assessment methods to improve template

proteins Massive integration of diverse protein quality assessment methods to improve template proteins STRUCTURE O FUNCTION O BIOINFORMATICS Massive integration of diverse protein quality assessment methods to improve template based modeling in CASP11 Renzhi Cao, 1 Debswapna Bhattacharya, 1 Badri

More information

I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure

I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure W306 W310 Nucleic Acids Research, 2005, Vol. 33, Web Server issue doi:10.1093/nar/gki375 I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure Emidio Capriotti,

More information

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Secondary Structure Prediction

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Secondary Structure Prediction CMPS 6630: Introduction to Computational Biology and Bioinformatics Secondary Structure Prediction Secondary Structure Annotation Given a macromolecular structure Identify the regions of secondary structure

More information

Protein Domain Boundary Prediction from Residue Sequence Alone using Bayesian Neural Networks

Protein Domain Boundary Prediction from Residue Sequence Alone using Bayesian Neural Networks Protein Domain Boundary Prediction from Residue Sequence Alone using Bayesian s DAVID SACHEZ SPIROS H. COURELLIS Department of Computer Science Department of Computer Science California State University

More information

Protein structure prediction in 2002 Jack Schonbrun, William J Wedemeyer and David Baker*

Protein structure prediction in 2002 Jack Schonbrun, William J Wedemeyer and David Baker* sb120317.qxd 24/05/02 16:00 Page 348 348 Protein structure prediction in 2002 Jack Schonbrun, William J Wedemeyer and David Baker* Central issues concerning protein structure prediction have been highlighted

More information

Distributions of Beta Sheets in Proteins With Application to Structure Prediction

Distributions of Beta Sheets in Proteins With Application to Structure Prediction PROTEINS: Structure, Function, and Genetics 48:85 97 (2002) Distributions of Beta Sheets in Proteins With Application to Structure Prediction Ingo Ruczinski, 1,2 * Charles Kooperberg, 2 Richard Bonneau,

More information

CAFASP3: The Third Critical Assessment of Fully Automated Structure Prediction Methods

CAFASP3: The Third Critical Assessment of Fully Automated Structure Prediction Methods PROTEINS: Structure, Function, and Genetics 53:503 516 (2003) CAFASP3: The Third Critical Assessment of Fully Automated Structure Prediction Methods Daniel Fischer, 1 * Leszek Rychlewski, 2 Roland L. Dunbrack,

More information

Structure Prediction. Modelado de proteínas por homología GPSRYIV. Master Dianas Terapéuticas en Señalización Celular: Investigación y Desarrollo

Structure Prediction. Modelado de proteínas por homología GPSRYIV. Master Dianas Terapéuticas en Señalización Celular: Investigación y Desarrollo Master Dianas Terapéuticas en Señalización Celular: Investigación y Desarrollo Modelado de proteínas por homología Federico Gago (federico.gago@uah.es) Departamento de Farmacología Structure Prediction

More information

proteins CASP PROGRESS REPORTS CASP8 results in context of previous experiments Andriy Kryshtafovych, 1 * Krzysztof Fidelis, 1 and John Moult 2

proteins CASP PROGRESS REPORTS CASP8 results in context of previous experiments Andriy Kryshtafovych, 1 * Krzysztof Fidelis, 1 and John Moult 2 proteins STRUCTURE O FUNCTION O BIOINFORMATICS CASP PROGRESS REPORTS CASP8 results in context of previous experiments Andriy Kryshtafovych, 1 * Krzysztof Fidelis, 1 and John Moult 2 1 Genome Center, University

More information

An insilico Approach: Homology Modelling and Characterization of HSP90 alpha Sangeeta Supehia

An insilico Approach: Homology Modelling and Characterization of HSP90 alpha Sangeeta Supehia 259 Journal of Pharmaceutical, Chemical and Biological Sciences ISSN: 2348-7658 Impact Factor (SJIF): 2.092 December 2014-February 2015; 2(4):259-264 Available online at http://www.jpcbs.info Online published

More information

Prediction of Protein Structure by Emphasizing Local Side- Chain/Backbone Interactions in Ensembles of Turn

Prediction of Protein Structure by Emphasizing Local Side- Chain/Backbone Interactions in Ensembles of Turn PROTEINS: Structure, Function, and Genetics 53:486 490 (2003) Prediction of Protein Structure by Emphasizing Local Side- Chain/Backbone Interactions in Ensembles of Turn Fragments Qiaojun Fang and David

More information

CFSSP: Chou and Fasman Secondary Structure Prediction server

CFSSP: Chou and Fasman Secondary Structure Prediction server Wide Spectrum, Vol. 1, No. 9, (2013) pp 15-19 CFSSP: Chou and Fasman Secondary Structure Prediction server T. Ashok Kumar Department of Bioinformatics, Noorul Islam College of Arts and Science, Kumaracoil

More information

Motif Discovery from Large Number of Sequences: a Case Study with Disease Resistance Genes in Arabidopsis thaliana

Motif Discovery from Large Number of Sequences: a Case Study with Disease Resistance Genes in Arabidopsis thaliana Motif Discovery from Large Number of Sequences: a Case Study with Disease Resistance Genes in Arabidopsis thaliana Irfan Gunduz, Sihui Zhao, Mehmet Dalkilic and Sun Kim Indiana University, School of Informatics

More information

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen

Homology Modelling. Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Homology Modelling Thomas Holberg Blicher NNF Center for Protein Research University of Copenhagen Why are Protein Structures so Interesting? They provide a detailed picture of interesting biological features,

More information

Critical Assessment of Methods of Protein Structure Prediction (CASP)-Round V

Critical Assessment of Methods of Protein Structure Prediction (CASP)-Round V PROTEINS: Structure, Function, and Genetics 53:334 339 (2003) Critical Assessment of Methods of Protein Structure Prediction (CASP)-Round V John Moult, 1 Krzysztof Fidelis, 2 Adam Zemla, 2 and Tim Hubbard

More information

A Protein Secondary Structure Prediction Method Based on BP Neural Network Ru-xi YIN, Li-zhen LIU*, Wei SONG, Xin-lei ZHAO and Chao DU

A Protein Secondary Structure Prediction Method Based on BP Neural Network Ru-xi YIN, Li-zhen LIU*, Wei SONG, Xin-lei ZHAO and Chao DU 2017 2nd International Conference on Artificial Intelligence: Techniques and Applications (AITA 2017 ISBN: 978-1-60595-491-2 A Protein Secondary Structure Prediction Method Based on BP Neural Network Ru-xi

More information

Structural bioinformatics

Structural bioinformatics Structural bioinformatics Why structures? The representation of the molecules in 3D is more informative New properties of the molecules are revealed, which can not be detected by sequences Eran Eyal Plant

More information

Molecular Modeling Lecture 8. Local structure Database search Multiple alignment Automated homology modeling

Molecular Modeling Lecture 8. Local structure Database search Multiple alignment Automated homology modeling Molecular Modeling 2018 -- Lecture 8 Local structure Database search Multiple alignment Automated homology modeling An exception to the no-insertions-in-helix rule Actual structures (myosin)! prolines

More information

Textbook Reading Guidelines

Textbook Reading Guidelines Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum Last updated: May 1, 2009 Textbook Reading Guidelines Preface: Read the whole preface, and especially: For the students with Life Science

More information

Homology Modeling of the Chimeric Human Sweet Taste Receptors Using Multi Templates

Homology Modeling of the Chimeric Human Sweet Taste Receptors Using Multi Templates 2013 International Conference on Food and Agricultural Sciences IPCBEE vol.55 (2013) (2013) IACSIT Press, Singapore DOI: 10.7763/IPCBEE. 2013. V55. 1 Homology Modeling of the Chimeric Human Sweet Taste

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics Changhui (Charles) Yan Old Main 401 F http://www.cs.usu.edu www.cs.usu.edu/~cyan 1 How Old Is The Discipline? "The term bioinformatics is a relatively recent invention, not

More information

RECURSIVE PROTEIN MODELING: A DIVIDE AND CONQUER STRATEGY FOR PROTEIN STRUCTURE PREDICTION AND ITS CASE STUDY IN CASP9

RECURSIVE PROTEIN MODELING: A DIVIDE AND CONQUER STRATEGY FOR PROTEIN STRUCTURE PREDICTION AND ITS CASE STUDY IN CASP9 Journal of Bioinformatics and Computational Biology Vol. 10, No. 3 (2012) 1242003 (18 pages) #.c The Authors DOI: 10.1142/S0219720012420036 RECURSIVE PROTEIN MODELING: A DIVIDE AND CONQUER STRATEGY FOR

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics http://1.51.212.243/bioinfo.html Dr. rer. nat. Jing Gong Cancer Research Center School of Medicine, Shandong University 2011.10.19 1 Chapter 4 Structure 2 Protein Structure

More information

Pacific Symposium on Biocomputing 5: (2000)

Pacific Symposium on Biocomputing 5: (2000) HOW UNIVERSAL ARE FOLD RECOGNITION PARAMETERS. A COMPREHENSIVE STUDY OF ALIGNMENT AND SCORING FUNCTION PARAMETERS INFLUENCE ON RECOGNITION OF DISTANT FOLDS. KRZYSZTOF A. OLSZEWSKI Molecular Simulations

More information

CONSENSUS CONTACT PREDICTION BY LINEAR PROGRAMMING

CONSENSUS CONTACT PREDICTION BY LINEAR PROGRAMMING 323 CONSENSUS CONTACT PREDICTION BY LINEAR PROGRAMMING Xin Gao 1, Dongbo Bu 1,2, Shuai Cheng Li 1,MingLi 1, and Jinbo Xu 3 1 David R. Cheriton School of Computer Science University of Waterloo, Waterloo

More information

Critical Assessment of Methods of Protein Structure Prediction (CASP) Round 6

Critical Assessment of Methods of Protein Structure Prediction (CASP) Round 6 PROTEINS: Structure, Function, and Bioinformatics Suppl 7:3 7 (2005) Critical Assessment of Methods of Protein Structure Prediction (CASP) Round 6 John Moult, 1 * Krzysztof Fidelis, 2 Burkhard Rost, 4

More information

Distributions of Beta Sheets in Proteins with Application to Structure Prediction

Distributions of Beta Sheets in Proteins with Application to Structure Prediction Distributions of Beta Sheets in Proteins with Application to Structure Prediction Ingo Ruczinski Department of Biostatistics Johns Hopkins University Email: ingo@jhu.edu http://biostat.jhsph.edu/ iruczins

More information

BMB/Bi/Ch 170 Fall 2017 Problem Set 1: Proteins I

BMB/Bi/Ch 170 Fall 2017 Problem Set 1: Proteins I BMB/Bi/Ch 170 Fall 2017 Problem Set 1: Proteins I Please use ray-tracing feature for all the images you are submitting. Use either the Ray button on the right side of the command window in PyMOL or variations

More information

Improving Protein Structure Prediction Using Multiple Sequence-Based Contact Predictions

Improving Protein Structure Prediction Using Multiple Sequence-Based Contact Predictions Article Improving Protein Structure Prediction Using Multiple Sequence-Based Contact Predictions Sitao Wu, 3,4 Andras Szilagyi, 3,5 and Yang Zhang 1,2,3, * 1 Center for Computational Medicine and Bioinformatics

More information

proteins Structure prediction for CASP7 targets using extensive all-atom refinement with

proteins Structure prediction for CASP7 targets using extensive all-atom refinement with J_ID: Z7E Customer A_ID: 21636 Cadmus Art: PRT 21636 Date: 14-AUGUST-07 Stage: I Page: 1 proteins STRUCTURE FUNCTIN BIINFRMATICS Structure prediction for CASP7 targets using extensive all-atom refinement

More information

A Novel Splice Site Prediction Method using Support Vector Machine

A Novel Splice Site Prediction Method using Support Vector Machine Journal of Computational Information Systems 9: 20 (2013) 8053 8060 Available at http://www.jofcis.com A Novel Splice Site Prediction Method using Support Vector Machine Dan WEI 1,2, Huiling ZHANG 2, Yanjie

More information

Gene Prediction Chengwei Luo, Amanda McCook, Nadeem Bulsara, Phillip Lee, Neha Gupta, and Divya Anjan Kumar

Gene Prediction Chengwei Luo, Amanda McCook, Nadeem Bulsara, Phillip Lee, Neha Gupta, and Divya Anjan Kumar Gene Prediction Chengwei Luo, Amanda McCook, Nadeem Bulsara, Phillip Lee, Neha Gupta, and Divya Anjan Kumar Gene Prediction Introduction Protein-coding gene prediction RNA gene prediction Modification

More information

Klinisk kemisk diagnostik BIOINFORMATICS

Klinisk kemisk diagnostik BIOINFORMATICS Klinisk kemisk diagnostik - 2017 BIOINFORMATICS What is bioinformatics? Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological,

More information

Practical lessons from protein structure prediction

Practical lessons from protein structure prediction 1874 1891 Nucleic Acids Research, 2005, Vol. 33, No. 6 doi:10.1093/nar/gki327 SURVY AND SUMMARY Practical lessons from protein structure prediction Krzysztof Ginalski 1,2,3, Nick V. Grishin 3,4, Adam Godzik

More information

The FALC-Loop web server for protein loop modeling

The FALC-Loop web server for protein loop modeling W210 W214 Nucleic Acids Research, 2011, Vol. 39, Web Server issue Published online 16 May 2011 doi:10.1093/nar/gkr352 The FALC-Loop web server for protein loop modeling Junsu Ko 1, Dongseon Lee 1, Hahnbeom

More information

Accurate template-based modeling in CASP12 using the IntFOLD4-TS, ModFOLD6, and ReFOLD methods

Accurate template-based modeling in CASP12 using the IntFOLD4-TS, ModFOLD6, and ReFOLD methods Received: 31 May 2017 Revised: 12 July 2017 Accepted: 25 July 2017 DOI: 10.1002/prot.25360 RESEARCH ARTICLE Accurate template-based modeling in CASP12 using the IntFOLD4-TS, ModFOLD6, and ReFOLD methods

More information

proteins NEW FOLDS: ASSESSMENT Assessment of CASP8 structure predictions for template free targets

proteins NEW FOLDS: ASSESSMENT Assessment of CASP8 structure predictions for template free targets proteins STRUCTURE O FUNCTION O BIOINFORMATICS NEW FOLDS: ASSESSMENT Assessment of CASP8 structure predictions for template free targets Moshe Ben-David, 1 Orly Noivirt-Brik, 1 Aviv Paz, 1 Jaime Prilusky,

More information

Bioinformation by Biomedical Informatics Publishing Group

Bioinformation by Biomedical Informatics Publishing Group Algorithm to find distant repeats in a single protein sequence Nirjhar Banerjee 1, Rangarajan Sarani 1, Chellamuthu Vasuki Ranjani 1, Govindaraj Sowmiya 1, Daliah Michael 1, Narayanasamy Balakrishnan 2,

More information

Supplement for the manuscript entitled

Supplement for the manuscript entitled Supplement for the manuscript entitled Prediction and Analysis of Nucleotide Binding Residues Using Sequence and Sequence-derived Structural Descriptors by Ke Chen, Marcin Mizianty and Lukasz Kurgan FEATURE-BASED

More information

Correcting Sampling Bias in Structural Genomics through Iterative Selection of Underrepresented Targets

Correcting Sampling Bias in Structural Genomics through Iterative Selection of Underrepresented Targets Correcting Sampling Bias in Structural Genomics through Iterative Selection of Underrepresented Targets Kang Peng Slobodan Vucetic Zoran Obradovic Abstract In this study we proposed an iterative procedure

More information

Bioinformatics Tools. Stuart M. Brown, Ph.D Dept of Cell Biology NYU School of Medicine

Bioinformatics Tools. Stuart M. Brown, Ph.D Dept of Cell Biology NYU School of Medicine Bioinformatics Tools Stuart M. Brown, Ph.D Dept of Cell Biology NYU School of Medicine Bioinformatics Tools Stuart M. Brown, Ph.D Dept of Cell Biology NYU School of Medicine Overview This lecture will

More information

BIOINFORMATICS Introduction

BIOINFORMATICS Introduction BIOINFORMATICS Introduction Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a 1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu What is Bioinformatics? (Molecular) Bio -informatics One idea

More information

Designing Filters for Fast Protein and RNA Annotation. Yanni Sun Dept. of Computer Science and Engineering Advisor: Jeremy Buhler

Designing Filters for Fast Protein and RNA Annotation. Yanni Sun Dept. of Computer Science and Engineering Advisor: Jeremy Buhler Designing Filters for Fast Protein and RNA Annotation Yanni Sun Dept. of Computer Science and Engineering Advisor: Jeremy Buhler 1 Outline Background on sequence annotation Protein annotation acceleration

More information

Giri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748

Giri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748 CAP 5510: Introduction to Bioinformatics Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs07.html 2/19/07 CAP5510 1 HMM for Sequence Alignment 2/19/07 CAP5510 2

More information

Introduction. CS482/682 Computational Techniques in Biological Sequence Analysis

Introduction. CS482/682 Computational Techniques in Biological Sequence Analysis Introduction CS482/682 Computational Techniques in Biological Sequence Analysis Outline Course logistics A few example problems Course staff Instructor: Bin Ma (DC 3345, http://www.cs.uwaterloo.ca/~binma)

More information

Molecular Structures

Molecular Structures Molecular Structures 1 Molecular structures 2 Why is it important? Answers to scientific questions such as: What does the structure of protein X look like? Can we predict the binding of molecule X to Y?

More information