doi: /j.jmb J. Mol. Biol. (2006) 357,


Efficient Restraints for Protein-Protein Docking by Comparison of Observed Amino Acid Substitution Patterns with those Predicted from Local Environment

Vijayalakshmi Chelliah, Tom L. Blundell and Juan Fernández-Recio*

Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK

*Corresponding author

The discovery that the functions of most eukaryotic gene products are mediated through multi-protein complexes makes the prediction of protein interactions one of the most important current challenges in structural biology. Rigid-body docking methods can generate a large number of alternative candidates, but it is difficult to discriminate the near-native interactions from the large number of false positives. Many different scoring functions have been developed for this purpose, but in most cases, experimental and biological information is still required for accurate predictions. We explore here the use of evolutionary restraints in evaluating rigid-body docking geometries. To locate potential interface residues, we identify functional residues by comparing observed amino acid substitutions with those predicted from local environment. The interface residues identified by this method are correctly located in 85% of the cases. These predicted interface residues are used to define distance restraints that help to score rigid-body docking solutions. We have developed the pydockrst software, which uses the percentage of satisfied distance restraints, together with the electrostatics and desolvation binding energy, to identify correct docking orientations. This methodology dramatically improves the docking results when compared to the use of energy criteria alone, and is able to find the correct orientation within the top 20 docking solutions in 80% of the cases.

© 2006 Elsevier Ltd. All rights reserved.
Keywords: protein-protein docking; distance restraints; environment-specific substitution tables

Present address: Dr J. Fernandez-Recio, Molecular Modelling and Bioinformatics, Parc Cientific de Barcelona, C/Joseph Samitier 1-5, Barcelona, Spain.

Abbreviations used: ESST, environment-specific substitution table; AIR, ambiguous interaction restraint; FGF, fibroblast growth factor; FGFR, fibroblast growth factor receptor; CAPRI, critical assessment of predicted interaction(s); ASA, accessible surface area.

E-mail address of the corresponding author: juan@mmb.pcb.ub.es

Introduction

Most biological processes, such as signal transduction, gene expression control, enzyme inhibition, antibody-antigen recognition and even the assembly of multi-domain proteins, involve the formation of specific multi-protein complexes. In spite of their biological and therapeutical importance, only slow progress has been made in defining the structures of such multi-protein assemblies. Indeed, a detailed knowledge of their structures at atomic level by X-ray crystallography or NMR proves to be challenging in most cases. As a consequence, the number of computer tools to generate such assemblies by docking the structures of individual subunits is growing fast. 1-7

Although the ultimate goal is fully automatic docking prediction, current approaches generate a large number of alternative docked solutions (false positives), among which the near-native structures are difficult to discriminate. That is why, when analysing the results of a docking run, it is often necessary to include all available biological information and mutational data about the binding mode, which helps to detect the near-native solutions with more accuracy. Indeed, in the recent CAPRI experiment (critical assessment of predicted interactions), many of the correct predictions were made as a result of the inclusion of available biochemical and mutational data. In most cases, the inclusion of available information to sort, select or filter docking solutions is done by hand, quite often after a visual analysis of the docking solutions, on a case-by-case basis.

The program HADDOCK (high ambiguity driven docking), developed by Dominguez and co-workers, 8 has recently been reported to make automatic use of biochemical and/or biophysical interaction data, such as chemical shift perturbation data obtained from NMR titration experiments or mutagenesis, to drive the docking predictions. In this approach, the information on the interacting residues is introduced as ambiguous interaction restraints (AIRs), classifying such residues into two types: active and passive. For instance, when using NMR titration data, the authors of HADDOCK defined the active residues as all solvent-accessible residues showing a significant chemical shift perturbation upon complex formation. The passive residues corresponded to the solvent-accessible residues that showed a less significant chemical shift perturbation and/or that were surface neighbours of the active residues. The total energy function used to calculate the structures was a sum of electrostatic, van der Waals, and AIR energy terms. This approach has been shown to integrate optimally restraints from experimental data in docking predictions, as the performance of HADDOCK in the CAPRI competition confirms.

However, biochemical or mutational information about protein-protein interactions is not always available, and in any case, it is difficult to apply at a large, proteomics scale. An additional source of information comes from sequence conservation that can be derived from multiple sequence alignments, based on the assumption that interface residues will be more conserved than the rest of the surface residues.
In general, it is difficult to identify the reasons for residue conservation, and in particular, to assess which residues are conserved because they are part of a functional interface. For instance, Mirny and Shakhnovich 9 performed an analysis on the molecular evolution of five of the most populated protein folds (immunoglobulin fold; oligonucleotide-binding fold; Rossmann fold; alpha/beta plait; and TIM barrels) in order to distinguish between functional and structural reasons for amino acid conservation. Moreover, several groups have reported the identification of interface residues from sequence and structural conservation in certain protein families. 10,11 These predicted functional sites have been used to filter the solutions generated by protein-protein docking. 12 However, such a protocol was applied only to protease-inhibitor protein-protein complexes, where the enzyme active site, which is more likely to be detected from sequence conservation analysis, is located in the protein-protein interface. Actually, a recent analysis in a larger and more varied set of protein-protein interfaces suggested that interface conservation is not sufficiently different from other surface patches to allow prediction of the interface by conservation alone. 13 The identification of interface residues from evolutionary information in non-enzyme-inhibitor protein-protein interactions is thus a challenging task.

The degree of conservation of amino acid residues has been shown to be strongly dependent on the environment in which they occur in the folded protein, and substitution tables that give the likely replacements of amino acids in particular local environments have been derived. 14,15 These environment-specific substitution tables (ESSTs) have been recently used to develop Crescendo, a method to distinguish restraints placed on substitutions due to protein structure from restraints deriving from functions mediated by interactions with other molecules. 16 For each position, a divergence score was defined by the difference between the predictions from the environment-specific substitution tables and the overall amino acid substitution pattern. The clusters of high scoring alignment positions apparently subjected to these additional restraints in evolution correlated well with the functional sites in proteins defined by experimental methods. Crescendo was able to identify functional sites in a set of well-characterised protein families.

We now report the application of this functional site prediction method, Crescendo, to the detection of protein-protein interaction sites involved in mediating functions and to the selection of the correct docking solution from the many produced using physical-chemical parameters. The residues predicted in this way to be functional are used to define distance restraints in order to score protein-protein orientations generated by rigid-body docking. The restraints imposed by evolutionary information are highly ambiguous, i.e. they indicate if a given residue is likely to be in the interface, but there is no information on the specific matching residues in the partner molecule. These types of restraints are similar to those used by HADDOCK, and we initially tested some of the Crescendo functional site predictions with that program. However, the restraints that can be imposed from evolutionary information are much less accurate in nature, so there is always the danger that incorrect restraints could guide the generation of incorrect docking conformations. In addition, the costly energy minimization with all the distance restraints makes the procedure computationally expensive.

A more efficient approach that exploits the pydockrst program is described here. First, rigid-body docking solutions are generated by FFT-based docking and then evaluated by the binding energy optimised for rigid-body docking landscapes.
We demonstrate the value of a complete scoring function, which includes an additional pseudoenergy term defined by the percentage of satisfaction of the restraints imposed by Crescendo. Our results depend on the quality of the solutions found by the FFT-based docking search, but the distance restraints imposed by the evolutionary information clearly help to discriminate the near-native docking solutions, especially in those cases with a significant number of false positives caused by the limitations of the rigid-body approach or by a poor energy description.

Results and Discussion

Prediction of interface residues using environment-specific substitution tables: comparison to known binding sites

We have used Crescendo, the functional site prediction method of Chelliah et al., 16 in order to identify the residues likely to be involved in a protein-protein interaction. This approach uses residue conservation to predict binding sites but has the advantage of distinguishing those amino acid restraints that result from retention of structure alone in divergent evolving orthologous families from those restraints resulting from functional interactions. For this benchmark, we tried to generate a collection of hetero-complexes as varied as possible, with different examples from every category: protease-inhibitor, cell cycle/signal transduction, hormone-receptor, etc. We did not include antibody-antigen complexes for reasons discussed later.

From the protein-protein complex sets both available in the literature 4,17,18 and also previously collected by us for internal analyses, we selected those complexes that passed the following criteria in order to run the Crescendo method. Each case was required to have a sufficient number of homologous sequences, and such sequences needed to be quite divergent (i.e. less than 80% identity), since the non-functional residues can also be conserved in the short term during evolution. Therefore, the number of sequences that were required for the method to run efficiently depended on the divergence of the sequences within the family. In our test set, the number of included sequences ranged from four in EPO receptor or ten in Raf to 128 in RGS or 135 in RhoGAP.
For instance, we excluded from our analysis the cyclophilin-HIV capsid complex (PDB code 1ak4) because, in spite of obtaining 78 homologous sequences for HIV capsid protein, only two of such sequences were less than 80% identical. The major challenge of the Crescendo method with respect to protein-protein interactions is to choose those proteins that are true orthologues, i.e. carry out an identical function in different organisms. Paralogues by definition will have evolved by gene duplication to carry out parallel functions, and therefore will have different binding partners. A careful analysis of the sequences collected by Blast may reveal the inclusion of obvious paralogues that can be removed. However, as the functions of the proteins whose sequences were collected are often not known, we have investigated their phylogeny by generating phylogenetic trees using the neighbour-joining method as implemented in TraceSuiteII, 19 and used the subgroup branch of the tree that has the largest number of sequences. In addition, most of the chosen families have crystal structures of the complex and of the unbound forms of the subunits. Finally, we did not consider those cases with known multiple active sites, since our method currently focuses only on the largest site. For this reason, we excluded the actin-profilin complex (PDB code 2btf), in which the functional site predictions for actin would be mainly located in the ATP/ADP binding site. The predictions in actin would also be affected by the existence of other known binding sites to gelsolin (PDB code 1eqy, 1h1v), deoxyribonuclease I (PDB code 1atn), or tetramethylrhodamine-5-maleimide (PDB code 1j6z).

Table 1 shows the binding site predictions for the selected benchmark set of proteins and their success rates when compared to the known binding sites. The predicted interaction sites were initially formed by the largest cluster of residues with high divergence score values.
As can be seen in Table 1 (column 1 of Success (%)), in 60% of the cases the predicted sites were correctly located (that is, more than 50% of the predicted residues were in the interface). Interestingly, when we considered only those residues within the largest cluster and with positive Z-score values (column 2), the success rate increased (predicted sites correctly located in 70% of the cases). When we considered only the solvent-exposed residues (relative accessible surface area, ASA ≥ 7%) within the largest cluster (column 3), the success rate was also better than the original predictions (correct predictions in 80% of the cases). Finally, when we considered only those residues that satisfied the three criteria (largest cluster, positive Z-scores, and solvent-accessible), the results improved further: the binding sites were correctly located in 85% of the cases (column 4). The last column of Table 1 (% coverage) shows the percentage of real interface residues that are predicted by the residues that satisfied the three criteria (largest cluster, positive Z-scores, and solvent-accessible).

Only in one case, the chymotrypsin inhibitor, did the predicted functional site fail completely. This is not unexpected, as protease inhibitors have evolved to achieve highly specific binding and so there are few proteins that have evolved under the restraint of binding to a particular orthologue. Indeed, the interfaces of protease inhibitors are often characterized by hypervariability of amino acids, and some inhibitors even show greater variability in the interface residues than in the non-interacting residues. 24 Therefore, we do not expect good conservation of the interface residues among the members of the family and so the evolutionary information may not be helpful in this case. On the other hand, the good predictions for chymotrypsin are explained because the inhibitor binds in the active site that has been selected for in evolution.
Other cases where the functional site predictions are expected to fail are the antibody and major histocompatibility complex (MHC) molecules of the immune system, which rely on a high mutation rate at their binding surfaces in order to interact with foreign antigens. Neither has evolved to bind a particular antigen. In addition, the functional site predictions will not work on the antigen proteins either, since they have not been subjected to evolutionary pressure in order to bind the molecules of the immune system. Poor predictive results might also arise from the existence of alternative binding sites, some of which will be detected by Crescendo. This is the case of FGF2 (41.7% of correctly predicted residues), which also binds heparin sulphate.

Table 1. Prediction of interface residues from the comparison of observed amino acid substitution patterns with those predicted from local environment

Columns: Name; PDB subunit; PDB reference complex; Success (%) a for (1) largest cluster c, (2) largest cluster and Z-score > 0.0 d, (3) largest cluster and ASA ≥ 7% e, (4) largest cluster and Z-score and ASA f; Coverage (%) b for largest cluster and Z-score and ASA f.
[The numerical success and coverage values are missing from the source.]

Name           PDB subunit   PDB reference complex
Rap            1gua_A        1gua
Raf            1c1y_B        1gua
Rho            1tx4_B        1tx
RhoGAP         1ow3          1tx
Ras            1wq1_R        1wq
RasGAP         1wer          1wq
Galpha         1agr_A        1agr
RGS            1agr_E        1agr
Piliassembly   1pdk_A        1pdk
Fimbrial       1pdk_B        1pdk
Gh             3hhr_A        3hhr
Ghr            1hgu          3hhr
EPO            1eer_C        1eer
EPO receptor   1buy          1eer
FGF1           1e0o_B        1e0o
FGFR2          2afg          1e0o
FGF2           1fq9_C        1fq
FGFR1          2fgf          1fq
Chymotrypsin   5cha          1cbw
BPTI           1bpi          1cbw

a Percentage of predicted residues that are correctly located in the known interface.
b Percentage of the known interface residues that are correctly predicted.
c Predicted residues as defined by Crescendo as those within the largest cluster.
d Residues within the largest cluster and positive Z-score of the divergence score values.
e Residues within the largest cluster and accessible (relative ASA ≥ 7%).
f Residues within the largest cluster that are both accessible and have positive Z-score.
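The filtering criteria and the two percentages reported in Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`id`, `in_largest_cluster`, `z_score`, `rel_asa`) and the function names are hypothetical and are not part of Crescendo; only the three criteria (largest cluster, positive Z-score, relative ASA of at least 7%) and the definitions of success and coverage are taken from the text and the table footnotes.

```python
def select_restraint_residues(residues, min_rel_asa=7.0):
    """Keep residues that meet the three criteria described in the text.

    `residues` is a list of dicts with hypothetical keys:
    id, in_largest_cluster (bool), z_score (float), rel_asa (percent).
    """
    return {
        r["id"]
        for r in residues
        if r["in_largest_cluster"]
        and r["z_score"] > 0.0
        and r["rel_asa"] >= min_rel_asa
    }


def success_and_coverage(predicted, interface):
    """Return (success %, coverage %) as defined in footnotes a and b.

    success  = predicted residues that lie in the known interface / predicted
    coverage = known interface residues that were predicted / interface
    """
    predicted, interface = set(predicted), set(interface)
    hits = predicted & interface
    success = 100.0 * len(hits) / len(predicted) if predicted else 0.0
    coverage = 100.0 * len(hits) / len(interface) if interface else 0.0
    return success, coverage


def site_correctly_located(success_percent):
    """A site counts as correctly located when more than 50% of the
    predicted residues are in the interface (criterion quoted in the text)."""
    return success_percent > 50.0
```

With four predicted residues of which two fall in an eight-residue interface, `success_and_coverage` gives 50% success and 25% coverage, so the site would not count as correctly located under the "more than 50%" criterion.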
Figure 1 shows the predicted interaction sites for the subunits of the FGF1-FGFR2, Rho-RhoGAP and Ras-RasGAP complexes. The largest contours of divergence score values for receptor and ligand molecules correspond quite well with the actual protein-protein interfaces in the X-ray complex structures. It is remarkable that for FGF1, in spite of having the heparin binding site quite conserved, the functional site prediction method detected the larger, more significant binding site to FGFR2.

Figure 1. Contour showing the functional site prediction for (a) FGFR2-FGF1 (PDB code 1e0o); (b) Rho-RhoGAP (PDB code 1tx4); and (c) Ras-RasGAP (PDB code 1wq1). The FGFR2, Rho and Ras receptor molecules are represented in green, with predictions shown in orange contour. The FGF1, RhoGAP and RasGAP ligand molecules are represented in purple, with predictions shown in blue contour.

Use of predicted interface residues as restraints to drive docking in HADDOCK

In order to evaluate the use of our functional site predictions in protein-protein docking, we have studied the interaction between the proteins FGF1 and FGFR2 with HADDOCK. 8 The coordinates of the unbound FGF1 (PDB code 2afg) and the bound FGFR2 (PDB code 1e0o) were used. Ambiguous interaction restraints (AIR) were introduced using the residues predicted as functional by Crescendo. 16 Active and passive residues were defined according to their Z-scores of the divergence score values. Solvent-accessible residues (relative ASA of side-chain ≥ 7%) within the largest cluster with positive Z-scores were considered as active residues and those with negative Z-scores were considered as passive. Thus, we selected nine active residues (E87, L89, E90, N92, Y94, N95, L131, L133, P134) and two passive residues (R88, H93) for FGF1; and 12 active residues (L166, H167, A168, V169, A171, V222, P223, D247, E250, R251, S252, H254) and three passive residues (P170, A172, V249) for FGFR2 (Figure 2).

The program HADDOCK, with the restraints derived from Crescendo, was able to find a reasonable solution with an RMSD value of 4.0 Å, which ranked 11 after the first rigid-body docking step and rose to rank 1 after the final water refinement step. This shows that Crescendo is able to detect those residues that are important for the FGF1/FGFR2 binding, and therefore can be used to guide the docking and obtain successful results with HADDOCK.

The use of functional site predictions by Crescendo as restraints for HADDOCK yielded excellent results: the near-native solution was successfully ranked 1. Whereas this shows that the restraints worked well for this case, we were aware that HADDOCK was actually optimised for accurately defined restraints (from NMR or from mutational data), so it is likely that any small inaccuracy in the initial restraints might drive the docking generation to incorrect binding modes. In addition, HADDOCK is computationally quite expensive for our purpose of evaluating the use of the functional site prediction method for docking in a significant variety of cases.
For these reasons, we have developed pydockrst, a faster approach for the evaluation of docking solutions with the restraints derived from functional site predictions.

Figure 2. Active and passive residues used in HADDOCK for the FGF1/FGFR2 interaction, as defined using the predicted functional residues. The receptor and ligands are in blue and yellow. The active and the passive residues of the receptor are coloured green and purple, respectively. The active and the passive residues of the ligand are coloured red and cyan, respectively.

Use of interaction restraints to score rigid-body docking solutions: pydockrst

pydockrst is a computer algorithm for scoring rigid-body docking solutions according to the percentage fulfilment of certain user-defined distance restraints. If all the restraints are satisfied, i.e. all restraint residues are in the 6 Å vicinity of the partner molecule, a restraint energy value of -100.0 kcal/mol is added to the total energy. If no restraint is satisfied at all, the restraint energy is 0.0 kcal/mol. Intermediate levels of restraint satisfaction correspond to proportional restraint energy values.

In order to evaluate the pydockrst protocol, we thought it would be interesting to compare its results with those of HADDOCK. Thus, we have applied pydockrst to the three examples used for HADDOCK benchmarking, using the same restraint residues. 8 For the rigid-body docking generation, we used two different FFT-based programs: ZDOCK and FTDOCK. 26 For each complex, the two sets of rigid-body docking solutions (total number of 2000 and 10,000, respectively) were evaluated with electrostatics and desolvation energy, as can be seen in Figures 3 and 4. The restraint residues were then used to establish a restraint energy value for all the docking solutions, and finally, the total scoring of the docking solutions included both the energy and the restraint values.
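The restraint term described above (0.0 kcal/mol when no restraint is satisfied, -100.0 kcal/mol when all are, proportional in between) can be illustrated with a minimal Python sketch. It is not the pydockrst implementation: the function names and the coordinate layout are hypothetical, and the assumption that a residue is "in the 6 Å vicinity" when any of its atoms lies within 6 Å of any partner atom is ours, since the text does not specify the atom-level criterion.

```python
import math


def restraint_energy(restraint_residues, partner_atoms,
                     cutoff=6.0, full_reward=-100.0):
    """Pseudo-energy (kcal/mol) proportional to the fraction of satisfied restraints.

    restraint_residues: dict mapping a residue label to a list of (x, y, z) atoms
    partner_atoms:      list of (x, y, z) atoms of the partner molecule
    A restraint is taken as satisfied when any atom of the residue is within
    `cutoff` Å of any partner atom (an assumption of this sketch).
    """
    if not restraint_residues:
        return 0.0
    satisfied = sum(
        1
        for atoms in restraint_residues.values()
        if any(math.dist(a, p) <= cutoff for a in atoms for p in partner_atoms)
    )
    return full_reward * satisfied / len(restraint_residues)


def total_score(binding_energy, e_restraint):
    """Total score: binding energy (electrostatics + desolvation) plus restraint term."""
    return binding_energy + e_restraint


# Usage with toy coordinates: one of two restraint residues touches the partner.
residues = {"E87": [(0.0, 0.0, 0.0)], "L133": [(20.0, 0.0, 0.0)]}
partner = [(3.0, 0.0, 0.0)]
e_rst = restraint_energy(residues, partner)  # half of the restraints satisfied
score = total_score(-12.5, e_rst)
```

In this toy case half of the restraints are satisfied, so the restraint term is -50.0 kcal/mol and is simply added to the binding energy of the pose.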
The results of pydockrst are similar to those reported for HADDOCK for the same complexes. 8 For the EIN-HPr complex, pydockrst finds a near-native solution as the lowest-energy conformation (i.e. rank 1) both using ZDOCK and FTDOCK (Figures 3(a) and 4(a)). The near-native solutions found by these two methods were very similar (4.7 Å and 4.6 Å RMSD, respectively). The use of interaction restraint residues helped to improve the funnel landscape, but it did not make any difference in the rank of the near-native solution found by the two methods, as the binding energy values for this solution were excellent. It is interesting, however, to note that the use of restraints improved the ranking of the complex solution that was artificially added for comparison purposes. In that case, the energy value was slightly worse, probably because of some clashing of the side-chains of the unbound subunits when they are optimally superimposed onto the complex structure. The use of restraints helped to compensate for the poor energy value of this optimal, artificial docking solution.

Figure 3. Rigid-body docking energy landscapes for the complexes (a) EIN/HPr, (b) E2A/HPr and (c) gp120/CD4. The 2000 docking solutions generated by ZDOCK2.1 are rescored by (i) pydockser energy, equation (2) (top diagrams); (ii) restraints from NMR data, equation (3) (middle diagrams); and (iii) pydockser energy plus restraints, computed by pydockrst (bottom diagrams). The native orientation, generated by optimally superimposing the docking receptor and ligand molecules onto the X-ray structure of the native complex, has been included for informative purposes (open circle).

In the case of the E2A-HPr complex, the overall landscape of the docking solutions is good, but ZDOCK fails to find correct structures (Figure 3(b)). Fortunately, the artificially added complex conformation is ranked 1 after the inclusion of the restraint residues (Figure 3(b)). On the other hand, we observed that although FTDOCK generates a few more near-native solutions that have similar total scoring value to the optimal complex structure, it also generates some false positives, so that the near-native solution (4.1 Å) ranks 4. Finally, in the case of the gp120-CD4 complex, both docking methods generate a number of near-native solutions, which are ranked 1 after total scoring (energy+restraints), as can be seen in Figures 3(c) and 4(c).
Interestingly, the use of a few restraint residues from mutational data helped to bring the near-native solution to rank 1 for this complex.

Use of predicted interface residues as restraints to score docking solutions with pydockrst

Having checked the performance of pydockrst when using interaction restraints from NMR and mutational data, we proceeded to evaluate the use of restraints from the predicted interaction residues obtained by Crescendo. We used FTDOCK as the rigid-body docking generator, as it provides a larger number of docking solutions, and it can be easily integrated as part of the pydock distribution. The dramatic effect of using restraints derived from the predicted interface residues on the scoring of the rigid-body docking landscapes can be checked in Table 2. For each complex, the rank of the near-native solution before (column headed Rank by energy) and after introducing the restraints (column headed Crescendo) is shown. In most of the cases, the drop in rank of the near-native solution is striking.

Figure 4. Rigid-body docking energy landscapes for the complexes (a) EIN/HPr, (b) E2A/HPr and (c) gp120/CD4. The 10,000 docking solutions generated by FTDOCK are rescored by (i) pydockser energy, equation (2) (top diagrams); (ii) restraints from NMR data, equation (3) (middle diagrams); and (iii) pydockser energy, equation (2) plus restraints, computed by pydockrst (bottom diagrams). A native orientation, generated by optimally superimposing the receptor and ligand docking molecules onto the X-ray structure of the native complex, has been included for informative purposes (open circle).

Table 2. Results from rigid-body docking and scoring by energy and distance restraints derived from sequence conservation predictions

Columns: Complex (Name, PDB); PDB files (Receptor, Ligand); Near-native docking solution (RMSD b, Rank by energy c); Rank by energy+restraints a (Crescendo).
[The numerical RMSD and rank values are missing from the source.]

Name                      PDB    Receptor   Ligand
Rap-Raf                   1gua   1guaA      1c1yB
Rho-RhoGAP                1tx4   1tx4B      1ow
Ras-RasGAP                1wq1   1wq1R      1wer
Galpha-RGS                1agr   1agrA      1agrE
Piliassembly-Fimbrial     1pdk   1pdkA      1pdkB
Gh-ghr                    3hhr   3hhrA      1hgu
Epo-epor                  1eer   1eerC      1buy
Fgf1-fgfr2                1e0o   1e0oB      2afg
Fgf2-fgfr1                1fq9   1fq9C      2fgf
Chymotrypsin-BPTI         1cbw   5cha       1bpi

a Rank of the best near-native docking solution after scoring by energy+restraints. Restraint residues as defined by different criteria: Crescendo, automatically defined by Crescendo (largest cluster + accessible + Z-score > 0); set 1, Crescendo-defined residues that are located at the real interface; set 2, all the real interface residues; set 3, 50% of the real interface residues; set 4, 10% of the real interface residues; set 5, 50% of the real interface residues + same number of non-interface residues; set 6, 10% of the real interface residues + same number of non-interface residues; set 7, 10% of the real interface residues + ninefold number of non-interface residues.
b RMSD of the ligand Cα atoms of the best near-native docking solution with respect to the known complex structure, after superimposing the coordinates of the receptor molecule onto the known complex structure.
c Rank of the best near-native docking solution after scoring by electrostatics+desolvation energy alone.

In some cases, FTDOCK was not able to find a rigid-body docking solution close enough to the native complex (for instance, less than 10 Å RMSD). This is a consequence of the current way of generating docking solutions based on FFT sampling. Actually, when we manually included the native orientation formed by superimposing the docking receptor and ligand molecules onto the complex X-ray structure (Table 3), such native orientation ranked much lower than the near-native docking solution in several cases (e.g. PDB codes 1pdk, 3hhr, 1eer, 1e0o), which suggests that a major limitation here is the sub-optimal sampling of the FFT-based method.

The overall performance of the approach is shown in Figure 5. The number of cases where a near-native solution is found within a certain number of predictions increases dramatically when the restraints are included. As can be seen in Figure 5(a), when the binding energy (electrostatics+desolvation) alone is considered, we find a near-native docking solution within the 50 lowest energy docking poses in only one out of the ten cases (10% success). However, when the restraints from Crescendo are included, we find a near-native docking solution within the 50 lowest scoring docking solutions in five out of ten cases (50% success). As we have previously discussed, FTDOCK did not generate correct near-native geometries for some cases. Thus, if we add the known native orientation to the pool of docking solutions and they are scored again by the binding energy (electrostatics+desolvation), we find a near-native or native orientation within the 20 lowest energy docking solutions in only one out of the ten cases (10% success), as can be seen in Figure 5(b).
However, if we include the restraints from Crescendo in the scoring function, we find a near-native or native orientation within the 20 lowest scoring docking solutions in as many as eight out of the ten cases (80% success). We have also computed the total number of near-native docking orientations found within a certain number of low-energy docking solutions, as a way to evaluate the docking landscapes (Figure 5(c)). As can be seen, the introduction of restraints allows discrimination of a much larger number of near-native docking solutions. We explored the question of whether the quality of the restraints affects to the docking results. For the different cases, the accuracy of the restraint residues (percentage of those that are correctly located at the interface; Table 1, column Largest cluster and Z-score and ASA under Success (5)) shows very little correlation with the rank of the near-native solution (Table 2, column headed Crescendo). The coverage of the real interface by the restraint residues (Table 1, column Coverage (%)) also shows no correlation with the docking results. Except for BPTI, accuracy values range from 41.7 to 100.0, and coverage from 13.5 to It is possible that within this range of values, the restraints are good enough to get quasi-optimal results for most of the cases, being thus the small differences in the ranking of the near-native solution ultimately defined by many other factors Table 3. 
Results for docking and restraint energy scoring when the native docking orientation is included in the docking set Complex PDB files Native docking orientation Rank by energycrestraints a Name PDB Receptor Ligand RMSD b energy c Rank by Crescendo Rap Raf 1gua 1guaA 1c1yB Rho RhoGAP 1tx4 1tx4B 1ow Ras RasGAP 1wq1 1wq1R 1wer Galpha RGS 1agr 1agrA 1agrE Piliassembly 1pdk 1pdkA 1pdkB Fimbrial Gh ghr 3hhr 3hhrA 1hgu Epo epor 1eer 1eerC 1buy Fgf1 fgfr2 1e0o 1e0oB 2afg Fgf2 fgfr1 1fq9 1fq9C 2fgf Chymotrypsin BPTI 1cbw 5cha 1bpi a Rank of the native docking orientation after scoring by energycrestraints. Restraint residues as defined by different criteria: Crescendo, automatically defined by Crescendo (largest clustercaccessible Cz-score O0); set 1, Crescendo-defined residues that are located at real interface; set 2, all the real interface residues; set 3, 50% of the real interface residues; set 4, 10% of the real interface residues; set 5, 50% of the real interface residuescsame number of non-interface residues; set 6, 10% of real interface residuescsame number of non-interface residues; set 7, 10% real interfacecninefold number of non-interface residues. b RMSD of the ligand C a atoms of the native docking orientation with respect to the known complex structure, after superimposing the coordinates of the receptor molecule onto the known complex structure. The native docking orientation is manually generated by superimposing the docking receptor and ligand molecules onto the coordinates of the X-ray complex structure. c Rank of the native docking orientation after scoring by electrostaticscdesolvation energy alone. The native docking orientation, generated as described in the previous paragraph, is added to the pool of docking solutions and scored as the rest of them

(binding affinity, binding mechanism, quality of the unbound structures, unbound-bound deformation, etc.). Therefore, in order to assess the general effect of the quality of the restraints in docking, we used different restraint sets in which the percentage of real interface residues (accuracy) and their coverage of the real interface varied:

Restraint set 1 is formed by the residues defined by Crescendo and located at the real interface (accuracy, 100%; coverage varies for each case, listed in Table 1).
Restraint set 2 is formed by all the real interface residues (accuracy, 100%; coverage, 100%).
Restraint set 3 is formed by 50% of the real interface residues, randomly selected (accuracy, 100%; coverage, 50%).
Restraint set 4 is formed by 10% of the real interface residues, randomly selected (accuracy, 100%; coverage, 10%).
Restraint set 5 is formed by 50% of the real interface residues, randomly selected (as in set 3), plus the same number of residues randomly selected from those not located at the interface (accuracy, 50%; coverage, 50%).
Restraint set 6 is formed by 10% of the real interface residues, randomly selected (as in set 4), plus the same number of residues selected from those not located at the interface (accuracy, 50%; coverage, 10%).
Restraint set 7 is formed by 10% of the real interface residues, randomly selected (as in sets 4 and 6), plus a ninefold number of non-interface residues (accuracy, 10%; coverage, 10%).

Figure 5. The overall performance of the reported docking approach. (a) Number of cases where a near-native solution (<10 Å RMSD from the native structure) is found within the N lowest-energy predictions (N as defined on the abscissa). For three cases, the docking program FTDOCK did not find any solution less than 10 Å RMSD from the real complex structure. (b) This plot is similar to that of (a), but including the native orientation taken from the X-ray structure. (c) The total number of near-native solutions (<10 Å RMSD) found within the N lowest-energy predictions. In blue is shown the performance of the initial binding energy, without any restraints. In red is shown the performance of pydockrst after including the restraints obtained from Crescendo. The results of using different restraint sets are also shown (in parentheses are the accuracy/coverage values of the restraints, that is, the percentage of restraint residues correctly located at the real interface and the percentage of real binding residues covered by the restraint residues): in orange restraint set 1 (100/see Table 1); in black restraint set 2 (100/100); in yellow restraint set 3 (100/50); in grey restraint set 4 (100/10); in green restraint set 5 (50/50); in magenta restraint set 6 (50/10); in cyan restraint set 7 (10/10). See the main text for an explanation of the different restraint sets.

Table 2 and Figure 5 show the pydockrst results after applying the different restraint sets. As expected, if all interface residues are used as distance restraints in pydockrst, the docking results are excellent. This is not a real-life situation, but it shows that the method makes optimal use of the restraints. Interestingly, if we manage to get (from sequence or evolutionary information, or from experimental mutation data) a set of restraints where at least 50% of them are located at the interface, and that represents at least 10% of the binding site (the case of restraint set 6), the docking results obtained are much better than those obtained with no restraints at all. Moreover, with as few as 10% of the real binding-site residues, provided that there are no restraint residues outside the interface (the case of restraint set 4), the docking results are impressive. It seems that for success in pydockrst it is less important to use a large number of the interface residues as restraints (high coverage) than to avoid too many false-positive restraints that are not located in the real interface (low accuracy). In the case of the restraints defined by Crescendo, the coverage was always over 10%, and the accuracy over 50% in most of the cases, which explains the good docking results obtained. Our analysis indicates that in a real-case scenario, a few residues known to be at the interface from evolutionary methods like Crescendo, or from other experimental data (mutational, NMR, etc.), would be enough for obtaining optimal docking results with pydockrst.

We can illustrate the effect of including the restraints from Crescendo with two examples: Galpha–RGS and Rho–RhoGAP. As can be seen in Figure 6, the docking energy landscapes based solely on the binding energy (electrostatics + desolvation) are far from desirable: a large number of incorrect docking solutions (false positives) are found with lower energy than the near-native orientations. The use of distance restraints from the predicted binding residues (as implemented in pydockrst) dramatically improves the docking energy landscapes, which now become clearly funnel-shaped, favouring the near-native orientations over other docking solutions. As an example of the quality of the predictions, Figure 7 shows the lowest-scoring solution (i.e. rank 1) obtained by pydockrst for the two docking cases: Galpha–RGS and Rho–RhoGAP.

Figure 6. Rigid-body docking energy landscapes for (a) Galpha–RGS and (b) Rho–RhoGAP interaction. The 10,000 rigid-body docking solutions generated by FTDOCK are rescored by (i) the combined electrostatic and desolvation energy from pydockser (equation (2)); (ii) the restraint energy (equation (3)) derived from the functional site predictions; and (iii) the total energy (electrostatic + desolvation + restraint) calculated by pydockrst. A hypothetical optimal solution with native orientation, generated by superimposing the receptor and ligand docking molecules onto the X-ray structure of the native complex, has been included for informative purposes (open circle).
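The three scores compared in Figure 6, and the accuracy/coverage values used to describe a restraint set, can be illustrated with a minimal Python sketch. It is not the pydockrst code. It assumes the definitions given in Materials and Methods (a restraint is satisfied when any partner atom lies within 6 Å of the residue's centre of mass; restraint energy is −1.0 kcal/mol times the percentage of satisfied restraints; binding energy is 0.5 E_ele + E_des). All function names and the data layout (atoms as mass/coordinate tuples, residues as sets of identifiers) are hypothetical, and only the restraints on one molecule are checked against its partner.

```python
import math

CUTOFF_ANGSTROM = 6.0  # distance threshold quoted in Materials and Methods


def centre_of_mass(atoms):
    """atoms: list of (mass, (x, y, z)) tuples for a single residue."""
    total_mass = sum(mass for mass, _ in atoms)
    return tuple(
        sum(mass * xyz[i] for mass, xyz in atoms) / total_mass for i in range(3)
    )


def restraint_satisfied(residue_atoms, partner_coords, cutoff=CUTOFF_ANGSTROM):
    """True if at least one partner atom is within `cutoff` of the residue centre of mass."""
    com = centre_of_mass(residue_atoms)
    return any(math.dist(com, xyz) <= cutoff for xyz in partner_coords)


def restraint_energy(restraint_residues, partner_coords):
    """Equation (3): (-1.0 kcal/mol) x percentage of satisfied restraints."""
    if not restraint_residues:
        return 0.0  # assumption: no restraints means no restraint term
    satisfied = sum(
        restraint_satisfied(res, partner_coords) for res in restraint_residues
    )
    percentage = 100.0 * satisfied / len(restraint_residues)
    return -1.0 * percentage


def total_energy(e_ele, e_des, e_restraint):
    """Equation (2) plus the restraint term: 0.5*E_ele + E_des + restraint energy."""
    return 0.5 * e_ele + e_des + e_restraint


def accuracy_and_coverage(restraint_ids, interface_ids):
    """Accuracy: % of restraint residues at the real interface.
    Coverage: % of real interface residues included in the restraint set."""
    hits = len(set(restraint_ids) & set(interface_ids))
    accuracy = 100.0 * hits / len(restraint_ids)
    coverage = 100.0 * hits / len(interface_ids)
    return accuracy, coverage
```

Because the restraint term is a percentage, a pose satisfying all restraints gains 100 kcal/mol over a pose satisfying none, which under these assumptions is large compared with typical differences in the binding energy term and is consistent with the funnel-shaped landscapes described for Figure 6.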
The predictions are quite similar to the known crystallographic complex structures (PDB codes 1agr and 1tx4, respectively).

The performance of pydockrst is comparable to that of HADDOCK. For the FGF1/FGFR2 interaction, the near-native solution obtained by HADDOCK (RMSD, 4.0 Å) was ranked 1, whereas the near-native solution obtained by pydockrst (RMSD, 6.41 Å) was ranked 20. However, when we manually included the native orientation taken from the complex structure, pydockrst ranked it 1, which means that the sub-optimal ranking of the near-native solution in pydockrst was due to the poor quality of the solution generated by FTDOCK, not to the distance restraints from the functional site prediction or to the energy function. It seems that pydockrst is well suited to this type of highly ambiguous distance restraint obtained from evolutionary information, with the added advantage that it is much faster computationally.

In contrast to the work of Aloy et al., 12 we have used the predicted functional residues to introduce distance restraints instead of distance constraints. That is, we have scored each docking solution according to the percentage of satisfied distance restraints instead of removing those solutions that do not satisfy at least one distance constraint, as in the case of Aloy et al. 12 Other significant differences are in the definition of the predicted functional residues, and in the larger and more varied set of protein–protein cases, which includes more than protease–inhibitor complexes. In most of the cases, the structure of at least one of the subunits is taken from the complex structure. It would be ideal to use the structure of the free molecules for both subunits, but it is very difficult to

find cases where the structures of both the free subunits and the complex are available and where they have enough homologous sequences to apply Crescendo. Such cases are mostly protease–inhibitor complexes, which is why most of the studies so far have centred on this type of complex. The cases here represent a variety of examples of biological interest for protein interaction prediction and are therefore a more representative, and probably more challenging, set than that used by many other researchers.

Figure 7. (a) Crystal structure of Galpha–RGS (1agr) superimposed with the best solution (RMSD, 3.69 Å; rank 1). (b) Crystal structure of Rho–RhoGAP (1tx4) superimposed with the best solution (RMSD, 1.25 Å; rank 1). Red is the crystal structure and blue is the near-native solution. The restraint residues are shown in CPK, green for the receptor and yellow for the ligand.

Conclusions

We have shown here that Crescendo, a functional site prediction method using the environment-specific substitution tables, can identify residues in protein–protein interfaces with great accuracy. These residues predicted from evolutionary information can be used to define interaction restraints that are very helpful for discriminating near-native solutions generated from rigid-body docking landscapes. Although the methodology described here has certain limitations for a fully automated application (e.g. it is not only necessary to find a significant number of orthologous sequences within the protein family, but these should also be divergent in sequence), we have demonstrated the potential of the approach. The application of this methodology on a proteomic scale would help to identify potential binding sites that can be confirmed by high-throughput experiments, and would also reduce the number of docking trials in a fast search for the near-native complex structure.
Materials and Methods

Prediction of interface residues using environment-specific substitution tables

The environment-specific substitution tables reflect the pattern of substitutions that is characteristic of an amino acid in a particular local environment, usually defined by local secondary structure, side-chain hydrogen bonding and solvent accessibility. Further restraints, arising from the binding of substrates, cofactors, subunits and other molecules, are not taken into account while deriving the environment-specific substitution tables. Thus, the substitution patterns of the functional residues are poorly predicted by the environment-specific substitution tables. Comparison of the substitution patterns derived from the environment-specific substitution tables with the amino acid substitutions that occur during evolution in families of orthologous proteins should therefore identify the functional residues, since they should be more conserved than predicted from the substitution tables.

The input of the functional site prediction method is a multiple sequence alignment (with sequences of both known and unknown structure). The orthologous structures and sequences of the representative structure (structure of interest) are collected as described by Chelliah et al. 16 In this respect it is essential that the sequences should be true orthologues with respect to the function predicted. The amino acid distribution at each position t of the multiple sequence alignment, termed the observed substitution pattern, is calculated. For each of the protein sequences of known structure, the predicted substitution pattern for each of the 20 amino acids at each position t is derived from the environment-specific substitution table, by taking its residue type and the environment in which it occurs.
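As an illustration of the observed substitution pattern, the following minimal Python sketch computes the amino acid distribution of one alignment column and compares it with a predicted distribution. The function names are hypothetical, and the comparison (a sum of absolute differences) is only a stand-in chosen for this example; it is not the divergence score of the method, whose formula is not given in this section.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def observed_pattern(column):
    """Relative frequency of each of the 20 amino acids in one alignment column.

    `column` is a string holding the residues found at position t in every
    aligned sequence; gaps and unknown characters are ignored.
    """
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def illustrative_divergence(observed, predicted):
    """Sum of absolute differences between two distributions.

    Assumption: a placeholder for the divergence score, used here only to
    show that a highly conserved column differs strongly from a prediction
    that expects many substitutions.
    """
    return sum(abs(observed[aa] - predicted.get(aa, 0.0)) for aa in AMINO_ACIDS)
```

For example, a fully conserved column such as "AAAA" compared with a uniform predicted pattern gives a large value, whereas a column compared with itself gives zero; a position that is more conserved than its local environment predicts is the kind of position the method flags as potentially functional.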
Taking the average over the number of structures available in the family, the predicted substitution pattern at each position for each of the 20 amino acids is calculated. The sequence-based score, termed the divergence score (converted to Z-score) quantifies the overall difference, or divergence, between the observed and predicted substitution probabilities. 16 The overview of the functional site prediction method is shown in Figure 8. The automatic prediction of functional residues on the structure of the interacting proteins is performed as follows. The divergence score values per residue are mapped onto a three-dimensional grid of points and contoured using kin3dcont as recently described. 16 A maximum Z-score is chosen such that the number of grid points above this Z-score is greater than or equal to Then the cut-off is set:

$$\text{cut-off} = \text{mean} - (\text{maximum } Z\text{-score})\,\sigma \qquad (1)$$

where σ is the standard deviation. All grid points with Z-scores above the cut-off are clustered, and the clusters are ranked by size. Clusters separated by less than 5 Å are merged. A residue is predicted to be functional if it has an atom within 0.8 Å of a grid point in the largest cluster. These assignments are used to determine the percentage of correctly predicted functional residues.

Figure 8. Overview of the functional site prediction method. The broken lines indicate the multiple sequence alignment. The broken lines in brown, dark blue and light blue denote the representative structure (or structure of interest), homologous structures and homologous sequences of the representative structure, respectively.

Use of predicted interface residues in HADDOCK

We have included the predicted interface residues as restraints for running HADDOCK as follows. We selected the residues that are inside the largest cluster defined from the grid mapping of divergence scores (see the previous section). Among them, the active residues were defined as the solvent-accessible ones (relative ASA ≥7%) that had positive Z-scores for the divergence score. Passive residues were defined as the solvent-accessible ones inside the largest cluster that had negative Z-scores for the divergence score. We took 2000 docking solutions after the first rigid-body docking step in HADDOCK. Then, the 50 lowest-energy solutions were refined in two consecutive steps, without and with explicit water molecules, as described in the original HADDOCK protocol. The rigid-body docking step took 50 h (for 2000 solutions) on ten Xeon 2.4 GHz CPUs. The subsequent refinement steps took 120 h (for 50 solutions without water refinement and 50 solutions with water refinement) on ten Xeon 2.4 GHz CPUs.
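The selection of active and passive residues described above can be sketched in Python as follows. This is only an illustration of the stated rules (largest cluster, relative ASA ≥7%, sign of the Z-score); the `Residue` record and the `classify_residues` function are hypothetical names, and the text does not say how a Z-score of exactly zero is treated, so the sketch assigns such residues to neither group.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    name: str                 # e.g. "K69"; a label used only in this example
    in_largest_cluster: bool  # residue lies inside the largest grid cluster
    relative_asa: float       # relative accessible surface area, in percent
    z_score: float            # Z-score of the divergence score


def classify_residues(residues, asa_cutoff=7.0):
    """Split residues into (active, passive) lists of names.

    Active: in the largest cluster, relative ASA >= cutoff, Z-score > 0.
    Passive: in the largest cluster, relative ASA >= cutoff, Z-score < 0.
    """
    active, passive = [], []
    for res in residues:
        if not res.in_largest_cluster or res.relative_asa < asa_cutoff:
            continue  # buried or outside the predicted functional site
        if res.z_score > 0:
            active.append(res.name)
        elif res.z_score < 0:
            passive.append(res.name)
    return active, passive
```

With this layout, a buried residue with a high Z-score is still excluded, which reflects the requirement that restraint residues be solvent accessible.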
pydockrst: implementation of restraints to score rigid-body docking solutions

We have developed a protocol to evaluate rigid-body docking solutions, as part of the suite of docking programs called pydock, which will be described in more detail in a forthcoming publication. The rigid-body docking solutions are generated by the FFT-based programs ZDOCK and FTDOCK. 26 Then, the docking solutions are automatically evaluated with the module pydockser, by equation (2):

$$E_{\mathrm{bind}} = 0.5\,E_{\mathrm{ele}} + E_{\mathrm{des}} \qquad (2)$$

where E_ele is the binding electrostatics energy (Coulombic potential with distance-dependent dielectric constant ε = 4r, truncated to a maximum and minimum value of +1.0 and −1.0 kcal/mol, respectively); and E_des is the desolvation energy upon binding, based on atomic solvation parameters previously optimised for rigid-body docking. 4,18

Then, the docking solutions are evaluated with the module pydockrst. Interaction restraints are defined by specific residues that are likely to be in the interface. These restraint residues can be defined from NMR data, mutational experiments, biological information, or from functional site predictions as described here (see the following sub-section). An interaction restraint by a given residue A is satisfied if at least one atom of the partner molecule can be found at ≤6 Å from the centre of mass of this residue A. For each docking solution, the method computes the percentage of satisfied restraints with respect to the total number of possible restraints, and this number is converted to energy by equation (3):

$$\text{restraint energy} = (-1.0\ \text{kcal/mol}) \times (\text{percentage of satisfied restraints}) \qquad (3)$$

The final energy is the sum of the binding and the restraint energies (E_total = E_bind + restraint energy, as defined by equations (2) and (3)). The method is implemented in Python, uses the MMTK library, 27 and is part of the pydock suite of docking tools (forthcoming publication).

In order to evaluate the performance of the pydockrst program, we have tested it on three cases that have been recently run in HADDOCK. 8 For consistency, we used in pydockrst the same 3D structures as those reportedly used in HADDOCK. The first docking case was the N terminus domain of Enzyme I (EIN; free form PDB code 1ZYM) in complex with the histidine-containing phosphocarrier protein (HPr; free form PDB code 1POH). The structure of the EIN/HPr complex has been solved by NMR (PDB code 3EZA). The second docking case was the Enzyme IIA glucose (E2A; free form PDB code 1F3G) in complex with HPr. The structure of the E2A/HPr complex has been solved by NMR (PDB code 1GGR).
The third docking case was the HIV protein gp120 in complex with the protein CD4. The structures of the subunits were taken from the X-ray structure of the gp120/CD4 complex (PDB code 1GC1). The interaction restraint residues that we used for pydockrst were the active residues used as AIRs for HADDOCK, as follows. For the EIN/HPr complex, we used 16 amino acids of EIN (E67, E68, K69, A71, I72, D82, E83, E84, G110, Q111, S113, A114, E116, E117, L118 and Y122) and nine amino acids of HPr (H15, T16, R17, Q21, K24, K49, Q51, T52 and G54), as derived from NMR data. For the E2A/HPr complex, the restraint residues were 11 amino acids of E2A (D38, V40, I45, V46, K69, F71, S78, E80, D94, V96 and S141) and nine amino acids of HPr (H15, T16, R17, A20, F48, Q51, T52, G54 and T56), as derived from NMR data. For the gp120/CD4 complex, the restraint residues were four from gp120 (D368, E370, W427 and D457) and seven from CD4 (K29, K35, F43, L44, K46, G47 and R59), as derived from mutational data.

In order to test the functional site predictions in pydockrst, the restraint residues were formed by the accessible (relative ASA of side-chain ≥7%) residues from the largest cluster, with positive Z-scores of the divergence score values.

Acknowledgements

We are grateful to Alexander Bonvin for the use of HADDOCK; to Zhiping Weng for the use of ZDOCK; and to M.J.E. Sternberg for the use of FTDOCK. V.C. is a recipient of Cambridge University Nehru and Overseas Research Scholarships. J.F.-R. is a recipient of a Marie Curie Research Fellowship from the European Commission.

References

1. Camacho, C. J. & Vajda, S. (2002). Protein–protein association kinetics and protein docking. Curr. Opin. Struct. Biol. 12.
2. Elcock, A. H., Sept, D. & McCammon, J. A. (2001). Computer simulation of protein–protein interactions. J. Phys. Chem. ser. B, 105.
3. Fernandez-Recio, J., Totrov, M. & Abagyan, R. (2002). Soft protein–protein docking in internal coordinates. Protein Sci. 11.
4. Fernandez-Recio, J., Totrov, M. & Abagyan, R. (2004). Identification of protein–protein interaction sites from docking energy landscapes. J. Mol. Biol. 335.
5. Smith, G. R. & Sternberg, M. J. (2002). Prediction of protein–protein interactions by docking methods. Curr. Opin. Struct. Biol. 12.
6. Sternberg, M. J., Gabb, H. A. & Jackson, R. M. (1998). Predictive docking of protein–protein and protein–DNA complexes. Curr. Opin. Struct. Biol. 8.
7. Wodak, S. J. & Janin, J. (1978). Computer analysis of protein–protein interaction. J. Mol. Biol. 124.
8. Dominguez, C., Boelens, R. & Bonvin, A. M. (2003). HADDOCK: a protein–protein docking approach based on biochemical or biophysical information. J. Am. Chem. Soc. 125.
9. Mirny, L. A. & Shakhnovich, E. I. (1999). Universally conserved positions in protein folds: reading evolutionary signals about stability, folding kinetics and function. J. Mol. Biol. 291.
10. Lichtarge, O. & Sowa, M. E. (2002). Evolutionary predictions of binding surfaces and interactions. Curr. Opin. Struct. Biol. 12.
11. Valdar, W. S. & Thornton, J. M. (2001). Protein–protein interfaces: analysis of amino acid conservation in homodimers. Proteins: Struct. Funct. Genet. 42.
12. Aloy, P., Querol, E., Aviles, F. X. & Sternberg, M. J. (2001). Automated structure-based prediction of functional sites in proteins: applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking. J. Mol. Biol. 311.
13. Caffrey, D. R., Somaroo, S., Hughes, J. D., Mintseris, J. & Huang, E. S. (2004). Are protein–protein interfaces more conserved in sequence than the rest of the protein surface? Protein Sci. 13.
14. Overington, J., Donnelly, D., Johnson, M. S., Sali, A. & Blundell, T. L. (1992). Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci. 1.
15. Overington, J., Johnson, M. S., Sali, A. & Blundell, T. L. (1990). Tertiary structural constraints on protein evolutionary diversity: templates, key residues and structure prediction. Proc. Roy. Soc. ser. B, 241.
16. Chelliah, V., Chen, L., Blundell, T. L. & Lovell, S. C. (2004). Distinguishing structural and functional restraints in evolution in order to identify interaction sites. J. Mol. Biol. 342.

Prediction of Protein-Protein Binding Sites and Epitope Mapping. Copyright 2017 Chemical Computing Group ULC All Rights Reserved.

Prediction of Protein-Protein Binding Sites and Epitope Mapping. Copyright 2017 Chemical Computing Group ULC All Rights Reserved. Prediction of Protein-Protein Binding Sites and Epitope Mapping Epitope Mapping Antibodies interact with antigens at epitopes Epitope is collection residues on antigen Continuous (sequence) or non-continuous

More information

Comparative Modeling Part 1. Jaroslaw Pillardy Computational Biology Service Unit Cornell Theory Center

Comparative Modeling Part 1. Jaroslaw Pillardy Computational Biology Service Unit Cornell Theory Center Comparative Modeling Part 1 Jaroslaw Pillardy Computational Biology Service Unit Cornell Theory Center Function is the most important feature of a protein Function is related to structure Structure is

More information

Protein-Protein Interaction Analysis by Docking

Protein-Protein Interaction Analysis by Docking Algorithms 2009, 2, 429-436; doi:10.3390/a2010429 OPEN ACCESS algorithms ISSN 1999-4893 www.mdpi.com/journal/algorithms Article Protein-Protein Interaction Analysis by Docking Florian Fink *, Stephan Ederer

More information

Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction

Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction Conformational Analysis 2 Conformational Analysis Properties of molecules depend on their three-dimensional

More information

Protein Structure Prediction

Protein Structure Prediction Homology Modeling Protein Structure Prediction Ingo Ruczinski M T S K G G G Y F F Y D E L Y G V V V V L I V L S D E S Department of Biostatistics, Johns Hopkins University Fold Recognition b Initio Structure

More information

Molecular recognition

Molecular recognition Lecture 9 Molecular recognition Antoine van Oijen BCMP201 Spring 2008 Structural principles of binding 4 fundamental functions of proteins: 1) Binding 2) Catalysis 3) Switching 4) Structural All involve

More information

Protein-Protein Interactions I

Protein-Protein Interactions I Biochemistry 412 Protein-Protein Interactions I March 23, 2007 Macromolecular Recognition by Proteins Protein folding is a process governed by intramolecular recognition. Protein-protein association is

More information

Protein-Protein Interactions I

Protein-Protein Interactions I Biochemistry 412 Protein-Protein Interactions I March 11, 2008 Macromolecular Recognition by Proteins Protein folding is a process governed by intramolecular recognition. Protein-protein association is

More information

Overview. Building macromolecular assemblies by information-driven docking. Data-driven HADDOCKing. Data-driven docking with HADDOCK

Overview. Building macromolecular assemblies by information-driven docking. Data-driven HADDOCKing. Data-driven docking with HADDOCK Building macromolecular assemblies by information-driven docking Challenges & perspectives Alexandre M.J.J. Bonvin Bijvoet Center for Biomolecular Research Faculty of Science, Utrecht University the Netherlands

More information

Worked Example of Humanized Fab D3h44 in Complex with Tissue Factor

Worked Example of Humanized Fab D3h44 in Complex with Tissue Factor Worked Example of Humanized Fab D3h44 in Complex with Tissue Factor Here we provide an example worked in detail from antibody sequence and unbound antigen structure to a docked model of the antibody antigen

More information

Protein-Protein Docking

Protein-Protein Docking Protein-Protein Docking Goal: Given two protein structures, predict how they form a complex Thomas Funkhouser Princeton University CS597A, Fall 2005 Applications: Quaternary structure prediction Protein

More information

Bioinformatics & Protein Structural Analysis. Bioinformatics & Protein Structural Analysis. Learning Objective. Proteomics

Bioinformatics & Protein Structural Analysis. Bioinformatics & Protein Structural Analysis. Learning Objective. Proteomics The molecular structures of proteins are complex and can be defined at various levels. These structures can also be predicted from their amino-acid sequences. Protein structure prediction is one of the

More information

Bacteriophages get a foothold on their prey

Bacteriophages get a foothold on their prey Bacteriophages get a foothold on their prey Long and thin, the receptor-binding needle of bacteriophage T4 Bacterial viruses, bacteriophages or phages, have served as a tool to decipher principles of molecular

More information

Molecular Modeling 9. Protein structure prediction, part 2: Homology modeling, fold recognition & threading

Molecular Modeling 9. Protein structure prediction, part 2: Homology modeling, fold recognition & threading Molecular Modeling 9 Protein structure prediction, part 2: Homology modeling, fold recognition & threading The project... Remember: You are smarter than the program. Inspecting the model: Are amino acids

More information

Suppl. Figure 1: RCC1 sequence and sequence alignments. (a) Amino acid

Suppl. Figure 1: RCC1 sequence and sequence alignments. (a) Amino acid Supplementary Figures Suppl. Figure 1: RCC1 sequence and sequence alignments. (a) Amino acid sequence of Drosophila RCC1. Same colors are for Figure 1 with sequence of β-wedge that interacts with Ran in

More information

Docking. Why? Docking : finding the binding orientation of two molecules with known structures

Docking. Why? Docking : finding the binding orientation of two molecules with known structures Docking : finding the binding orientation of two molecules with known structures Docking According to the molecules involved: Protein-Ligand docking Protein-Protein docking Specific docking algorithms

More information

What s New in Discovery Studio 2.5.5

What s New in Discovery Studio 2.5.5 What s New in Discovery Studio 2.5.5 Discovery Studio takes modeling and simulations to the next level. It brings together the power of validated science on a customizable platform for drug discovery research.

More information

Title: A topological and conformational stability alphabet for multi-pass membrane proteins

Title: A topological and conformational stability alphabet for multi-pass membrane proteins Supplementary Information Title: A topological and conformational stability alphabet for multi-pass membrane proteins Authors: Feng, X. 1 & Barth, P. 1,2,3 Correspondences should be addressed to: P.B.

More information

2/23/16. Protein-Protein Interactions. Protein Interactions. Protein-Protein Interactions: The Interactome

2/23/16. Protein-Protein Interactions. Protein Interactions. Protein-Protein Interactions: The Interactome Protein-Protein Interactions Protein Interactions A Protein may interact with: Other proteins Nucleic Acids Small molecules Protein-Protein Interactions: The Interactome Experimental methods: Mass Spec,

More information

Introduction to BioLuminate

Introduction to BioLuminate Introduction to BioLuminate Janet Paulsen Stanford University 10/16/17 BioLuminate has Broad Functionality Antibody design Structure prediction from sequence, humanization Protein design Residue scanning

More information

4/10/2011. Rosetta software package. Rosetta.. Conformational sampling and scoring of models in Rosetta.

4/10/2011. Rosetta software package. Rosetta.. Conformational sampling and scoring of models in Rosetta. Rosetta.. Ph.D. Thomas M. Frimurer Novo Nordisk Foundation Center for Potein Reseach Center for Basic Metabilic Research Breif introduction to Rosetta Rosetta docking example Rosetta software package Breif

More information

SUPPLEMENTARY INFORMATION Figures. Supplementary Figure 1 a. Page 1 of 30. Nature Chemical Biology: doi: /nchembio.2528

SUPPLEMENTARY INFORMATION Figures. Supplementary Figure 1 a. Page 1 of 30. Nature Chemical Biology: doi: /nchembio.2528 SUPPLEMENTARY INFORMATION Figures Supplementary Figure 1 a b c Page 1 of 0 11 Supplementary Figure 1: Biochemical characterisation and binding validation of the reversible USP inhibitor 1. a, Biochemical

More information

Evaluating Protein-Protein Docking Web Servers. catalysis and gene expression. Proteins often carry out their functions through interactions with

Evaluating Protein-Protein Docking Web Servers. catalysis and gene expression. Proteins often carry out their functions through interactions with Nina Ly Winter 2011 Evaluating Protein-Protein Docking Web Servers Proteins are involved in many cellular processes such as signal transduction, enzyme catalysis and gene expression. Proteins often carry

More information

Structural Analysis of the EGR Family of Transcription Factors: Templates for Predicting Protein DNA Interactions

Structural Analysis of the EGR Family of Transcription Factors: Templates for Predicting Protein DNA Interactions Introduction Structural Analysis of the EGR Family of Transcription Factors: Templates for Predicting Protein DNA Interactions Jamie Duke, Rochester Institute of Technology Mentor: Carlos Camacho, University

More information

Protein design. CS/CME/Biophys/BMI 279 Oct. 20 and 22, 2015 Ron Dror

Protein design. CS/CME/Biophys/BMI 279 Oct. 20 and 22, 2015 Ron Dror Protein design CS/CME/Biophys/BMI 279 Oct. 20 and 22, 2015 Ron Dror 1 Optional reading on course website From cs279.stanford.edu These reading materials are optional. They are intended to (1) help answer

More information

Molecular design principles underlying β-strand swapping. in the adhesive dimerization of cadherins

Molecular design principles underlying β-strand swapping. in the adhesive dimerization of cadherins Supplementary information for: Molecular design principles underlying β-strand swapping in the adhesive dimerization of cadherins Jeremie Vendome 1,2,3,5, Shoshana Posy 1,2,3,5,6, Xiangshu Jin, 1,3 Fabiana

More information

Convergence and combination of methods in protein protein docking Sandor Vajda and Dima Kozakov

Convergence and combination of methods in protein protein docking Sandor Vajda and Dima Kozakov Available online at Convergence and combination of methods in protein protein docking Sandor Vajda and Dima Kozakov The analysis of results from Critical Assessment of Predicted Interactions (CAPRI), the

More information

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology. G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY Methods or systems for genetic

Proteins: Structure, Function, and Bioinformatics. DockRank: Ranking docked conformations using partner-specific sequence homology-based protein interface prediction. Li C. Xue, 1 Rafael A. Jordan, 2,3 Yasser

Ligand docking. CS/CME/Biophys/BMI 279 Oct. 22 and 27, 2015 Ron Dror Ligand docking CS/CME/Biophys/BMI 279 Oct. 22 and 27, 2015 Ron Dror 1 Outline Goals of ligand docking Defining binding affinity (strength) Computing binding affinity: Simplifying the problem Ligand docking

CSE : Computational Issues in Molecular Biology. Lecture 19. Spring 2004 CSE 397-497: Computational Issues in Molecular Biology Lecture 19 Spring 2004-1- Protein structure Primary structure of protein is determined by number and order of amino acids within polypeptide chain.

Textbook Reading Guidelines Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum Last updated: May 1, 2009 Textbook Reading Guidelines Preface: Read the whole preface, and especially: For the students with Life Science

The relationship between relative solvent accessible surface area (rasa) and irregular structures in protean segments (ProSs) www.bioinformation.net Volume 12(9) Hypothesis The relationship between relative solvent accessible surface area (rasa) and irregular structures in protean segments (ProSs) Divya Shaji Graduate School

BIOINFORMATICS Introduction BIOINFORMATICS Introduction Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a 1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu What is Bioinformatics? (Molecular) Bio -informatics One idea

Protein design. CS/CME/BioE/Biophys/BMI 279 Oct. 24, 2017 Ron Dror Protein design CS/CME/BioE/Biophys/BMI 279 Oct. 24, 2017 Ron Dror 1 Outline Why design proteins? Overall approach: Simplifying the protein design problem Protein design methodology Designing the backbone

Information Page. A List of Secondary Structure Prediction Servers Introduction to Bioinformatics Information Page Chapter 5, page 1/4, 2012/11/09 A List of Secondary Structure Prediction Servers Name APSSP PredictProtein GOR IV Jpred 3 CFSSP NNPREDICT JUFO BCM Search

SUPPLEMENTARY INFORMATION doi: 10.1038/nature06147 SUPPLEMENTARY INFORMATION Figure S1 The genomic and domain structure of Dscam. The Dscam gene comprises 24 exons, encoding a signal peptide (SP), 10 IgSF domains, 6 fibronectin

Recapitulation of Protein Family Divergence using Flexible Backbone Protein Design doi:10.1016/j.jmb.2004.11.062 J. Mol. Biol. (2005) 346, 631 644 Recapitulation of Protein Family Divergence using Flexible Backbone Protein Design Christopher T. Saunders 1 and David Baker 2 * 1 Department

Structure-Guided Deimmunization CMPS 3210 Structure-Guided Deimmunization CMPS 3210 Why Deimmunization? Protein, or biologic therapies are proving to be useful, but can be much more immunogenic than small molecules. Like a drug compound, a biologic

Proteins: Structure, Function, and Bioinformatics. Achieving reliability and high accuracy in automated protein docking: ClusPro, PIPER, SDU, and stability analysis in CAPRI rounds 13-19. Dima Kozakov, 1,2

Can self-inhibitory peptides be derived from protein interfaces? Ora Schueler-Furman, RosettaCon 2010. Outline of today's talk: 1. Short introduction on peptide-protein interactions and their modeling 2.

Across-proteome modeling of dimer structures for the bottom-up assembly of protein-protein interaction networks Maheshwari and Brylinski BMC Bioinformatics (2017) 18:257 DOI 10.1186/s12859-017-1675-z RESEARCH ARTICLE Across-proteome modeling of dimer structures for the bottom-up assembly of protein-protein interaction

BMB/Bi/Ch 170 Fall 2017 Problem Set 1: Proteins I BMB/Bi/Ch 170 Fall 2017 Problem Set 1: Proteins I Please use ray-tracing feature for all the images you are submitting. Use either the Ray button on the right side of the command window in PyMOL or variations

CPORT: A Consensus Interface Predictor and Its Performance in Prediction-Driven Docking with HADDOCK CPORT: A Consensus Interface Predictor and Its Performance in Prediction-Driven Docking with HADDOCK Sjoerd J. de Vries, Alexandre M. J. J. Bonvin* Faculty of Science, Bijvoet Center for Biomolecular Research,

Protein design. CS/CME/BioE/Biophys/BMI 279 Oct. 24, 2017 Ron Dror Protein design CS/CME/BioE/Biophys/BMI 279 Oct. 24, 2017 Ron Dror 1 Outline Why design proteins? Overall approach: Simplifying the protein design problem < this step is really key! Protein design methodology

static MM_Index snap(mm_index corect, MM_Index ligct, int imatch0, int *moleatoms, i BIOLUMINATE static MM_Index snap(mm_index corect, MM_Index ligct, int imatch0, int *moleatoms, int *refcoreatoms){int ncoreat = :vector mappings; PhpCoreMapping mapping; for COMMON(glidelig).

Supplementary Fig. 1. Initial electron density maps for the NOX-D20:mC5a complex obtained after SAD-phasing. (a) Initial experimental electron Supplementary Fig. 1. Initial electron density maps for the NOX-D20:mC5a complex obtained after SAD-phasing. (a) Initial experimental electron density map obtained after SAD-phasing and density modification

Computation and Mechanics in Living Matter: The Protein Problem Computation and Mechanics in Living Matter: The Protein Problem Sandipan Dutta & TT (IBS) Albert Libchaber (Rockefeller) Jean-Pierre Eckmann (Geneva) 1 In living information-processing systems, computational

Exploring Suboptimal Sequence Alignments and Scoring Functions in Comparative Protein Structural Modeling Exploring Suboptimal Sequence Alignments and Scoring Functions in Comparative Protein Structural Modeling Presented by Kate Stafford 1,2 Research Mentor: Troy Wymore 3 1 Bioengineering and Bioinformatics

Forschungszentrum Jülich GmbH, Institute for Advanced Simulation (IAS), Jülich Supercomputing Centre (JSC). From Computational Biophysics to Systems Biology (CBSB11): Celebrating Harold Scheraga's 90th Birthday

Teaching Principles of Enzyme Structure, Evolution, and Catalysis Using Bioinformatics KBM Journal of Science Education (2010) 1 (1): 7-12 doi: 10.5147/kbmjse/2010/0013 Teaching Principles of Enzyme Structure, Evolution, and Catalysis Using Bioinformatics Pablo Sobrado Department of Biochemistry,

STRUCTURAL PREDICTION OF PROTEIN-RNA INTERACTION BY COMPUTATIONAL DOCKING WITH PROPENSITY-BASED STATISTICAL POTENTIALS. Laura Pérez-Cano 1, Albert Solernou 1, Carles Pons 1,2, Juan Fernández-Recio 1. 1 Life Sciences Department,

DrugScore PPI Knowledge-Based Potentials Used as Scoring and Objective Function in Protein-Protein Docking DrugScore PPI Knowledge-Based Potentials Used as Scoring and Objective Function in Protein-Protein Docking Dennis M. Krüger 1, José Ignacio Garzón 2, Pablo Chacón 2, Holger Gohlke 1 * 1 Institute for Pharmaceutical

Homology Modeling of the Chimeric Human Sweet Taste Receptors Using Multi Templates 2013 International Conference on Food and Agricultural Sciences IPCBEE vol.55 (2013) (2013) IACSIT Press, Singapore DOI: 10.7763/IPCBEE. 2013. V55. 1 Homology Modeling of the Chimeric Human Sweet Taste

Design of Multi-Specificity in Protein Interfaces Design of Multi-Specificity in Protein Interfaces Elisabeth L. Humphris 1,2, Tanja Kortemme 1,2,3* 1 Graduate Group in Biophysics, University of California San Francisco, San Francisco, California, United

Structural bioinformatics Structural bioinformatics Why structures? The representation of the molecules in 3D is more informative New properties of the molecules are revealed, which can not be detected by sequences Eran Eyal Plant

Simulating Biological Systems Protein-ligand docking Volunteer Computing and Protein-Ligand Docking Michela Taufer Global Computing Lab University of Delaware Michela Taufer - 10/15/2007 1 Simulating Biological Systems Protein-ligand docking protein ligand

Building an Antibody Homology Model Application Guide Antibody Modeling: A quick reference guide to building Antibody homology models, and graphical identification and refinement of CDR loops in Discovery Studio. Francisco G. Hernandez-Guzman,

Prediction of protein disorder Zsuzsanna Dosztányi Prediction of protein disorder Zsuzsanna Dosztányi MTA-ELTE Momentum Bioinformatics Group Department of Biochemistry, Eotvos Lorand University, Budapest, Hungary dosztanyi@caeser.elte.hu IDPs Intrinsically

ZDOCK: An Initial-Stage Protein-Docking Algorithm PROTEINS: Structure, Function, and Genetics 52:80 87 (2003) ZDOCK: An Initial-Stage Protein-Docking Algorithm Rong Chen, 1 Li Li, 1 and Zhiping Weng 1,2 * 1 Bioinformatics Program, Boston University, Boston,

Navigating BioLuminate for PyMOL Users: A Practical Approach. 2nd European Life Science Bootcamp Ana Negri, PhD Navigating BioLuminate for PyMOL Users: A Practical Approach 2nd European Life Science Bootcamp Ana Negri, PhD Agenda Introduction to BioLuminate PyMOL-like features in the BioLuminate interface: Basic

RMSD-BASED CLUSTERING AND QR FACTORIZATION. Rob Swift, UCI Tuesday, August 2, 2011 RMSD-BASED CLUSTERING AND QR FACTORIZATION Rob Swift, UCI Tuesday, August 2, 2011 Outline Biomolecular flexibility, a few examples CADD pipeline overview RMSD clustering QR factorization Our evolving ideas

Docking and Design with AutoDock. David S. Goodsell Arthur J. Olson The Scripps Research Institute Docking and Design with AutoDock David S. Goodsell Arthur J. Olson The Scripps Research Institute Rapid automated docking using: Grid-based energy evaluation Torsion-only conformation search AutoDock History

Ligand docking and binding site analysis with pymol and autodock/vina International Journal of Basic and Applied Sciences, 4 (2) (2015) 168-177 www.sciencepubco.com/index.php/ijbas Science Publishing Corporation doi: 10.14419/ijbas.v4i2.4123 Research Paper Ligand docking

Molecular Structures Molecular Structures 1 Molecular structures 2 Why is it important? Answers to scientific questions such as: What does the structure of protein X look like? Can we predict the binding of molecule X to Y?

SUPPLEMENTARY INFORMATION Supplementary Table 1. Crystallographic statistics CRM1-SNUPN complex Space group P6 4 22 a=b=250.4, c=190.4 Data collection statistics: CRM1-selenomethionine SNUPN MAD data Peak Inflection Remote Native

Integrating atom-based and residue-based scoring functions for protein protein docking Integrating atom-based and residue-based scoring functions for protein protein docking Thom Vreven, Howook Hwang, and Zhiping Weng* Program in Bioinformatics and Integrative Biology, University of Massachusetts

Chapter 14 Regulation of Transcription Chapter 14 Regulation of Transcription Cis-acting sequences Distance-independent cis-acting elements Dissecting regulatory elements Transcription factors Overview transcriptional regulation Transcription

Molecular Modeling Lecture 8. Local structure Database search Multiple alignment Automated homology modeling Molecular Modeling 2018 -- Lecture 8 Local structure Database search Multiple alignment Automated homology modeling An exception to the no-insertions-in-helix rule Actual structures (myosin)! prolines

X-ray structures of fructosyl peptide oxidases revealing residues responsible for gating oxygen access in the oxidative half reaction X-ray structures of fructosyl peptide oxidases revealing residues responsible for gating oxygen access in the oxidative half reaction Tomohisa Shimasaki 1, Hiromi Yoshida 2, Shigehiro Kamitori 2 & Koji

Genetic Algorithm for Predicting Protein Folding in the 2D HP Model Genetic Algorithm for Predicting Protein Folding in the 2D HP Model A Parameter Tuning Case Study Eyal Halm Leiden Institute of Advanced Computer Science, University of Leiden Niels Bohrweg 1 2333 CA Leiden,

Protein-Protein Complex Structure Predictions by Multimeric Threading and Template Recombination Article Protein-Protein Complex Structure Predictions by Multimeric Threading and Template Recombination Srayanta Mukherjee 1,3 and Yang Zhang 1,2,3, * 1 Center for Computational Medicine and Bioinformatics

CMSE 520 BIOMOLECULAR STRUCTURE, FUNCTION AND DYNAMICS CMSE 520 BIOMOLECULAR STRUCTURE, FUNCTION AND DYNAMICS (Computational Structural Biology) OUTLINE Review: Molecular biology Proteins: structure, conformation and function(5 lectures) Generalized coordinates,

Computational Methods for Protein Structure Prediction and Fold Recognition... 1 I. Cymerman, M. Feder, M. PawŁowski, M.A. Kurowski, J.M. Contents Computational Methods for Protein Structure Prediction and Fold Recognition........................... 1 I. Cymerman, M. Feder, M. PawŁowski, M.A. Kurowski, J.M. Bujnicki 1 Primary Structure Analysis...................

Solving Structure Based Design Problems using Discovery Studio 1.7 Building a Flexible Docking Protocol Solving Structure Based Design Problems using Discovery Studio 1.7 Building a Flexible Docking Protocol C. M. (Venkat) Venkatachalam Fellow, Life Sciences Dipesh Risal Marketing, Life Sciences Overview

Fig. 1. A schematic illustration of pipeline from gene to drug : integration of virtual and real experiments. 1 Integration of Structure-Based Drug Design and SPR Biosensor Technology in Discovery of New Lead Compounds Alexis S. Ivanov Institute of Biomedical Chemistry RAMS, 10, Pogodinskaya str., Moscow, 119121,

Outline: Background and past work; Introduction to present work; Results and Discussion; Conclusions; Perspectives. BIOINFORMATICS. Galactosemia Foundation 2012 Conference, Breakout Session G, July 20th, 2012, 14:15-15:15. Computational Biology Strategy for the Development of Ligands of GALK Enzyme as Potential Drugs for People with Classic

Chapter 4. Antigen Recognition by B-cell and T-cell Receptors Chapter 4 Antigen Recognition by B-cell and T-cell Receptors Antigen recognition by BCR and TCR B cells 2 separate functions of immunoglobulin (Ig) bind pathogen & induce immune responses recruit cells

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1 Supplementary Figure 1 Domain architecture and conformational states of the decapping complex, as revealed by structural studies. (a) Domain organization of Schizosaccharomyces pombe (Sp) and Saccharomyces

Sequence Determinants of a Conformational Switch in a Protein Sequence Determinants of a Conformational Switch in a Protein Thomas A. Anderson, Matthew H. J. Cordes, and Robert T. Sauer Kyle Skalenko 4/17/09 Introduction Protein folding is guided by the following

STRUCTURAL BIOLOGY. α/β structures Closed barrels Open twisted sheets Horseshoe folds STRUCTURAL BIOLOGY α/β structures Closed barrels Open twisted sheets Horseshoe folds The α/β domains Most frequent domain structures are α/β domains: A central parallel or mixed β sheet Surrounded by α

Current Status and Future Prospects p. 46 Acknowledgements p. 46 References p. 46 Hammerhead Ribozyme Crystal Structures and Catalysis Ribozymes and RNA Catalysis: Introduction and Primer What are Ribozymes? p. 1 What is the Role of Ribozymes in Cells? p. 1 Ribozymes Bring about Significant Rate Enhancements p. 4 Why Study Ribozymes?

Bioinformation by Biomedical Informatics Publishing Group Algorithm to find distant repeats in a single protein sequence Nirjhar Banerjee 1, Rangarajan Sarani 1, Chellamuthu Vasuki Ranjani 1, Govindaraj Sowmiya 1, Daliah Michael 1, Narayanasamy Balakrishnan 2,

Computer Aided Drug Design. Protein structure prediction. Qin Xu Computer Aided Drug Design Protein structure prediction Qin Xu http://cbb.sjtu.edu.cn/~qinxu/cadd.htm Course Outline Introduction and Case Study Drug Targets Sequence analysis Protein structure prediction

Unit title: Protein Structure and Function (SCQF level 8) Higher National Unit specification General information Unit code: H92J 35 Superclass: RH Publication date: May 2015 Source: Scottish Qualifications Authority Version: 01 Unit purpose This Unit is designed

Recent Progress of Interprotein's Research Activities: INTENDD and AI-guided INTENDD. Interprotein Corporation. Location: Osaka, Japan. Year Established: 2001. CEO & President:

Nature Methods: doi: /nmeth Supplementary Figure 1. Construction of a sensitive TetR mediated auxotrophic off-switch. Supplementary Figure 1 Construction of a sensitive TetR mediated auxotrophic off-switch. A Production of the Tet repressor in yeast when conjugated to either the LexA4 or LexA8 promoter DNA binding sequences.

Problem Set #2 20.320 Problem Set #2 Due on September 30rd, 2011 at 11:59am. No extensions will be granted. General Instructions: 1. You are expected to state all of your assumptions, and provide step-by-step solutions

SUPPLEMENTARY INFORMATION. Supplementary Figures 1-8 SUPPLEMENTARY INFORMATION Supplementary Figures 1-8 Supplementary Figure 1. TFAM residues contacting the DNA minor groove (A) TFAM contacts on nonspecific DNA. Leu58, Ile81, Asn163, Pro178, and Leu182

Supporting Information. A general chemiluminescence strategy. for measuring aptamer-target binding and target concentration Supporting Information A general chemiluminescence strategy for measuring aptamer-target binding and target concentration Shiyuan Li, Duyu Chen, Qingtong Zhou, Wei Wang, Lingfeng Gao, Jie Jiang, Haojun

ProBiS: a web server for detection of structurally similar protein binding sites W436 W440 Nucleic Acids Research, 2010, Vol. 38, Web Server issue Published online 26 May 2010 doi:10.1093/nar/gkq479 ProBiS: a web server for detection of structurally similar protein binding sites Janez

Antibody-Antigen recognition. Structural Biology Weekend Seminar Annegret Kramer Antibody-Antigen recognition Structural Biology Weekend Seminar 10.07.2005 Annegret Kramer Contents Function and structure of antibodies Features of antibody-antigen interfaces Examples of antibody-antigen

VALLIAMMAI ENGINEERING COLLEGE VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur 603 203 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER BM6005 BIO INFORMATICS Regulation 2013 Academic Year 2018-19 Prepared

Following text taken from Suresh Kumar. Bioinformatics Web - Comprehensive educational resource on Bioinformatics. 6th May.2005 Bioinformatics is the recording, annotation, storage, analysis, and searching/retrieval of nucleic acid sequence (genes and RNAs), protein sequence and structural information. This includes databases of

A Protein Secondary Structure Prediction Method Based on BP Neural Network Ru-xi YIN, Li-zhen LIU*, Wei SONG, Xin-lei ZHAO and Chao DU 2017 2nd International Conference on Artificial Intelligence: Techniques and Applications (AITA 2017 ISBN: 978-1-60595-491-2 A Protein Secondary Structure Prediction Method Based on BP Neural Network Ru-xi

Structure and Function of the First Full-Length Murein Peptide Ligase (Mpl) Cell Wall Recycling Protein Paper Presentation PLoS ONE 2011 Structure and Function of the First Full-Length Murein Peptide Ligase (Mpl) Cell Wall Recycling Protein Debanu Das, Mireille Herve, Julie Feuerhelm, etc. and Dominique

15 Structure Prediction of Protein Complexes 15 Structure Prediction of Protein Complexes 15.1 Introduction Protein protein interactions are critical for biological function. They directly and indirectly influence the biological systems of which

A Method for Integrative Structure Determination of Protein- Protein Complexes Bioinformatics Advance Access published October 23, 2012 A Method for Integrative Structure Determination of Protein- Protein Complexes Dina Schneidman-Duhovny 1*, Andrea Rossi 2, Agustin Avila-Sakar 4,

systemsdock Operation Manual systemsdock Operation Manual Version 2.0 2016 April systemsdock is being developed by Okinawa Institute of Science and Technology http://www.oist.jp/ Integrated Open Systems Unit http://openbiology.unit.oist.jp/_new/
