Docking: finding the binding orientation of two molecules with known structures


1 Docking

Eran Eyal, June 2011

Docking: finding the binding orientation of two molecules with known structures.

Types of docking, according to the molecules involved:
- Protein-ligand docking
- Protein-protein docking

A specific docking algorithm is usually designed to deal with one of these problems but not both (they differ in contact area, flexibility, level of representation, etc.).

A second distinction:
- Local docking
- Global docking

Why?
- Understanding interactions and the roles of specific amino acids; designing mutations and changes of activity
- Comparing the affinities of different molecules
- Drug design

2 Types of compatibility

- Geometrical compatibility: preventing overlap between atoms and maximizing shape complementarity between the ligand and the binding site.
- Chemical compatibility: maximizing the number of chemically favorable interactions and minimizing the number of unfavorable ones.

Ligand-protein docking: finding the place and the orientation of the interaction

The general problem includes a search for the location of the binding site and a search for the exact orientation of the ligand within it. A program that does both performs global docking.

Sometimes the location of the binding site is known, and we only need to orient the ligand in the binding site. This problem is called local docking.

Global docking is more demanding in terms of computational time, and its results are less accurate.

When the location of the binding site is unknown there are two possibilities:
1. Looking for the binding site with separate programs (based on detection of cavities, conservation, etc.) and then applying a local docking program
2. Global docking
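Returning to geometrical compatibility: the simplest way to express "preventing overlap between atoms" is to count ligand-protein atom pairs that sit too close together. The sketch below is only an illustration. The function name `count_clashes` and the 3.0 Å threshold are assumptions and do not come from any particular docking program.

```python
import math

def count_clashes(ligand_atoms, protein_atoms, min_distance=3.0):
    """Count ligand-protein atom pairs closer than min_distance (in Å).

    The 3.0 Å default is an illustrative assumption, not a published cutoff.
    Atoms are (x, y, z) tuples.
    """
    clashes = 0
    for lig in ligand_atoms:
        for prot in protein_atoms:
            if math.dist(lig, prot) < min_distance:
                clashes += 1
    return clashes

# Toy coordinates, purely hypothetical
protein = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
overlapping_ligand = [(1.0, 0.0, 0.0)]
distant_ligand = [(10.0, 10.0, 10.0)]
print(count_clashes(overlapping_ligand, protein))
print(count_clashes(distant_ligand, protein))
```

A pose with fewer clashes is geometrically more plausible. A real scoring function would also reward good shape fit and not only penalize overlap.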

3 Computational time

Docking programs are often restricted by computational time, because of the enormous number of possibilities that must be examined. Docking algorithms therefore have to budget computational time, and the price is often the accuracy of the scoring function. Efficiency is especially important for drug design.

Rigidity vs. flexibility

Most of the early algorithms assumed that the docked molecules do not change conformation. This assumption allows the molecules to be treated as rigid bodies, which makes the algorithm simpler and faster. The assumption is obviously problematic and has been shown to be wrong in several cases.

Newer algorithms address the flexibility problem in a variety of ways. Other methods handle flexibility indirectly, or at least try to minimize the damage of not incorporating it.

- Docking procedures that perform a rigid-body search are termed rigid docking.
- Docking procedures that consider possible conformational changes are termed flexible docking.

4 Bound and unbound docking

In bound docking the goal is to reproduce a known complex, and the starting coordinates of the individual molecules are taken from the crystal structure of the complex. In unbound docking, which is a significantly more difficult problem, the starting coordinates are taken from the structures of the unbound molecules.

Components of the problem

Algorithms that dock molecules need:
A. System representation
B. Searching procedure
C. Scoring function
D. Clustering procedure

The parameters of the problem for docking two rigid bodies are 3 angles (rotations) and 3 distances (translations).

[Figure: a rigid body drawn on x, y, z axes at the example parameter vectors (0,0,0,0,0,0), (0,0,1,0,0,0), (0,-1,0,0,0,0) and (0,0,0,0,0,30)]
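To make the six rigid-body parameters concrete, here is a minimal sketch that applies a pose to a set of atom coordinates. It assumes the parameter order (tx, ty, tz, rx, ry, rz), with rotations in degrees applied about x, then y, then z around the origin, followed by the translation. The slides do not state a convention, so this ordering and the names `rotation_matrix` and `apply_pose` are assumptions.

```python
import math

def rotation_matrix(rx_deg, ry_deg, rz_deg):
    """Rotation about x, then y, then z (angles in degrees): R = Rz * Ry * Rx."""
    ax, ay, az = (math.radians(a) for a in (rx_deg, ry_deg, rz_deg))
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_pose(coords, pose):
    """Rotate each (x, y, z) point about the origin, then translate it."""
    tx, ty, tz, rx, ry, rz = pose
    r = rotation_matrix(rx, ry, rz)
    moved = []
    for x, y, z in coords:
        moved.append((
            r[0][0] * x + r[0][1] * y + r[0][2] * z + tx,
            r[1][0] * x + r[1][1] * y + r[1][2] * z + ty,
            r[2][0] * x + r[2][1] * y + r[2][2] * z + tz,
        ))
    return moved

# A single hypothetical atom on the x axis, rotated 30 degrees about z
print(apply_pose([(1.0, 0.0, 0.0)], (0, 0, 0, 0, 0, 30)))
```

In practice the rotation is usually taken about the ligand's own center and not about the origin. That detail is left out here to keep the sketch short.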

5 Usually the ligand is not rigid, and a few more parameters are required.

Number of parameters needed to fully describe the ligand position:

N_p = 3 + 3 + N_fb

where the first 3 describes position, the second 3 describes orientation, and N_fb is the number of flexible bonds.

[Figure: the ligand on x, y, z axes at the example parameter vectors (0,0,0,30,0,0) and (0,0,0,0,0,0,240)]

If there are many parameters the problem becomes more complicated. Genetic algorithms (GA) have advantages over other methods in these situations.
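A short calculation shows why the number of parameters matters. The sketch below counts 3 position and 3 orientation parameters plus one per flexible bond, and then counts the points of an exhaustive grid over that parameter space. The step sizes (0.5 Å, 11.25 and 60 degrees) are the resolutions quoted for DARWIN on slide 7. The 10 Å search box and the function names are assumptions made only for this illustration.

```python
def n_parameters(n_flexible_bonds):
    """3 for position + 3 for orientation + one per flexible bond."""
    return 3 + 3 + n_flexible_bonds

def n_grid_points(n_flexible_bonds, box_edge=10.0, step=0.5,
                  angle_step=11.25, torsion_step=60.0):
    """Number of points in an exhaustive grid over the parameter space.

    box_edge (Å) is a hypothetical search-box size; the step sizes reuse
    the resolutions quoted for DARWIN.
    """
    positions = (round(box_edge / step) + 1) ** 3
    orientations = round(360 / angle_step) ** 3
    torsions = round(360 / torsion_step) ** n_flexible_bonds
    return positions * orientations * torsions

for n_fb in (0, 1, 5):
    print(n_parameters(n_fb), n_grid_points(n_fb))
```

Each extra flexible bond multiplies the grid by 6 under these settings, which is one way to see why exhaustive search becomes impractical for flexible ligands.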

6 Searching procedures applied for docking
- Grid search
- Monte Carlo / random search
- Genetic algorithms

Scoring functions applied for docking
- Force fields
- Geometric features
- Knowledge-based parameters

DARWIN (Taylor and Burnett, 2000)
- Flexible docking program
- Uses a force field as the scoring function and a genetic algorithm for the search
- Parallel processing
- Can search different ligands simultaneously

7 GA parameters

The basic operations are:
- Mutation (P_m = 0.2)
- Recombination with one cut (P_c1 = 0.4)
- Recombination with two cuts (P_c2 = 0.4)

The death rate is 5% and the survival rate is 10-30%.

Every solution is represented by a binary string. 3 genes describe position (0.5 Å resolution) and 3 describe orientation (11.25° resolution). Every flexible bond is described by one parameter (60° resolution).

The population size is [value lost in transcription] and the number of generations is 10% of the population size.

Parallel processing is a natural choice for the heavy calculations involved in genetic algorithms.

Results
- The accuracy of the algorithm was shown to be good.
- When the algorithm did not converge to the correct solution, the problem was shown to lie in the scoring function and not in the search.
- The results were much better when water molecules were included, or when the scoring function was modified and the electrostatic term was omitted.
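The three basic operations can be sketched on plain bit lists. This is not DARWIN's code. The operator probabilities are the ones quoted above, but choosing exactly one operator per child and encoding an orientation gene in 5 bits (since 360 / 11.25 = 32 steps) are assumptions made for illustration, and all function names are hypothetical.

```python
import random

# Operator probabilities quoted on the slide
P_MUTATION, P_ONE_CUT, P_TWO_CUTS = 0.2, 0.4, 0.4

def mutate(bits, rng):
    """Flip one randomly chosen bit."""
    child = list(bits)
    child[rng.randrange(len(child))] ^= 1
    return child

def one_cut(a, b, rng):
    """Head of parent a joined to the tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def two_cuts(a, b, rng):
    """Parent a with its middle segment replaced by parent b's."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:]

def make_child(a, b, rng):
    """Pick one operator according to the quoted probabilities (assumed scheme)."""
    r = rng.random()
    if r < P_MUTATION:
        return mutate(a, rng)
    if r < P_MUTATION + P_ONE_CUT:
        return one_cut(a, b, rng)
    return two_cuts(a, b, rng)

def decode_angle(bits, resolution_deg=11.25):
    """Read a gene as an unsigned integer and scale it by the resolution."""
    return int("".join(str(bit) for bit in bits), 2) * resolution_deg

rng = random.Random(0)
parent_a, parent_b = [0] * 10, [1] * 10
print(make_child(parent_a, parent_b, rng))
print(decode_angle([1, 1, 1, 1, 1]))
```

With 5 bits per orientation gene the decoded angles run from 0 to 348.75 degrees in 11.25 degree steps, which matches the quoted resolution.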

8 DOCK

The binding site is represented by overlapping spheres that describe the shape of the cavity. The ligand molecule is placed in the binding site, and its shape and chemical compatibility are evaluated.

9 LigIn

- Rigid ligand docking algorithm that uses contact surface areas to account for complementarity at the atomic level
- Also considers chemical compatibility
- Random search

10 Evaluation of docking algorithms

The RMSD is the most widely used parameter for evaluating predictions against known complexes. A graph of RMSD against score is usually given. A successful prediction should assign good scores to low-RMSD complexes.
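As a reminder of what is being plotted, the RMSD between a predicted ligand pose and the known one can be computed as below. This sketch assumes both poses list the same atoms in the same order and are already in the receptor's coordinate frame, so no superposition is performed. The function name `rmsd` and the toy coordinates are illustrative only.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical two-atom ligand: crystal pose vs. a pose shifted along z
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(crystal, predicted))
```

Computing this value for every docked pose and pairing it with the pose's score gives the points of the RMSD-against-score graph.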