4/10/2011. Rosetta software package. Conformational sampling and scoring of models in Rosetta.


1 Rosetta
Thomas M. Frimurer, Ph.D.
Novo Nordisk Foundation Center for Protein Research
Center for Basic Metabolic Research

Brief introduction to Rosetta
Rosetta docking example

Rosetta software package
Rosetta consists of multiple modules: protein folding, comparative modeling, ligand docking, protein design, antibody/antigen interactions, etc.
Rosetta is developed by a consortium of twelve laboratories and around 50 developers.
Rosetta is free for academic users; a user guide and tutorials are available.
PyRosetta is a Python interface that allows integration with PyMOL.
FoldIt is the better video game for you and your kids.
Rosetta@home uses your computer for our research.
(Photo: RosettaCon 2009, Leavenworth, WA, USA)

Conformational sampling and scoring of models in Rosetta
Rosetta combines:
1. Conformational sampling
2. Scoring function for structure determination

~12,000,000 sequences, for example:
>Q8TDV5 GP119_HUMAN Glucose-dependent...
MESSFSFGVILAVLASLIIATNTLVAVAVLLLIHKNDGVSL
CFTLNLAVADTLIGVAISGLLTDQLSSPSRPTQKTLCSLRM
AFVTSSAAASVLTVMLITFDRYLAIKQPFRYLKIMSGFVAG
ACIAGLWLVSYLIGFLPLGIPMFQQTAYKGQCSFFAVFHPH
FVLTLSCVGFFPAMLLFVFFYCDMLKIASMHSQQIRKMEHA
GAMAGGYRSPRTPSDFKALRTVSVLIGSFALSWTPFLITGI
~70,000 structures

2 Conformational sampling and scoring of models in Rosetta
Rosetta combines:
1. Conformational sampling
   1. Exchanging the backbone conformation of 9- and 3-amino-acid peptide fragments, collected from homologous stretches in the PDB.
   2. Metropolis Monte Carlo:
      $E_{new} < E_{old}$: accept
      $E_{new} \geq E_{old}$: accept with probability $e^{-(E_{new} - E_{old})/T}$
2. Scoring function for structure determination
   1. Low resolution: reduced atom representation (centroid). A simple energy function that allows an aggressive search of conformational space.
   2. High resolution: full atom, with a more sophisticated energy function. Local search of conformational (and sequence) space.

Sampling strategies for backbone degrees of freedom
>Q8TDV5 GP119_HUMAN Homo sapiens
MESSFSFGVILAVLASLIIATNTLVAVAVLLLIHKNDGVSLCFTLNLAVADTLIGVAISGLLTDQLSSPSR
PTQKTLCSLRMAFVTSSAAASVLTVMLITFDRYLAIKQPFRYLKIMSGFVAGACIAGLWLVSYLIGFLPLG
IPMFQQTAYKGQCSFFAVFHPHFVLTLSCVGFFPAMLLFVFFYCDMLKIASMHSQQIRKMEHAGAMAGGYR
SPRTPSDFKALRTVSVLIGSFALSWTPFLITGI...
Example sequence window: VLTLSCVGF

Approximate local interactions using the distribution of conformations seen for similar sequences in known protein structures.
For each sequence window, select fragments that represent the conformations sampled during folding.
While not every protein fold is present in the Protein Data Bank, all possible conformations of small peptides are.
The majority of conformational sampling protocols in Rosetta use Metropolis Monte Carlo followed by gradient-based minimization.

Monte Carlo simulated annealing assembly of fragments
Statistically derived potential function:
   Steric overlap (vdW interactions)
   Residue environment (solvation)
   Pairwise interactions (electrostatics)
   Strand pairing (hydrogen bonding)
   Compactness (solvation)

Filter conformational ensemble: identify the best structure
Cluster models that maintain the same overall fold, e.g. Cα RMSD < 5 Å.
1) Remove very low contact order structures
2) Select broadest minima using cluster analysis
3) Select lowest energy structures with full atom potential

Simplified protein representation: one centroid per amino acid side chain.
High resolution potential energy function: full atom representation.
The free energy minimum corresponds (usually) to the native protein fold. Its depth is obscured because of the simplified energy approximation.

Computational strategy: ab initio protein folding
Rosetta begins with an extended peptide chain. Insertion of backbone fragments rapidly folds the protein.
Folding units: approximate local interactions using the distribution of conformations seen for similar sequences in known protein structures.
Sample conformational space using Monte Carlo simulations with a statistically derived potential function.
Cluster analysis: select broadest minima.
Refine models: full atomic potential energy function.

Simons K. T.; Kooperberg C.; Huang E.; Baker D. (1997) J. Mol. Biol. 268
Rohl C. A.; Strauss C. E.; Misura K. M.; Baker D. (2004) Methods Enzymol. 383
Simons K. T.; Ruczinski I.; Kooperberg C.; Fox B. A.; Bystroff C.; Baker D. (1999) Proteins 34
Bradley P.; Misura K. M.; Baker D. (2005) Science 309
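
The Metropolis acceptance rule from this slide can be illustrated with a short Python sketch. This is not Rosetta or PyRosetta code: the names `metropolis_accept`, `monte_carlo`, `toy_energy` and `toy_propose` are hypothetical, and the one-dimensional toy energy stands in for a real scoring function, with a random step standing in for a fragment insertion.

```python
import math
import random


def metropolis_accept(e_new, e_old, temperature, rng=random):
    """Metropolis criterion: always accept a lower energy; accept a
    higher (or equal) energy with probability exp(-(e_new - e_old) / T)."""
    if e_new < e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)


def monte_carlo(energy, propose, start, temperature, n_steps, seed=0):
    """Generic Metropolis Monte Carlo loop (illustrative only).

    energy:  function mapping a state to a float (assumed scoring function)
    propose: function (state, rng) -> new trial state (assumed move)
    Returns the lowest-energy state seen and its energy.
    """
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(n_steps):
        candidate = propose(current, rng)
        e_candidate = energy(candidate)
        if metropolis_accept(e_candidate, e_current, temperature, rng):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


def toy_energy(x):
    """Hypothetical one-dimensional energy with a single minimum at x = 2."""
    return (x - 2.0) ** 2


def toy_propose(x, rng):
    """Hypothetical move: a small random step."""
    return x + rng.uniform(-0.5, 0.5)


if __name__ == "__main__":
    best, e_best = monte_carlo(toy_energy, toy_propose, start=10.0,
                               temperature=1.0, n_steps=2000, seed=1)
    print(best, e_best)
```

Lowering `temperature` makes uphill moves rarer, which is the idea behind the simulated annealing mentioned on the slide; the sketch keeps the temperature fixed for simplicity.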

3 The resulting model undergoes atomic-detail refinement.
GPR119 AR interaction?

Brief introduction to Rosetta
Rosetta docking example

Mutational mapping of the AR ligand binding site in the GPR119 receptor

4 Rosetta docking protocol
Modelling the GPR119 receptor

Table 1: Sequence alignment of the β2-adrenergic receptor (b2ar) with GPR119 (Q8TDV5_GP119_HUMAN). ID=22%, pp=7.2

consensus      .E..#...I##.L#.L#I#...####.###.#.+...V..#F...LA#AD.##G#A#..#...#...#C.#...#...#.AS#.T###I.#DRY#AI..PF+Y..
b2ar       1   DEVWVVGMGIVMSLIVLAIVFGNVLVITAIAKFERLQTVTNYFITSLACADLVMGLAVVPFGAAHILMKMWTFGNFWCEFWTSIDVLCVTASIETLCVIAVDRYFAITSPFKYQS
GPR119     1   MESSFSFGVILAVLASLIIATNTLVAVAVLLLIHKNDGVSLCFTLNLAVADTLIGVAISGLLTDQLSSPSRPTQKTLCSLRMAFVTSSAAASVLTVMLITFDRYLAIKQPFRYLK

consensus      ##...A.##I#.#W#VS.L..FLP#.###...#...A...C.FF...#####...#..##P########...#..#...#...#.-#K
b2ar     116   LLTKNKARVIILMVWIVSGLTSFLPIQMHWYRATHQEAINCYAEETCCDFFT-NQAYAIASSIVSFYVPLVIMVFVYSRVFQEAKRQLNIFE FCLKEHK
GPR119   116   IMSGFVAGACIAGLWLVSYLIGFLPLGIPM FQQTAYKGQCSFFAVFHPHFVLTLSCVGFFPAMLLFVFFYCDMLKIASMHSQQIRKMEHAGAMAGGYRSPRTPSDFK

consensus      AL+T#.###G.F.L.W.PF#I..IV.V#...##...#..#L.##G#.NS.#NPLIY#...-#R#.#..###
b2ar           ALKTLGIIMGTFTLCWLPFFIVNIVHVIQDN-LIRKEVYILLNWIGYVNSGFNPLIYC-RSPDFRIAFQELLCL
GPR119         ALRTVSVLIGSFALSWTPFLITGIVQVACQECHLYLVLERYLWLLGVGNSLLNPLIYAYWQKEVRLQLYHMALGVKKVLTSFLLFLSARNCGPERPRESSCHIVTISSSEFDG

Weak sequence similarity to templates.
Lack of sequence similarity in loop regions.
Quality of homology model is questionable.

Rosetta docking protocol
Generate initial homology model.
Produce 1000 models using Rosetta relax.
   Resulting models sorted with respect to energy.
   Best scoring model had 2.7 Å to the TM domain of the b2ar template.
1000 different loop conformations were generated on the best model.
   1. Kinematic closure algorithm
   2. Disulphide bridge constraint between Cys in EXL2B and Cys III:01
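
As a worked illustration of the "ID=22%" figure quoted with Table 1, the sketch below computes percent sequence identity for a pair of already-aligned sequences. The function name `percent_identity` is hypothetical, and the definition used (identical residues divided by columns where neither sequence has a gap) is one common convention assumed here; the slides do not say which definition produced the 22% value. The short input strings are made-up examples, not rows of Table 1.

```python
def percent_identity(aligned_a, aligned_b, gap_chars="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and so
    have equal length, with gaps written as '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns containing a gap
        aligned += 1
        if a.upper() == b.upper():
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


if __name__ == "__main__":
    # Hypothetical aligned fragments: 3 identical residues in 8 gap-free columns.
    print(percent_identity("ALKTLG-IM", "ALRTVSVLI"))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives lower values when the alignment contains many gaps.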
Generate GPR119 homology model (b2ar template)
1000 GPR119 models were produced using Rosetta relax.
Cluster, filter and extract the top 10 best models.
Build and refine 100 loop conformations for each of the 10 best models: a total of 1000 models.

Generate ensemble of low energy conformations of AR
Low energy ligand (AR ) conformations were generated.

Low resolution docking mode
A total of 2000 docking trajectories of randomly picked ligand conformations were performed on each of the 1000 loop models (decoys).
Of the combinatorial solutions, the top 5% were selected based on total energy and receptor-ligand interaction energy.

Refine best docking poses
The 5 best docking poses were relaxed 5000 times to optimize loop structure and receptor-ligand packing: a total of 5000 docking trajectories were performed for each of the 5 best docking poses to relax and refine the receptor and ligand packing.

Figure panels: Pose 1, Pose 2, Pose 3, Pose 4, Pose 5. Poses in between helices lack agreement with experimental mapping.
Docking pose I: Refinement (Pose 1)
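
The selection step above (top 5% by total energy and receptor-ligand interaction energy, then the 5 best poses) can be sketched as follows. The slides do not state how the two energies were combined, so this sketch assumes one plausible scheme: keep the best fraction by total energy, then rank those by interaction energy. The `Decoy` record and `select_best_poses` function are hypothetical names, not part of Rosetta.

```python
from typing import List, NamedTuple


class Decoy(NamedTuple):
    """Hypothetical record for one docking decoy (lower energy is better)."""
    name: str
    total_energy: float
    interface_energy: float  # receptor-ligand interaction energy


def select_best_poses(decoys: List[Decoy], fraction: float = 0.05,
                      n_best: int = 5) -> List[Decoy]:
    """Assumed two-stage filter: top `fraction` by total energy,
    then the `n_best` of those by interaction energy."""
    if not decoys:
        return []
    n_keep = max(1, round(len(decoys) * fraction))
    by_total = sorted(decoys, key=lambda d: d.total_energy)[:n_keep]
    return sorted(by_total, key=lambda d: d.interface_energy)[:n_best]


if __name__ == "__main__":
    # Made-up energies for 100 decoys, only to exercise the function.
    decoys = [Decoy(f"decoy_{i}", float(i), float(-i)) for i in range(100)]
    for pose in select_best_poses(decoys, fraction=0.05, n_best=2):
        print(pose)
```

Filtering on total energy first discards decoys where a favourable interaction energy comes at the cost of a strained receptor; a combined or weighted score would be an equally reasonable alternative.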

5 Docking pose II: Refinement (Pose 2)
Docking pose III: Refinement (Pose 3)
Docking pose IV: Refinement (Pose 4)

6 Docking pose V: Refinement (Pose 5)
Refined GPR119 docking decoys: Docking pose III, Docking pose IV
Proposed binding pose of AR agonist to the GPR119 receptor
Binding pocket of AR in GPR119. Amino acids investigated by mutagenesis are coloured according to potency shift.