BELIEF PROPAGATION REVEALS ALLOSTERIC MECHANISMS IN PROTEINS Hetunandan Kamietty Computer Science Department, Carnegie Mellon Univerity, Pittburgh, PA 15213, USA Email: hetu@c.cmu.edu Arvind Ramanathan Lane Center for Computational Biology, Carnegie Mellon Univerity, Pittburgh, PA 15213, USA Email: aramanat@andrew.cmu.edu Chritopher Jame Langmead Computer Science Department, Carnegie Mellon Univerity, and Lane Center for Computational Biology, Carnegie Mellon Univerity, Pittburgh, PA 15213, USA Email: cjl@c.cmu.edu The phyical mechanim involved in alloteric regulation remain unclear. We preent a novel and efficient method for invetigating the propagation of regulatory ignal in protein tructure. Our approach utilize undirected graphical model to efficiently encode the Boltzmann ditribution over geometric configuration. Belief Propagation i then invoked to efficiently compute: (a) free energie and (b) alloteric coupling between dital reidue. We preent reult from two kind of experiment. Firt, we how that our method accurately predict change in free energy upon activation and/or mutation. Specifically, our method achieve a high correlation with experimentally determined G (R 2 = 0.90 for core reidue). Significantly, our method i capable of identifying thoe reidue experiencing the larget relative change in enthalpy and/or entropy. Second, we ue our method to tudy the alloteric behavior of cyclophilin A in enzyme catalyi. Our analyi reveal the alloteric coupling between reidue eparated by a much a 20 angtrom from the active ite. Thee reult correpond well with experimental meaurement. Our method require a few minute per protein, making it uitable for large-cale tudie. Taken together, thee reult ugget that our method provide an effective mean for invetigating alloteric regulation at the proteome cale. 1. INTRODUCTION Protein are inherently dynamic molecule 6. Increaing evidence from experiment a well a computational work ugget that thi inherent flexibility i largely reponible for a protein function. Several tudie indicate the preence of dital coupling between reidue, whereby an event uch a binding i tranmitted acro the entire protein to influence catalyi or ignaling (Fig. 1). The preence of uch alloteric coupling between dital ite naturally lead to the quetion if there are preferred pathway through which conformational change may propagate 5. At the preent time, there are very few method for elucidating thee pathway within protein tructure. Active ite Alloteric ite Alloteric effector Fig. 1. Alloteric Coupling The binding of an alloteric effector induce tructural change at the active ite, thu regulating the behavior of the protein. Correponding author.
2. METHODS We decribe here a novel approach to reveal mechanim of how conformational change at one ite can propagate to dital location along the protein tructure. Our approach i baed on a probabilitic modeling of the idechain conformational pace of a protein. A i common practice, we dicretize the conformational pace of each reidue type uing a rotamer library. If we ue R = R 1, R 2,..., R n to repreent the n variate random variable repreenting the ide-chain conformation of a protein backbone b, our approach encode P (R b) accurately and efficiently uing a Markov Random Field (MRF). The MRF encoding exploit the fact that energetic interaction fall off rapidly with ditance. Thi generic property of phyical ytem at thi cale lead to conditional independencie between atom with negligible interaction energie. Figure 2 illutrate a toy MRF for a 4-reidue protein fragment. The MRF i overlayed on the protein. Node correpond to random variable over rotameric tate. Given a MRF of a protein tructure, we can compute the marginal conditional probabilitie P (R i = r i b, e) of each rotameric tate r i for every reidue i of the protein uing Belief Propagation 12. Figure 2 illutrate thee marginal a hitogram. Our algorithm examine change in thee marginal in repone to conformational perturbation, like binding. In previou work, we have hown that thi probabilitic modeling accurately compute the global propertie of the conformational ditribution in protein 9 11 and protein complexe 7, 8. Moreover, running BP only take a few minute per tructure, making it ideally uited to performing ytematic perturbation tudie. 3. RESULTS We firt demontrate that our method accurately predict change in free energy upon mutation. A hown in Figure 3, our method achieve a high correlation between experimentally determined 4 and predicted G (R 2 = 0.9) for five point mutation in the core of eglin C (V13A, V14A, V34A, V54A, V62A). Fig. 3. Predicted G value how good agreement with experimental value. Inet how predicted G value for all mutant howing outlier which are olvent expoed. x 4 x 2 x 3 x 1 Fig. 2. A toy 4-reidue model. V18A V62A Fig. 4. Reidue in eglin c howing ignificant change in free-energy upon mutation. An important feature of our method i that we are able to compute the reidue-pecific change in free energy upon mutation. We ue thi information to identify
communication pathway. Figure 4 how the reult of two uch mutation. In each cae, the reidue highlighted in green i the reidue mutated, while red urface i ued to highlight ide-chain that are ignificantly perturbed by the mutation. Notice that a ingle mutation can induce change throughout the protein. Thi phenomenon i not unique to eglin C, nor doe it require mutation. Figure 5 illutrate that ignificant free energy change are detected upon ligation in region dital to the active ite in four unrelated protein. The inactive protein hown a gray cartoon. The activated protein i hown a a backbone trace, colored by the change in free energy. 1G16 1OIV 1. 1G16: Sec protein 2. 1OIV: Rab 11 3. 1XTQ: RHEB 4. 1ERK: MAP Kinae every poible rotamer r i to ae the nature of conformational coupling that i intrinic to the tructure. We then compared the change in the marginal probabilitie before and after the conformational change for every reidue uing the ymmetric KL divergence (KL ym (P (R j b, r i ), P (R j b, r j ))). For every reidue acro all conformation, we counted the number of reidue that howed ignificant deviation in their marginal (N a ) by defining a uitable threhold. Uing thi metric and the ditance of eparation (D) between the two reidue, we aeed if our method can ditinguih between the behavior of network reidue veru nonnetwork reidue. A hown in Table 1, network reidue (N) affect a larger proportion of non-neighbor reidue at longer ditance than non-network reidue (non-n). The ability of network reidue to influence the conformational tate of dital reidue i tatitically ignificant acro tructure, a well in each individual tructure. 1XTQ 1ERK Table 1. Summary of network (N) and non-network (non-n) reidue behavior acro all pecie of CypA. µ and σ repreent the mean and tandard deviation of the repective clae of reidue in each of the ix categorie (a) Hydrophobic/ Polar (H/P), (b) Number of affected reidue (N a), (c) average ditance to affected reidue (Avg(D)), (d) maximum ditance to affected reidue (Max(D)) (e) Number of neighbor at 10 angtrom cut-off (N n) and (f) ratio of (c) and (e). µ(n) σ(n) µ(non-n) σ(non-n) p-value Fig. 5. Four alloteric protein. 3.1. Identifying Alloteric Coupling In thi ection, we examine in detail the alloteric coupling network revealed by belief propagation in cyclophilin A (cypa) from 3 different pecie (H. apien, B. tauru, P. yeolii). CypA i an important receptor for everal immunouppreive drug and HIV infection. Experimental work 3 and molecular dynamic (MD) imulation 1, 2 have identified a network of coupled motion that influence the ubtrate iomerization proce. Thee network conit of 27 reidue exhibiting correlated motion, extending from the flexible urface region all the way to the active ite of the enzyme. The tudy provide u an ideal teting ground ince the network have been extenively characterized acro multiple pecie and multiple ubtrate. For every reidue i in each tructure of cypa, we ytematically fixed it idechain conformation to H/P 0.637 0.4826 0.4085 0.492 0.001 N a 14.3926 5.7732 9.6285 5.1538 0.001 Avg(D) 8.4333 1.1245 7.8289 1.4344 0.001 Max(D) 13.8646 3.181 12.2153 3.5086 0.001 N n 21.8815 4.6857 18.0352 4.9721 0.001 N a/n n 0.656 0.2357 0.5367 0.2543 0.001 Next we examined whether conformational change within the network reidue affected imilar et of reidue acro different pecie. For thi, we compared the interection of reidue affected by the change in conformation of every network reidue pair eparated by at leat 10 angtrom. We oberved that irrepective of the pecie or ubtrate bound, the network reidue affect the conformation of pecific reidue located acro the entire protein. Thee reidue are located on dynamically coupled region in CypA affecting the catalytic proce a oberved in previou tudie. For uitable control, we conidered thoe reidue that were not part of the network yet conerved acro different pecie. Thee reidue are located proximal to
H. apien B. tauru P. yeolli Fig. 6. Conervation of Alloteric pathway in cyclophilin A (CypA). Top row: Pathway detected via belief propagation analyi from F22-F113. Bottom row: Pathway detected via belief propagation analyi from F22-R55. the network reidue both tructurally and equentially. We oberved that although the average ditance between the network and non-network pair of reidue were imilar, the non-network reidue did not affect ditally eparated reidue, in agreement with previou tudie regarding coupled motion in CypA. Thi indicate a clear bia in the nature of conformational coupling exhibited by network reidue to thoe of the non-network reidue. Finally, a careful inpection of the reidue affected by change in the conformation of the network alo reveal imilar pathway of conformational connectivity acro multiple pecie. Example pathway are hown in Fig. 6. The top row left figure how the pathway between F22 and F113 (human). The pathway between the correponding reidue in cow and P. yoelii are hown in the top middle and right figure. F22 i in the core of the protein, and F113 i on the urface and i where ubtrate binding happen 3. Thee two reidue are eparated by over 12 angtrom. The belief propagation analyi reveal that thee reidue are connected via a imilar et of hydrophobic interaction, irrepective of equence homology. Additionally, thi pathway exhibit a high correlation in term of coupled motion a oberved from MD imulation 2. The bottom row how conerved pathway from F22 to a different active reidue (R55). Thi pathway i mediated by hydrogen bond and hydrophobic interaction. Once again, the belief propagation analyi reveal a conervation in the pathway. 4. CONCLUSIONS We have preented a novel approach to reveal mechanim of dital conformational coupling within protein tructure. The approach i phyically baed a it incorporate tandard molecular mechanical force field for computing internal energie and compute a rigorou approximation to the entropic contribution to the free energy. Our reult ugget that belief propagation can be ued to identify network of reidue that repond to variou perturbation (mutation, binding, etc) in an efficient manner. Thee network appear in a variety of protein and our experiment on cypa ugget that they may be conerved to ome degree acro pecie. We believe that by uing our approach, one may predict mechanim of energy tranfer between different part of the protein and analyze alloteric regulation in protein tructure. The availability of uch accurate and efficient method to undertand protein function could be of ignificant ue to biologit wanting to undertand protein-protein interaction network at a tructural level. 5. ONGOING WORK We are preently puruing a number of extenion to thi preliminary tudy. Firt, we have recently developed a MRF capable of modeling both backbone and idechain flexibility 7, 8. We are preently re-running thee experiment to examine the role that backbone flexibility play in ignal tranduction. Second, we are intereted in the phyical interpretation of meage paing algorithm. It i known that the fixpoint of the belief propagation algorithm correpond to thoe of well-tudied free energy-approximation 13. However, the phyical interpretation of the actual meage ued in belief propagation i not well undertood. Finally, we are examining a collection of well-characterized mutant to characterize our method ability to predict which mutation affect the coupling between reidue.
Reference 1. P. K. Agarwal. Ci/tran iomerization in hiv-1 capid protein catalyzed by cyclophilin a: Inight from computational and theoretical tudie. Protein: Struct., Funct., Bioinformatic, 56:449 463, 2004. 2. P. K. Agarwal, A. Geit, and A. Gorin. Protein dynamic and enzymatic catalyi: Invetigating the peptidyl-prolyl ci-tran iomerization activity of cyclophilin a. Biochemitry, 43(33):10605 10618, 2004. 3. Daryl A. Boco, Elan Z. Eienmeer, Suan Pochapky, Weley I. Sundquit, and Dorothee Kern. Catalyi of ci/tran iomerization in native hiv-1 capid by human cyclophilin a. Proc. Natl. Acad. Sci. USA, 99(8):5247 5252, 2002. 4. Michael W. Clarkon, Steven A. Gilmore, Marhall H. Edgell, and Andrew L. Lee. Dynamic coupling and alloteric behavior in a nonalloteric proteinx2020;. Biochemitry, 45(25):7693 7699, 2006. 5. Nina M. Goody and Stephen J. Benkovic. Alloteric regulation and catalyi emerge via a common route. Nat. Chem. Biol., 4(8):474 482, 2008. 6. K. Henzler-Wildman and D. Kern. Dynamic peronalitie of protein. Nature, 450:964 972, 2007. 7. H. Kamietty, C. Bailey-Kellogg, and C.J. Langmead. A graphical model approach for predicting free energie of aociation for protein-protein interaction under backbone and ide-chain flexibility. Technical Report CMU-CS-08-162, Carnegie Mellon Univerity, 2008. 8. H. Kamietty, C. Bailey-Kellogg, and C.J. Langmead. A Graphical Model Approach for Predicting Free Energie of Aociation for Protein-Protein Interaction Under Backbone and Side-Chain Flexibility. In Proc. Structural Bioinformatic and Computational Biophyic (3DSIG), page 67 68, 2009. 9. H. Kamietty, B. Ghoh, C. Bailey-Kellogg, and C.J. Langmead. Modeling and Inference of Sequence-Structure Specificity. In Proc. of the 8th International Conference on Computational Sytem Bioinformatic (CSB), page 91 101, 2009. 10. H. Kamietty, E. P. Xing, and C. J. Langmead. Free energy etimate of all-atom protein tructure uing generalized belief propagation. J. Comp. Biol., 15(7):755 766, September 2008. 11. H. Kamietty, E.P. Xing, and C.J. Langmead. Free Energy Etimate of All-atom Protein Structure Uing Generalized Belief Propagation. In Proc. 7th Ann. Intl. Conf. on Reearch in Comput. Biol. (RECOMB), page 366 380, 2007. 12. J. Pearl. Fuion, propagation, and tructuring in belief network. Artif. Intell., 29(3):241 288, 1986. 13. J.S. Yedidia, W.T. Freeman, and Wei Y. Contructing free-energy approximation and generalized belief propagation algorithm. IEEE Tranaction on Information Theory, 51:2282 2312, 2005.