MULTIPLE SUBSTRATE KINETICS OF RIBONUCLEASE P: RELATIVE RATE CONSTANT DETERMINATION THROUGH INTERNAL COMPETITION LINDSAY ELYSE YANDEK


MULTIPLE SUBSTRATE KINETICS OF RIBONUCLEASE P: RELATIVE RATE CONSTANT DETERMINATION THROUGH INTERNAL COMPETITION

by LINDSAY ELYSE YANDEK

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Dissertation Advisor: Dr. Michael Harris

Department of Biochemistry
CASE WESTERN RESERVE UNIVERSITY

August 2013

CASE WESTERN RESERVE UNIVERSITY SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of Lindsay E. Yandek, candidate for the Ph.D. degree*.

(signed) William Merrick (chair of the committee)
Michael Harris
Eckhard Jankowsky
Pieter deHaseth
Blanton Tolbert

(date) 13 May 2013

*We also certify that written approval has been obtained for any proprietary material contained therein.

Table of Contents

List of Tables
List of Figures
Acknowledgements
Abstract
Chapter 1: Introduction
    RNase P and pre-tRNA interactions
    E. coli transfer RNA
    Apparent Kinetic Mechanism
    Uniformity and thermodynamic compensation in substrate binding by RNase P
    Facing the biological context
Chapter 2: Molecular Recognition in tRNA Processing by the RNase P Ribonucleoprotein
    Results and Discussion
    Comparison of the multiple turnover kinetics of pre-tRNA MET82 and pre-tRNA METf47 processing by E. coli RNase P
    Pre-steady state kinetic analyses to evaluate the reaction step that limits V
    Single turnover kinetics to evaluate the reaction step that is rate limiting for V/K
    Competitive alternative substrate kinetics of pre-tRNA MET82 and pre-tRNA METf47 processing by RNase P
    Determination of relative rate constants for pre-tRNAs in complex substrate populations by internal competition
Chapter 3: Simultaneous Determination of Processing Rate Constants for All Individual RNA Species Processed by RNase P
    Experimental Design
    Results and Discussion
    Competitive Multiple Turnover Kinetics of the Randomized Pool
    Effects of N(-3) to N(-8) randomization and N(-1) to N(-6) randomization on the kinetics of processing of these populations by RNase P
    Competitive Multiple Turnover Reactions of L1-L5 against pre-tRNA MET82 +2 Shortened Sequence
Chapter 4: Discussion
    Comparison of the multiple turnover kinetic schemes of representative canonical and non-canonical pre-tRNA substrates
    Testing the fundamental features of a simple alternative substrate kinetic model for RNase P processing of multiple pre-tRNAs in vitro
    Comparison of relative V/K values for selected alternative substrates
    Parallels between RNase P processing and alternative substrate recognition by other enzymes
    Application of internal competition kinetics to the analysis of complex substrate populations
    Validation of results from HTS-KIN reveals apparent effects of kinetic mechanism on sensitivity to substrate variation
Chapter 5: Future Directions
    Single turnover HTS-KIN kinetics comparing two different +21 extended sequence of the randomized pool in the N(-1) to N(-6) position
    Competitive Multiple Turnover Reactions Varying Mg2+ Concentrations
    Explore possibility of in vivo experiments
Chapter 6: Experimental Procedures
Appendix: Expansion on the mathematical properties of alternative substrate kinetics
References

List of Tables

Table 2-1. Multiple turnover rate constants for processing of pre-tRNA MET82 and pre-tRNA METf47 by RNase P
Table 2-2. Single turnover rate constants for processing of pre-tRNA MET82 and pre-tRNA METf47 by RNase P
Table 3-2. Multiple turnover data for randomized pre-tRNA MET82 against RNase P
Table 3-3. Non-competitive multiple turnover data for L1-L5 against RNase P Holoenzyme
Table 3-4. Single turnover reaction rates of uniform pre-tRNA controls and two different randomized pools of pre-tRNAs
Table 3-5. r_k values determined for L1-L5+21 variants to pre-tRNA MET
Table 3-6. r_k values determined for L1-L5 substrates lacking the additional +21 leader nucleotides relative to pre-tRNA MET

List of Figures

Figure 1-1. Pre-tRNA cleavage by RNase P Holoenzyme
Figure 1-2. Schematic of E. coli RNase P
Figure 1-3. Secondary structure representation of E. coli RNase P
Figure 1-4. RNase P and pre-tRNA interactions at the cleavage site
Figure 2-1. Secondary structure and sequence conservation of E. coli pre-tRNAs
Figure 2-2. Secondary structure and sequence of representative pre-tRNAs
Figure 2-3. Multiple turnover and pre-steady state kinetics of pre-tRNA MET82 and pre-tRNA METf47 by RNase P
Figure 2-4. Progress curve analysis of pre-tRNA MET82 and pre-tRNA METf47 multiple turnover kinetics
Figure 2-5. Single turnover kinetics of pre-tRNA MET82 and pre-tRNA METf47 processing by RNase P
Figure 2-6. Competitive multiple turnover reactions containing both pre-tRNA METf47 and pre-tRNA MET
Figure 2-7. Analysis of the relative rate constant for processing of pre-tRNA LEU76 by internal competition
Figure 2-8. Determination of relative rate constants for pre-tRNA SER80 cleavage and miscleavage by internal competition
Figure 2-9. Histogram of r_k values for different pre-tRNA substrates
Figure 3-1. Crystal structure of the RNase P and leader sequence interactions
Figure 3-2. Hydroxyl radical protection analysis of pre-tRNA binding to E. coli RNase P
Figure 3-3. The randomized leader sequence of pre-tRNA MET82
Figure 3-4. Difference between random pool and single substrate multiple turnover kinetics
Figure 3-5. Histogram of distribution of random population in comparison to pre-tRNA MET82
Figure 3-6. The sequence logo of fastest sequences analyzed through HTS-KIN

Acknowledgements

For the successful completion of my tenure here at Case Western Reserve University, I owe a great many thanks to a number of individuals, including mentors, colleagues, friends and family.

I owe Dr. Michael Harris a world of thanks for being an incredible advisor and mentor. Through his guidance I have grown as a scientist in ways I never thought possible. I will be eternally grateful to him for bringing me into his laboratory, helping me to develop my independent thinking, and allowing me to see all aspects of what it takes to be a successful scientist, which I think he has accomplished quite extraordinarily.

When I joined the Harris Laboratory, it was Mike, Dr. Frank Campbell, Dr. Eric Christian and me. I could not have asked for a better introduction to my Ph.D. studies. These men, all accomplished scientists, patiently taught me the ropes and spent endless hours talking with me about science, politics and life. Even though we have gone our separate ways, I think of those years as the fondest of my graduate career, in large part because of them, and I will be forever grateful for all their advice and ideas.

I would like to acknowledge the Biochemistry Department, which has provided me with a wonderful education and learning environment. I am very grateful to my committee members, Drs. Eckhard Jankowsky, Pieter deHaseth and Blanton Tolbert, for participating on my committee and spending time giving thoughtful feedback on my project. I would like to give a special thank you to my committee chair, Dr. William Merrick, for not only participating on my committee, but also being a support system to me and all of the graduate students by acting as our always supportive and helpful academic advisor.

Last but not least, I would like to thank my family and friends, who have supported me tirelessly. From my friends at Case Western Reserve University who are now spread far and wide pursuing their scientific careers, to my non-science friends, thank you for countless hours of commiserating, having scientific discussions and, of course, beer drinking and department gossip. I am extremely lucky to have a sister, Amy, who is pursuing a doctoral degree at the same time. Although our degrees are in very different fields, we can always talk to each other about our successes and roadblocks with understanding. I have been very lucky to have such a solid support system in Kevin Reilly. Without his tireless patience and understanding I do not know if I could have made it through this. And of course, I would like to thank my parents, Edward and Catherine Yandek, who have been with me every step of the way, offering endless support through my 25 years of schooling. Without the love and support they have always provided I would never be where I am today.

Multiple Substrate Kinetics of Ribonuclease P: Relative Rate Constant Determination through Internal Competition

Abstract

By LINDSAY ELYSE YANDEK

A single enzyme, ribonuclease P (RNase P), processes all precursor tRNA (pre-tRNA) in cells and organelles that carry out tRNA biosynthesis. This substrate population includes over 80 different competing pre-tRNAs in Escherichia coli. While the reaction kinetics and molecular recognition of a few individual model substrates of bacterial RNase P have been well described, the competitive substrate kinetics of the enzyme are comparatively unexplored. To understand the factors that determine how different pre-tRNA substrates compete for processing by E. coli RNase P, we compared the steady state reaction kinetics of two pre-tRNAs that differ at sequences that are contacted by the enzyme. For both pre-tRNAs, we demonstrated that substrate cleavage is fast relative to dissociation such that there is a large commitment to catalysis. As a consequence, V/K, the rate constant for the reaction at limiting substrate concentrations, reflects the substrate association step for both pre-tRNAs. Reactions containing two or more pre-tRNAs follow simple competitive alternative substrate kinetics in which the relative rates of processing are determined by pre-tRNA concentration and their relative V/K values. The relative V/K values for eight different pre-tRNAs, selected to represent the range of structure variation at sites contacted by RNase P, were determined by internal competition in reactions in which all eight substrates are present simultaneously. The results reveal a relatively narrow range of V/K values, suggesting that rates of pre-tRNA processing by RNase P are tuned for uniform specificity and consequently optimal coupling to precursor biosynthesis.

To evaluate this further, we developed a new, scalable method to study larger and more complex populations of pre-tRNAs. We examined all sequence variants in the cognate site of the RNA substrate for the C5 protein subunit of E. coli RNase P, which binds pre-tRNA leaders non-specifically. A high-throughput sequencing kinetics approach (HTS-KIN) reveals pronounced discrimination by C5 between sequence variants, determinants for discrimination that cannot be delineated by analysis of cellular substrates, and a distribution of substrate affinities of C5 that is very similar to that of specific DNA binding proteins.

Chapter 1: Introduction

Ribonucleoproteins (RNPs) are some of the most highly conserved and important catalysts in biology. Ribonuclease P (RNase P) is an essential RNP enzyme that is responsible for catalyzing the maturation of the 5' end of transfer RNAs (tRNAs) through site-specific hydrolysis of a phosphodiester bond in precursor tRNAs (pre-tRNAs) (1). The products resulting from the reaction are a tRNA with a mature 5' end and an RNA corresponding to the pre-tRNA 5' leader sequence (2,3,4). Although the P RNA subunit is catalytic, RNase P differs from other ribozymes in two important ways. First, its biological role is to perform multiple turnover reactions, whereas other ribozymes typically undergo single turnover self-splicing or self-cleavage. The only other ribozymes known to have this ability are the ribosome and, potentially, the spliceosome (1). Second, RNase P has the ability to process multiple RNA substrates, including all pre-tRNAs in the cell. However, the basis for RNase P substrate specificity is not well understood; therefore, a better understanding of steady-state reaction kinetics for different substrates is essential.

Since the discovery of the catalytic property of the RNA subunit, most detailed kinetic studies have limited their focus to reactions involving the RNA alone, and consequently, significantly less is known regarding the multiple turnover kinetics of the biologically relevant ribonucleoprotein holoenzyme. Because RNase P functions as the holoenzyme in vivo, an understanding of the kinetics of this enzyme form is essential for understanding its function in tRNA processing. Recent evidence indicates that RNase P binds pre-tRNAs with uniform affinity and cleaves them with an essentially identical rate constant (5).

Extensive previous studies of essential RNPs such as the ribosome, snRNPs, and RNase P give us a detailed understanding of the structures of the free and bound RNA and proteins, and of the function and regulation of the assembled complexes. Much less is known about the structural dynamics, kinetics, and thermodynamics that underlie molecular recognition of complex substrates, and these are the areas of current interest in the field. For that reason, my goal has been to study RNase P structure and function in a manner that will provide a better understanding of the cooperative function of the P RNA and P protein subunits in molecular recognition by the RNase P holoenzyme. Learning how the RNA and protein subunits function and coordinate activities in enzyme specificity and catalysis is necessary for achieving an accurate understanding of their function in normal cell growth and development as well as in human disease.

Abbreviations used are: RNase P, ribonuclease P; tRNA, transfer RNA; pre-tRNA, precursor tRNA; HTS-KIN, high-throughput sequencing kinetics.

RNase P and pre-tRNA interactions - RNase P is a pH- and metal-dependent enzyme responsible for catalyzing the maturation of the 5' end of pre-tRNAs (Fig. 1-1). It has been shown that the protein subunit enhances cleavage rates when metal and substrate are subsaturating (6-8). While many RNase P holoenzymes consist of more than one protein subunit, bacterial RNase P has a simple configuration, consisting of one protein and one RNA subunit. We study E. coli RNase P (Fig. 1-2), which consists of a single RNA (~400 nt) and a single small protein subunit (~100 aa) and is easily reconstituted in vitro. For this reason, studying E. coli RNase P allows for a more precise and simplified study of each subunit and its role in pre-tRNA processing.

Figure 1-1. A) Pre-tRNA cleavage by RNase P holoenzyme. On the left is a cartoon of a generic pre-tRNA, and on the right are the products of the phosphodiester bond cleavage, a 5' precursor sequence and a tRNA with a mature 5' end. The red sphere represents the cleavage site. B) The proposed mechanism of the reaction based on biochemical studies. On the left is a putative structure of the transition state of the reaction. On the right are the reaction products (1). Figure permission license number:

Figure 1-2. Schematic of E. coli RNase P as adapted by Dr. Michael Harris. The red portion represents the small protein subunit. The green represents the pre-tRNA, with the dashed line representing the leader sequence. The gold star represents the cleavage site.

In bacteria, RNase P secondary structure falls into two classes: the more common A type, of which E. coli is an example, and the B type, which is found only in gram-positive bacteria (9-12) (Fig. 1-3). The tertiary structure of bacterial RNase P RNA is formed by coaxially stacked helical domains which are stabilized by long range docking interactions (13,14). Bacterial RNase P has its most evolutionarily conserved sequences in the substrate binding pocket, while the periphery of the RNA contains the most variation between species (13,15). Even though there is significant sequence and structure variation between the A type and B type RNA, both have a single, essential and homologous protein subunit. The presence of the protein subunit in E. coli RNase P stabilizes the tertiary structure of the corresponding RNA and decreases the Mg2+ dependence (16-20).

There are currently three structures of the protein component of bacterial RNase P, one of the A type and two of the B type, which have strikingly similar crystal structures despite their overall very different sequences. This is not especially surprising, as early studies showed that the protein subunits are interchangeable between species, which is thought to be due in large part to their highly conserved hydrophobic core (12,21). However, the structure of the protein alone provides limited information on the mechanism without the accompanying RNA component (22-24). The three dimensional structure of the RNA component was initially investigated on a smaller scale, and then by domain (25-28), but more recently the structure of the entire RNA component was solved (13,14).

There are two distinct structural domains in bacterial RNase P RNA that are responsible for two different functions. The specificity (S) domain recognizes the TΨC-loop of pre-tRNA, while the catalytic (C) domain recognizes the acceptor stem and the 3' CCA and catalyzes the hydrolysis of the 5' leader (13,14). The protein itself interacts primarily with the 5' leader sequence of pre-tRNA that is eventually removed to make a mature tRNA. These discoveries have been very useful in elucidating the RNase P mechanism, but ideally we would have a structure of the holoenzyme bound to pre-tRNA, and to date this has not been accomplished.

Figure 1-3. A secondary structure representation of type A E. coli RNase P. Paired regions are labeled as P1, P2, etc. A linker joining two helices is labeled J11/12, and the loop capping P18 is called L18 (29).

E. coli Transfer RNA - Transfer RNA is a small RNA molecule essential for protein synthesis in all organisms, where it serves as the adaptor between the genetic code contained in the mRNA and the peptidyl transfer center of the ribosome. In bacterial cells, tRNA accounts for 20% of the total RNA. Transfer RNA steady-state levels in bacteria are necessarily a function of four processes: transcription of tRNA genes, processing of tRNA precursors, degradation of tRNA precursors, and degradation of mature tRNAs (30). In E. coli there are 87 genes encoding tRNAs, and RNase P has to process all of them.

Many experiments have been done using P RNA alone to understand the principles that govern the molecular recognition essential for pre-tRNA processing. These experiments have led to the elucidation of recognition elements in the pre-tRNA sequence and structure that can greatly affect processing efficiency. The recognition elements that are near the cleavage site are a 3' RCCA sequence, a G(+1)/C(+72) as the first base pair in the acceptor stem (cleavage takes place between N(-1) and N(+1)), and U(-1) (Fig. 1-4). Another set of recognition elements present in pre-tRNAs, which interacts with the substrate binding domain in P RNA, is the 2'-OH groups in the T stem-loop. Changes in all of the recognition elements have an effect on cleavage efficiency except the 2'-OH groups, which only appear to affect binding efficiency. Roughly half of all E. coli pre-tRNAs have these recognition elements (consensus pre-tRNAs), whereas the others lack one or more of these elements (non-consensus pre-tRNAs). Mutations in individual recognition elements greatly affect cleavage by P RNA alone in vitro, but with respect to the holoenzyme, the binding affinities and cleavage rates of both consensus and non-consensus substrates are essentially uniform, showing that the holoenzyme is not sensitive to changes in these elements of pre-tRNA sequence or structure (5,31). Understanding how this uniformity is achieved is important in order to better understand how RNase P processes pre-tRNAs in a biological context.

RNA production, of which tRNA is a major component, is an essential process for successful growth in all cells. In E. coli cells, the production of tRNA is regulated by the stringent response, which senses an increase of uncharged tRNA and negatively regulates the initiation of transcription of tRNA operons (32). E. coli has 87 tRNA genes coding for the different amino acids; however, the steady state distribution of these tRNA species is not uniform (2). The tRNAs that are present at higher concentrations are those that recognize the most commonly used codons for highly expressed proteins, with the overall abundance being largely due to gene copy number (33,34). This correspondence of codon usage and tRNA abundance is believed to increase translational efficiency and therefore growth rates of the organism (35).

Figure 1-4. RNase P and pre-tRNA interactions at the cleavage site indicating the recognition elements. A) Cartoon diagram of the RNase P enzyme substrate complex. P RNA is depicted as a series of cylinders that indicate the positions of individual helices. The C5 protein subunit is shown as a sphere of approximate size relative to that of P RNA. The pre-tRNA substrate is shown as a black ribbon, with the leader sequence shown as a broken line. The site of processing by RNase P is indicated by an arrow. The nucleobase residues that are identified as important determinants for substrate recognition are shown as circles. B) Details of the interactions between P RNA and the pre-tRNA cleavage site. The 5' R(73)C(74)C(75) 3' sequence located at the 3' terminus of pre-tRNA pairs with G292, G293, and U294 in the L15 region of P RNA; this interaction helps to align the correct phosphodiester bond in the active site. A pyrimidine (predominantly a U) is present 5' to the cleavage site in about 80% of E. coli pre-tRNAs, and this residue contacts A248 in the J5/15 region of P RNA. Figure from Sun et al. (36).

Kinetic Mechanism of RNase P - RNase P recognizes and processes many different substrates, and, as introduced above, its native substrate, pre-tRNA, varies widely, including in the sequence and structure around the cleavage site. Both protein and RNA components are needed in vivo, but in vitro the P RNA can catalyze RNA cleavage in the absence of the protein under conditions of high salt and divalent ion concentrations. RNase P was one of the first ribozymes discovered, and initial studies focused almost exclusively on the P RNA alone. These early experiments showed that the P RNA reaction is pH-dependent and divalent metal ion-dependent, and that substrate binding affinity is increased by increasing the salt concentration, in addition to depending on the structural recognition elements of the pre-tRNA mentioned above. The majority of the steady-state experiments showed tRNA product release, and not pre-tRNA cleavage, to be the rate limiting step (1,6-8,37). While these results were revealing, mechanistically these studies were limited because the reaction steps other than the rate limiting one are not visible in the steady state. Because of this limitation, pre-steady state kinetics have been integral to acquiring more detailed information on the kinetic and catalytic mechanism of RNase P. Steady-state and pre-steady-state kinetics have recently been carried out not only on the P RNA subunit but also on the holoenzyme, with the latter experiments showing that the C5 protein enhances the rate constant for catalysis in addition to increasing pre-tRNA binding affinity (2,5,19,36,38).

In order to understand the recognition of different pre-tRNAs by RNase P, a simple kinetic model is applied that allows the basic kinetic parameters for steady state and pre-steady state kinetics to be quantitatively related. The current model is a two-step mechanism for substrate binding (Scheme 1-1). This scheme is supported by data and previous observations from our laboratory and others, and includes the adjustment of thermodynamic contributions of substrate interactions to catalysis through threshold effects. In this scheme, substrate binding begins with a low affinity complex (ES) resulting from initial enzyme and substrate collision, followed by a conformational change or docking step in which the substrate recognition elements are contacted (ES*). Previous steady-state and transient kinetic experiments with 5'-[32P]-end-labeled substrates from our laboratory and others determined that the cleavage step (kc) is irreversible (5,31,36,39-41). A more minimal scheme (Scheme 1-2), which compresses the two-step binding mechanism into a single equilibrium, can fit most of the same data and is more useful for quantitatively comparing RNase P holoenzyme and P RNA kinetics, as well as the kinetics of RNase P components from different species. Henceforth we will apply the convention of using V and V/K as the fundamental multiple turnover kinetic parameters, where V is the rate constant for reaction of ES to form products and regenerate free enzyme and is equivalent to kcat, whereas V/K is the second order rate constant at limiting substrate concentrations, equivalent to kcat/Km.
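The meaning of V and V/K can be made concrete with a short numerical sketch. The function below simply rewrites the Michaelis-Menten equation in terms of these two parameters, using K = V/(V/K). The function name and the parameter values are illustrative assumptions and are not taken from the measurements reported in this work.

```python
def rate_per_enzyme(s, V, V_over_K):
    """Steady-state rate per total enzyme (s^-1) at substrate concentration s (M).

    Michaelis-Menten written with V (kcat, s^-1) and V/K (kcat/Km, M^-1 s^-1):
        v/[E]total = V * (V/K) * [S] / (V + (V/K) * [S])
    """
    return V * V_over_K * s / (V + V_over_K * s)


# Hypothetical parameter values, chosen only for illustration.
V = 1.0            # s^-1
V_over_K = 1.0e7   # M^-1 s^-1
K = V / V_over_K   # M, the substrate concentration giving half of V

low = rate_per_enzyme(1.0e-10, V, V_over_K)   # [S] << K: rate ~ (V/K)[S]
half = rate_per_enzyme(K, V, V_over_K)        # [S] = K: rate = V/2
high = rate_per_enzyme(1.0e-3, V, V_over_K)   # [S] >> K: rate ~ V
print(low, half, high)
```

At limiting substrate the rate depends only on V/K, and at saturating substrate it depends only on V, which is why the two parameters can report on different steps of the reaction.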

Scheme 1-1. Two-step RNase P substrate association mechanism. E (RNase P); S (pre-tRNA); ES (RNase P*pre-tRNA complex); EP (RNase P*leader*mature tRNA); P (5' leader and mature tRNA). Substrate binding begins with a low affinity complex (ES) resulting from initial enzyme and substrate collision, followed by a conformational change or docking step in which the substrate recognition elements are contacted (ES*). The first two steps are considered to be in equilibrium (K), whereas the cleavage step and product release are essentially irreversible.

Scheme 1-2. Minimal kinetic scheme for pre-tRNA cleavage by RNase P. This simplified scheme, which includes only initial binding, cleavage and product release, is sufficient to fit most kinetic data. It is a more suitable scheme for comparing and contrasting the mechanisms of different substrates.
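As a minimal sketch of how a scheme of this kind connects microscopic rate constants to V and V/K, the code below applies the standard steady-state solution for E + S <-> ES -> EP -> E + P, with irreversible cleavage and product release. The rate constant names (k_on, k_off, k_c, k_rel) and all numerical values are assumptions introduced for illustration; they are not the values determined in this work.

```python
def steady_state_parameters(k_on, k_off, k_c, k_rel):
    """Return (V, V/K, commitment) for E + S <-> ES -> EP -> E + P.

    k_on  : association rate constant (M^-1 s^-1)
    k_off : dissociation rate constant of ES (s^-1)
    k_c   : cleavage rate constant, irreversible (s^-1)
    k_rel : product release rate constant, irreversible (s^-1)
    """
    V = k_c * k_rel / (k_c + k_rel)           # limited by the slower of the two steps
    V_over_K = k_on * k_c / (k_off + k_c)     # reports on steps up to and including cleavage
    commitment = k_c / k_off                  # partitioning of ES forward versus back
    return V, V_over_K, commitment


# Hypothetical values: fast cleavage, slow dissociation, slow product release.
V, V_over_K, commitment = steady_state_parameters(
    k_on=1.0e8, k_off=0.01, k_c=10.0, k_rel=0.1
)
print(V, V_over_K, commitment)
```

With these assumed values, the commitment to catalysis is large, so V/K is close to k_on, while V is close to k_rel. This mirrors the qualitative picture described in the text, in which V/K reflects association and product release limits turnover.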

Uniformity and thermodynamic compensation in substrate binding by RNase P - As mentioned above, the absence of one or more consensus recognition elements decreases processing rates by the catalytic P RNA subunit in vitro. The presence of these elements varies among pre-tRNAs, and it has often been proposed, based on studies of the P RNA alone, that these differences might influence the steady state abundance of individual tRNAs via effects on RNase P processing (42-44). However, recent holoenzyme analyses from our laboratory suggest otherwise. Quantitative kinetic and equilibrium binding analyses of the molecular recognition properties of the reconstituted RNase P holoenzyme suggest that it has evolved to be insensitive to variation in pre-tRNA sequence and structure. Previously, the holoenzyme had been proposed to be sensitive to changes in pre-tRNA structure like the P RNA alone (5,36). Holoenzyme experiments have shown uniformity in binding affinity and catalysis with pre-tRNAs that vary significantly in the cognate recognition elements at the cleavage site (5,36,45). An interpretation of this result is that RNase P should optimally process the 5' ends of all pre-tRNAs at a high, uniform rate. This would link the overall rate of biosynthesis to transcription (46).

Some new observations are that weak tRNA binding may be compensated by tight leader sequence binding (4) and that the thermodynamic contributions of different contacts are non-additive (4,5,36) and appear to be controlled by threshold effects leading to uniformity. These observations raise two questions: what are the optimal leader sequences for a particular substrate, and how are variations in non-consensus sequences accommodated? Insight into these questions is provided by the identification of the first sequence specific contact between the leader sequence and the protein subunit (47). We suggest that, since pre-tRNAs have multiple recognition elements, variations at individual contacts may be tolerated due to thermodynamic coupling between the remaining contacts, reducing the apparent contribution of individual contacts to binding when multiple interactions are present and resulting in apparently uniform binding and catalysis for all substrates (2,4,5,36). With the information presented above and our current understanding of RNase P association kinetics and bacterial tRNA biosynthesis, we hypothesize that uniformity in multiple substrate recognition is due to different thermodynamic contributions from C5 protein interactions with pre-tRNA leader sequences. This sets the stage for us to determine the mechanistic basis for multiple substrate recognition through comparison of rate constants from our simple association model for a number of substrates that vary in sequence.

Facing the Biological Context - A large part of the substrate recognition studies of RNase P have focused on dissecting the determinants for high binding affinity and fast cleavage rate by changing the structure of a model substrate (44,48,49). In some instances this model substrate was not even from the same organism from which the RNase P was extracted (50,51). These types of experiments make the incorrect assumption that all tRNAs are the same. A majority of these earlier experiments were also done under single turnover conditions, whereas the RNase P holoenzyme functions as a multiple turnover enzyme in vivo. For this reason we strive to develop a series of experiments that will bring us closer to understanding the kinetics of the enzyme that are relevant to the biological context. The first step in developing a framework for understanding RNase P kinetics in vivo has been to compare in detail the kinetic schemes for canonical versus non-canonical pre-tRNAs, in order to provide a background for understanding how binding uniformity is achieved and to begin analysis of competitive multiple turnover kinetics.

The experiments to date provide a framework for testing the basic model of substrate processing uniformity by RNase P. Because substrate binding is essentially irreversible relative to cleavage, the model predicts that two substrates will compete at the level of the association step, as discussed in detail in Chapter 2. It likewise predicts that substrates in a population will compete with their relative rates determined by their concentrations and their relative V/K values. Once the relevant kinetic perspective for understanding the processing of substrates in vivo has been established, we will be in a position to more effectively test the uniformity hypothesis for pre-tRNA processing by RNase P.

It has been established that the bacterial RNase P protein, C5, interacts with the leader sequence of pre-tRNA. This interaction is thought to compensate for weakened binding from sequence and structural variation in pre-tRNAs. To further investigate and test this idea, we are currently utilizing, in collaboration with the laboratory of Dr. Eckhard Jankowsky, a novel high-throughput deep sequencing method (high-throughput sequencing kinetics, HTS-KIN) for determining the relative rate constants of all members of a large population of RNA sequence variants; this method is discussed in detail in Chapter 3. Importantly, the interpretation of HTS-KIN data takes advantage of the simple competitive substrate inhibition model developed and validated above. We are using this approach to identify the effect of pre-tRNA leader sequence variation on the interaction with C5 protein. We hope the results from these experiments will not only give insight into protein-RNA interactions that control catalytic efficiency, but also will optimize this method for many future experiments.
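The prediction that competing substrates are processed according to their concentrations and relative V/K values can be illustrated with a short sketch. The code below uses the standard relations for alternative substrate kinetics: the instantaneous share of the total rate carried by substrate i is proportional to (V/K)_i[S_i], and integrating the ratio of two rates gives ln(1 - f_1) = r_k ln(1 - f_2), where f is the fraction of each substrate reacted and r_k is the ratio of V/K values. These are textbook forms and are not necessarily the exact equations used in Chapter 2 or the Appendix; the function names and numbers are assumptions for illustration.

```python
import math


def flux_fractions(concs, v_over_k):
    """Fraction of the instantaneous total rate carried by each competing substrate."""
    weights = [c * k for c, k in zip(concs, v_over_k)]
    total = sum(weights)
    return [w / total for w in weights]


def relative_rate_constant(f_sub, f_ref):
    """r_k = (V/K)_sub / (V/K)_ref from the fractions reacted at one time point.

    Follows from ln(1 - f_sub) = r_k * ln(1 - f_ref); both fractions must lie
    strictly between 0 and 1.
    """
    return math.log(1.0 - f_sub) / math.log(1.0 - f_ref)


# Hypothetical example: equal concentrations, substrate 1 has twice the V/K of substrate 2.
print(flux_fractions([1.0, 1.0], [2.0, 1.0]))

# If the reference substrate is 30% reacted, a substrate with r_k = 2 is
# 1 - 0.7**2 = 51% reacted; the function recovers r_k from those two fractions.
f_ref = 0.30
f_sub = 1.0 - (1.0 - f_ref) ** 2
print(relative_rate_constant(f_sub, f_ref))
```

Because r_k depends only on the fractions of each substrate reacted in the same reaction, it does not require knowledge of the enzyme concentration, which is the practical appeal of internal competition for large substrate populations.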

Chapter 2: Molecular Recognition in tRNA Processing by the RNase P Ribonucleoprotein 1,2

Ribonuclease P (RNase P) is an essential ribonucleoprotein enzyme that is responsible for catalyzing the maturation of the 5' end of transfer RNAs (tRNAs) through site-specific hydrolysis of a phosphodiester bond in precursor tRNAs (pre-tRNAs) (52-54). The RNA subunit, termed P RNA, contains the active site (17,55), while the smaller protein subunit (C5 in E. coli) is required for optimal molecular recognition and catalysis in vitro and is essential in vivo (6,19,39,48,56,58,59). Although P RNA is a ribozyme, its mode of molecular recognition differs from other catalytic RNAs in two important ways. First, its biological role in pre-tRNA processing requires that it act in trans as a multiple turnover enzyme, whereas other ribozymes, with the exceptions of the ribosome and spliceosome, undergo single turnover self-splicing or self-cleavage reactions (46,60-62). Second, RNase P processes multiple RNA substrates, including all pre-tRNAs in the cell, whereas other ribozymes, again with the exceptions of the ribosome and spliceosome, have one specific substrate (63,64). These characteristics are essential to RNase P function, as they are to the ribosome and spliceosome, and are common to many enzymes involved in RNA metabolism (65-68). Therefore, understanding the multiple substrate recognition properties of RNase P can shed light on general principles of molecular recognition by other ribonucleoproteins and multi-substrate enzymes.

1 Yandek, L.E., Lin, H.C., and M.E. Harris (2013) Alternative substrate kinetics of Escherichia coli Ribonuclease P: Determination of relative rate constants by internal competition. J Biol Chem. 288(12):
2 Further explanation of the rationale behind the published equations is given in the Appendix.

The pre-tRNA nucleotides contacted by RNase P have been determined by chemical interference and structure-function studies (Fig. 2-1) (31,46,63,69). The recognition elements near the cleavage site include the 3' RCCA sequence, a G(+1)/C(+72) as the first base pair in the acceptor stem, and the 2' OH and nucleobase of a U(-1) residue 5' to the cleavage site. The substrate binding domain of P RNA also contacts 2'-OH groups in the T stem-loop (70,71). The spacing of these contacts in the T stem-loop in relation to the cleavage site results in an overall shape recognition of the substrate (72-75).

Figure 2-1: Secondary structure and sequence conservation of E. coli (K12) pre-tRNAs at regions contacted by RNase P. A) The conserved secondary structure of tRNA is shown. The RNase P ribonucleoprotein interacts with the acceptor stem and TψC stem and loop of tRNA (black circles). The enzyme also makes contacts in the 5' leader (gray circles). The sequences identified as forming RNA-RNA contacts with the RNase P enzyme are shown as letters and include a U at N(-1), a G(1)-C(73) base pair at the top of the acceptor stem, and the 3'-RCCA sequence as described in the Introduction. B) The linear sequences of tRNA involved in substrate binding are separated into three regions for presentation of sequence variation. Genomic tRNA sequences and alignments were obtained from the genomic tRNA database (64). Sequence logos for regions I, II, and III were created using WebLogo (76). Region I (N(-10) to N(-1)) includes the protein binding site in the 5' leader and the nucleotide at N(-1) that contacts P RNA. Region II (N(1) to N(7)) is the 5' side of the acceptor stem, including the RNase P cleavage site 5' to N(1). Region III (N(66) to N(76)) includes the 3' side of the acceptor stem and the conserved RCCA sequence that interacts with P RNA.


Comparative analysis of E. coli tRNA gene sequences shows significant variation among the nucleotides identified as contact points with the enzyme. As shown in Fig. 2-1, alignment of the 87 pre-tRNA genes of E. coli K12 (64) reveals that the leader sequences (Region I) and the acceptor stem (Regions II and III) show only minimal sequence conservation. An exception is the 3' CCA sequence that is recognized by the ribosome (77), aminoacyl-tRNA synthetases (78,79) and EF-Tu (80). Only two thirds of E. coli pre-tRNAs (66/87) contain a G(+1)/C(+72) and a similar fraction (63/87) have an optimal U at the N(-1) position (64). The population of pre-tRNAs that contain all of the recognition elements is significantly smaller (~50%; 42/87). These pre-tRNAs are termed canonical, whereas a non-canonical pre-tRNA is missing one or more of these recognition elements. The base pair adjacent to these recognition elements is often a G(+2)/C(+71); however, this position is not known to contact RNase P. The 5' leader sequence shows no conserved motif; nevertheless, both binding and cleavage of model substrates by E. coli RNase P are sensitive to changes in the sequence of the 5' leader (2,36). Indeed, recent studies identified a protein-RNA interaction between the leader sequence and the Bacillus subtilis RNase P (47). A structure from Mondragon and coworkers of the Thermotoga maritima RNase P bound to tRNA and leader products (75) is consistent with the experimentally defined interface between enzyme and substrate drawn from biochemical studies. Although specific leader contacts are not resolved in the crystal structure, it generally corresponds with the perspective from crosslinking and structure-function studies.

Central to achieving a complete understanding of multiple substrate recognition by RNase P is the observation that catalysis by P RNA alone is sensitive to natural structural

variation among pre-tRNAs that results in the loss of RNA-RNA contacts between P RNA and pre-tRNA (5,36,40,49). Catalysis by the ribonucleoprotein holoenzyme, which forms additional leader sequence interactions, is less sensitive to sequence and structure variation among endogenous pre-tRNAs (36). A conformational change during substrate recognition has been documented for B. subtilis RNase P, which the protein subunit facilitates via leader sequence contacts (31,39). This two-step mechanism for substrate binding may give rise to threshold effects resulting in similar rate constants for catalysis for substrates lacking optimal contacts with the enzyme (5,36,40,80). Thus, detailed in vitro structure-function studies measuring binding and catalysis for model substrates have revealed basic principles of molecular recognition by RNase P. Nonetheless, information on the competition between different alternative substrates is needed to understand RNase P function in vivo.

Here, we test a simple competitive model to describe the relative rates of pre-tRNA processing by RNase P, and apply this model to evaluate the effect of natural, genomic variation in pre-tRNA substrate sequence on relative processing rates. The results provide insight into the features of the kinetic mechanism of RNase P that may govern its function in vivo. These insights could potentially be relevant for other multiple substrate enzymes.

Results and Discussion

Application of competitive alternative substrate kinetics to pre-tRNA processing by RNase P- As illustrated in Scheme 2-1, a simple competitive multiple turnover mechanism allows the competition between different pre-tRNA substrates for processing by RNase P to be quantified (85-87). A single population of RNase P (E) combines with

multiple pre-tRNA substrates ($S_1, S_2, S_3 \ldots S_N$) to form individual RNase P-pre-tRNA complexes ($ES_1, ES_2, ES_3 \ldots ES_N$) that react with rate constants $V_1, V_2, V_3 \ldots V_N$ to form tRNA and leader products that together are represented by $P_1, P_2, P_3 \ldots P_N$. We apply the convention of using V and V/K as the fundamental multiple turnover kinetic parameters. The parameter V is the rate constant for reaction of ES to form products and regenerate free enzyme and is equivalent to $k_{cat}$. V/K is the second order rate constant at limiting substrate concentrations (i.e. $k_{cat}/K_m$) (88,89). Importantly, both $S_1$ and $S_2$ must compete with the remaining population of substrates, which act as competitive inhibitors (85-87,90). As a result, the expression for the ratio of the rates for conversion of $S_1$ and $S_2$ to products simplifies to

\[ \frac{v_{obs1}}{v_{obs2}} = \frac{(V_1/K_1)[S_1]}{(V_2/K_2)[S_2]} \qquad \text{(Equation 1)} \]

Thus, the ratio of the observed rates of product formation for the two substrates depends on the ratio of their V/K values and their concentrations. The designation $^{r}k$ is used below to refer to the ratio of the V/K values for an experimental or unknown substrate relative to a reference substrate ($^{r}k = (V/K)/(V/K)_{reference}$) (90). As indicated in Experimental Procedures, the pre-tRNA MET82 (+2) substrate is used as the primary reference in this study.

There are two key consequences of Scheme 2-1, and consequently Eq. 1, that are important in considering the in vivo function of RNase P (85-87,90). First, the relative V/K values, and consequently the relative observed rates, of any two substrates will be independent of the presence or concentration of alternative substrates. The reason for this is that the

additional substrates act essentially as competitive inhibitors, decreasing the concentration of free enzyme available for all substrates equally. Second, the relative processing rates will depend on the V/K values of the two substrates regardless of the enzyme concentration, or whether either substrate concentration is saturating. These considerations highlight the second order rate constant at limiting substrate as an essential parameter in understanding the biological function of RNase P, as it is with other enzymes.
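The two consequences described above can be checked numerically. The following Python sketch assumes the standard steady-state rate law for competing alternative substrates, v_i = E_total (V_i/K_i)[S_i] / (1 + sum_j [S_j]/K_j), which is not written out in this form in the text. The function name and all numerical values are hypothetical and serve only as an illustration.

```python
def competitive_rates(e_total, substrates):
    """Steady-state rates for substrates competing for one enzyme.

    substrates maps a name to (V, K, conc), with K and conc in the same
    concentration units. Assumes simple competitive alternative substrate
    kinetics (every substrate is a competitive inhibitor of the others).
    """
    # Shared denominator: free enzyme is depleted equally for all substrates.
    denom = 1.0 + sum(conc / K for (_, K, conc) in substrates.values())
    return {
        name: e_total * (V / K) * conc / denom
        for name, (V, K, conc) in substrates.items()
    }


# Hypothetical parameters (V in s^-1, K and concentrations in nM).
pair = {"S1": (0.2, 300.0, 100.0), "S2": (0.1, 300.0, 50.0)}
with_third = dict(pair, S3=(0.3, 300.0, 1000.0))

two = competitive_rates(2.0, pair)
three = competitive_rates(2.0, with_third)

ratio_two = two["S1"] / two["S2"]
ratio_three = three["S1"] / three["S2"]
print(ratio_two, ratio_three)   # ratio of rates with and without S3
print(two["S1"], three["S1"])   # individual rate of S1 with and without S3
```

In this sketch the third substrate lowers each individual rate, while the ratio of the two rates is set only by the V/K values and concentrations of the pair, as Eq. 1 states.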

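\[
\begin{aligned}
E + S_1 &\overset{K_1}{\rightleftharpoons} ES_1 \overset{V_1}{\longrightarrow} E + P_1 \\
+\; S_2 &\overset{K_2}{\rightleftharpoons} ES_2 \overset{V_2}{\longrightarrow} E + P_2 \\
+\; S_3 &\overset{K_3}{\rightleftharpoons} ES_3 \overset{V_3}{\longrightarrow} E + P_3 \\
+\; S_N &\overset{K_N}{\rightleftharpoons} ES_N \overset{V_N}{\longrightarrow} E + P_N
\end{aligned}
\]

Scheme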

Accordingly, we set out to test whether this simple competitive model describes the relative rates of pre-tRNA processing by RNase P, and to evaluate the effect of natural, genomic variation in pre-tRNA substrate structure on the kinetics of competition reactions containing multiple substrates. As described in the following sections, we first measured the V and V/K values for two well-characterized canonical and non-canonical pre-tRNAs using standard steady state reactions of uniform RNA populations. We used pre-steady state and single turnover kinetic analysis to determine the reaction steps that limit V and V/K. Reactions containing mixtures of both substrates were analyzed, and the simple competitive model described above was validated. Using an internal competition approach based on this model, we determined the relative rate constants for eight different pre-tRNAs representing the range of pre-tRNA structural variation at sites of RNase P contact occurring in the E. coli genome.

Comparison of the multiple turnover kinetics of pre-tRNA MET82 and pre-tRNA METf47 processing by E. coli RNase P- The substrates pre-tRNA MET82 and pre-tRNA METf47 (Fig. 2-2) were selected as representative examples of canonical and non-canonical pre-tRNAs, respectively. Both pre-tRNAs have similar sequence length and base composition; however, they differ significantly in the nucleotides contacted by the P RNA subunit of RNase P. The pre-tRNA METf47, an initiator tRNA, has an A instead of an optimal U at N(-1) and a C(+1)-A(+72) pair at the cleavage site that results in a >900-fold decrease in the rate of catalysis by the P RNA subunit alone (36). In contrast, the RNase P holoenzyme binds both pre-tRNA MET82 and pre-tRNA METf47 with equivalent equilibrium binding constants and processes them with similar single turnover rate

constants (36). The metal ion and pH dependence of the single turnover reactions of both substrates are also comparable (2). In order to isolate the effects of tRNA sequence and structure that contact RNase P from secondary effects due to flanking sequences that are idiosyncratic to individual pre-tRNAs, we use a standard substrate structure containing the tRNA and ten additional nucleotides to make up the leader sequences (Fig. 2-2).

Figure 2-2: Sequence and secondary structure of representative pre-tRNAs. The location of the RNase P cleavage site between nucleotides N(-1) and N(+1) is indicated by an arrow for each pre-tRNA. The N(+1)/N(+72) base pair is boxed, and the N(-1) position is indicated by a gray circle.

To evaluate the V/K for processing of pre-tRNA MET82 and pre-tRNA METf47 by RNase P, the observed initial rates for both substrates are plotted against their concentrations and fit to the Michaelis-Menten equation (Fig. 2-3),

\[ v_{obs} = \frac{V E_{total}}{1 + K/[S]} \qquad \text{(Equation 2)} \]

The steady-state kinetic parameters V and K for both substrates are highly similar ($V_{MET82}$ = s$^{-1}$; $V_{METf47}$ = s$^{-1}$ and $K_{MET82}$ = nM; $K_{METf47}$ = nM), resulting in an $^{r}k$ ratio near unity (ca. 0.9, where $^{r}k = (V_{METf47}/K_{METf47})/(V_{MET82}/K_{MET82})$) (Table 2-1). Fitting complete time courses of the multiple turnover reactions of pre-tRNA MET82 and pre-tRNA METf47 to the integrated Michaelis-Menten equation shows evidence of product inhibition (Fig. 2-4). An additional approach to measure V/K from multiple turnover reactions is to analyze progress curve data using the integrated Michaelis-Menten equation:

\[ t = \frac{K}{V}\ln\frac{S_0}{S_t} + \frac{1}{V}(S_0 - S_t) \qquad \text{(Equation 3)} \]

Although the multiple turnover time courses for RNase P cleavage of pre-tRNA MET82 and pre-tRNA METf47 fit well to the above equation, the values of V/K determined using initial rate data do not predict the observed time courses for either substrate (dotted lines in Fig. 2-4). The observed kinetics are significantly slower, and a much larger K value is obtained from fitting. These features are hallmarks of product inhibition, and thus we fit the progress curve data to the integrated equation including product inhibition:

\[ t = \frac{K}{V}\left(1 + \frac{S_0}{K_i}\right)\ln\frac{S_0}{S_t} + \frac{1}{V}\left(1 - \frac{K}{K_i}\right)(S_0 - S_t) \qquad \text{(Equation 4)} \]

Equilibrium binding studies as well as competitive single-turnover inhibition experiments indicate that the $K_d$ for tRNA MET82 is 150 nM and for tRNA METf47 is 100 nM (5,36). Using these values for $K_i$ in the above equation provides a much improved fit of the data (solid line in Fig. 2-4). The values of V and K for the two substrates obtained by this method are ca. 2-fold lower than those obtained from analysis of the initial rate data; however, the values of V/K are highly similar. The multiple turnover kinetic parameters, V and V/K, estimated by both approaches are highly similar for both substrates despite their significant difference in structure (Table 2-1). Next, we asked whether the similar V and V/K values for the two substrates reflect the same or different rate limiting steps.
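The effect of product inhibition on a progress curve can be sketched in Python. The example below assumes the integrated rate law with competitive product inhibition, t = (K/Vmax)(1 + S0/Ki) ln(S0/St) + (1/Vmax)(1 - K/Ki)(S0 - St), where Vmax = V x E_total, which reduces to the integrated Michaelis-Menten equation as Ki becomes very large. The function name is hypothetical. S0 and E_total follow the conditions given for Fig. 2-4 and Ki is the reported K_d for tRNA MET82, but the V and K values are placeholders and not measured results.

```python
import math


def time_to_reach(s_t, s0, vmax, K, Ki=math.inf):
    """Time for substrate to fall from s0 to s_t (integrated rate law).

    vmax is the maximal velocity (V * E_total) in concentration per time.
    With Ki = infinity this is the integrated Michaelis-Menten equation;
    a finite Ki adds competitive inhibition by accumulating product.
    """
    log_term = (K / vmax) * (1.0 + s0 / Ki) * math.log(s0 / s_t)
    linear_term = (1.0 / vmax) * (1.0 - K / Ki) * (s0 - s_t)
    return log_term + linear_term


# S0, E_total and Ki are taken from the text; V and K are hypothetical.
s0, e_total = 400.0, 2.0        # nM
V, K, Ki = 0.1, 300.0, 150.0    # s^-1, nM, nM
vmax = V * e_total              # nM s^-1

t_plain = time_to_reach(200.0, s0, vmax, K)          # no product inhibition
t_inhibited = time_to_reach(200.0, s0, vmax, K, Ki)  # with product inhibition
print(t_plain, t_inhibited)
```

With these placeholder values the time to reach half reaction is longer when product inhibition is included, which is the qualitative behavior described for the dotted versus solid lines in Fig. 2-4.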

Table 2-1. Rate constants for processing of pre-tRNA MET82 and pre-tRNA METf47 by RNase P
substrate    V (s$^{-1}$)    K (nM)    V/K (M$^{-1}$ s$^{-1}$ x $10^6$)
pre-tRNA MET82    MM    PC
pre-tRNA METf47    MM    PC
MM, Michaelis-Menten; PC, progress curve.

Table 2-2. Rate constants for processing of pre-tRNA MET82 and pre-tRNA METf47 by RNase P
$k_1$ (M$^{-1}$ s$^{-1}$ x $10^6$)    $k_{-1}$ (s$^{-1}$ x $10^4$)    $K_{m,calc}$ (nM)    $K_{d,calc}$ (nM)    $K_{d,obs}$ (Ca$^{2+}$) 1 (nM)
pre-tRNA MET
pre-tRNA METf
1 From Sun et al., 2006 (5)

Figure 2-3: Multiple turnover and pre-steady state kinetics of pre-tRNA MET82 and pre-tRNA METf47 processing by RNase P. A) The observed initial rates for pre-tRNA METf47 (open symbols) and pre-tRNA MET82 (filled symbols) processing by E. coli RNase P were determined and normalized to the total enzyme concentration ($v/[E]_{total}$) as described under Experimental Procedures. These data are plotted as a function of the initial substrate concentration and fit to the Michaelis-Menten equation as described below. Note that the substrate concentrations are shown on a log scale to better display the range of concentrations tested. Error bars indicate S.D. B) Pre-steady state kinetics of pre-tRNA METf47 (open symbols) and pre-tRNA MET82 (filled symbols) at 5 and 10 nM RNase P concentration. The maximal predicted burst amplitudes for these two reaction conditions are indicated on the y axis.


Figure 2-4: Progress curve analysis of pre-tRNA MET82 and pre-tRNA METf47 multiple turnover kinetics. The kinetics of substrate depletion from reactions containing 400 nM pre-tRNA MET82 (A) or pre-tRNA METf47 (B) substrate and 2 nM RNase P were analyzed by fitting to the integrated Michaelis-Menten equation as described below. Simulations using the V and V/K values determined from analysis of initial rate data are shown as dotted lines. The solid lines show fits of the data to a model assuming product inhibition as described below.

Pre-steady state kinetic analyses to evaluate the reaction step that limits V- The kinetics of pre-tRNA cleavage at increasing concentrations of RNase P were examined to determine the predominant form of the enzyme that is populated at steady state (ES or EP, in Scheme 2-1). As shown in Fig. 2-3B, for reactions in which RNase P (5 nM and 10 nM) and the pre-tRNA substrate (500 nM) are both present at concentrations in excess of $K_d$ (> 1 nM), there is a linear increase in product concentration that extrapolates back to the origin. Reactions with either 5 or 10 nM RNase P result in product formation that increases linearly with no evidence for a pre-steady state burst. A simple interpretation of this result is that the net rate constant for dissociation of products and regeneration of free enzyme ($k_3$ in Scheme 2-2) is faster than substrate cleavage ($k_2$). Therefore, ES is the predominant form of the enzyme that accumulates at steady state. This result contrasts with the kinetic mechanism of B. subtilis RNase P, which is limited by product release for a canonical pre-tRNA ASP substrate (7).

Single turnover kinetics to evaluate the reaction step that is rate limiting for V/K- An important observation relevant to the reaction mechanism of E. coli RNase P is that the observed K (where $K = (k_{off} + V)/k_{on}$ in Scheme 2-1; i.e. $K_m$) (53) from multiple turnover kinetic analyses is greater than the independently measured equilibrium dissociation constant, $K_d$ (310 nM versus 0.5 nM for pre-tRNA MET82 and 280 nM versus 0.3 nM for pre-tRNA METf47) (Tables 2-1 and 2-2). This result implies that the net rate constant for cleavage to regenerate free enzyme ($V = k_2 k_3/(k_2 + k_3)$ in Scheme 2-2) is fast relative to substrate dissociation ($k_{-1}$) (91-93). It follows that at limiting substrate concentration the rate of multiple turnover could therefore be limited by substrate association (89).
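The relationship between K and $K_d$ can be illustrated with a short Python sketch. Scheme 2-2 is not reproduced in this section, so the code assumes the minimal three-step mechanism E + S <-> ES -> EP -> E + P with rate constants k1, k-1, k2 and k3, for which the steady-state expressions are V = k2k3/(k2 + k3), K = [(k-1 + k2)/k1][k3/(k2 + k3)] and V/K = k1k2/(k-1 + k2). In the limit k3 >> k2 this K coincides with (k_off + V)/k_on as used in the text. The function name and all rate constants are hypothetical values chosen only to mimic the regime described (k3 > k2 >> k-1).

```python
def steady_state_parameters(k1, k_minus1, k2, k3):
    """V, K, V/K and K_d for E + S <-> ES -> EP -> E + P (assumed scheme)."""
    V = k2 * k3 / (k2 + k3)                      # net rate constant for turnover
    K = (k_minus1 + k2) / k1 * k3 / (k2 + k3)    # Michaelis constant
    Kd = k_minus1 / k1                           # equilibrium dissociation constant
    return V, K, V / K, Kd


# Hypothetical rate constants: k1 in M^-1 s^-1, the others in s^-1.
k1, k_minus1, k2, k3 = 5.0e5, 2.5e-4, 0.15, 15.0
V, K, v_over_k, Kd = steady_state_parameters(k1, k_minus1, k2, k3)

print(V)                   # close to k2, because k3 >> k2
print(K * 1e9, Kd * 1e9)   # K and K_d in nM
print(v_over_k / k1)       # close to 1, because k2 >> k_minus1
```

Under these assumptions K exceeds $K_d$ by more than two orders of magnitude and V/K approaches the association rate constant, which is the pattern the measurements above point to.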

To test these predictions we determined the relative magnitudes of the rate constant for catalysis, $k_2$, and the rate constant for substrate dissociation, $k_{-1}$, using a sequential mixing or isotope trapping experiment (84). The RNase P-pre-tRNA complex was formed by mixing limiting substrate (1-2 nM) with a saturating concentration of enzyme (100 nM). At an intermediate time an excess of non-radiolabeled substrate was added. If $k_2$ is fast relative to $k_{-1}$, then there will be little dissociation of the remaining RNase P-pre-tRNA complexes over the remaining time course of the reaction, and correspondingly no effect on the accumulation of product. Alternatively, if substrate dissociation is fast relative to catalysis ($k_{-1} \gg k_2$), then the addition of non-radiolabeled substrate will quench the formation of radiolabeled product.

The dependence of the observed pseudo first order rate constant on enzyme concentration showed saturable behavior as predicted based on Scheme 2-1 (data not shown). These data allowed reaction conditions to be determined under which all of the radiolabeled pre-tRNA is present in the ES complex. As shown in Fig. 2-5A & B,

addition of a cold substrate chase after formation of ES did not result in quenching or a change in reaction kinetics. In contrast, addition of the excess non-radiolabeled substrate at the start of the reaction resulted in the expected slow, multiple turnover kinetics. Therefore, we concluded that substrate dissociation is negligible over the remaining time course of the reaction ($k_2 \gg k_{-1}$).

An important implication of this result is that the substrate association rate constant, $k_1$, can be measured from the concentration dependence of the single turnover reaction ($k_{obs}$ versus [E]) (54). Fitting the dependence of $k_{obs}$ on [E] at concentrations below $K_{1/2}$ permits $k_1$ and $k_{-1}$ to be estimated as the slope and intercept (Fig. 2-5C). The kinetic parameters for both substrates are similar ( x $10^6$ M$^{-1}$ s$^{-1}$ and x $10^6$ for pre-tRNA MET82 and pre-tRNA METf47, respectively) (Table 2-2). The estimates for $k_{-1}$ from second order analyses are less than the V for both substrates. In this case the observed K for the multiple turnover reaction will be approximated by $V/k_1$ (91). The experimentally measured values of these kinetic parameters result in calculated K values of nM for pre-tRNA MET82 and nM for pre-tRNA METf47. These calculated values are within 2-fold of the experimentally observed K determined from analysis of initial rate data (Tables 2-1 and 2-2). It is possible that differences in the reaction pH or errors in the determination of concentrations of substrate and enzyme account for this difference.

From the definitions for V and K above, it follows that $V/K = k_2 k_1/(k_{-1} + k_2)$ (89). Thus, when $k_2 \gg k_{-1}$, $V/K \approx k_1$. Therefore, the simplest interpretation of the pre-steady state and single turnover results is that the cleavage step ($k_2$) is rate limiting for V (i.e. at saturating substrate concentrations) and that V/K reflects the association step ($k_1$) for both pre-tRNAs at limiting substrate concentrations.
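The second order analysis can be sketched in Python as an ordinary least-squares fit of k_obs = k1[E] + k-1, followed by the approximation K ~ V/k1. The helper name `fit_line`, the enzyme concentrations and every rate constant below are hypothetical. The data are synthetic and noise-free, so the sketch shows only the arithmetic and not the measured values.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical, noise-free data: [E] in M, k_obs in s^-1.
true_k1, true_k_minus1 = 5.0e5, 2.5e-4
enzyme = [5e-9, 10e-9, 20e-9, 40e-9]
k_obs = [true_k1 * e + true_k_minus1 for e in enzyme]

# Slope estimates k1 (M^-1 s^-1); intercept estimates k_-1 (s^-1).
k1, k_minus1 = fit_line(enzyme, k_obs)

V = 0.15               # hypothetical V (s^-1), chosen so that V >> k_-1
K_calc = V / k1        # approximation K ~ V/k1, in M
print(k1, k_minus1, K_calc * 1e9)   # K_calc printed in nM
```

A real data set would carry experimental error in both slope and intercept, and the intercept in particular is poorly determined when k-1 is small relative to k1[E].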

Figure 2-5: Single turnover kinetics of pre-tRNA MET82 and pre-tRNA METf47 processing by RNase P. A) Single turnover sequential mixing experiment with initial concentrations of 1 nM pre-tRNA MET82 and 100 nM RNase P. At the time indicated by the vertical dotted line, the reaction was divided, and one fraction was combined with a high concentration (5 μM) of non-radiolabeled pre-tRNA (circles). Time points were continuously collected from the remaining fraction (squares). As a control, an identical reaction was combined with non-radiolabeled substrate before the addition of enzyme (triangles). B) Single turnover sequential mixing experiment using pre-tRNA METf47 performed as described in panel A. C) Second order analysis of binding of pre-tRNA MET82 and pre-tRNA METf47 at increasing concentrations of RNase P. The pseudo-first order rate constants ($k_{obs}$) determined for a single turnover reaction containing nM RNase P concentrations are plotted versus [E]. These data are fit to a linear function, $k_{obs} = k_1[E] + k_{-1}$, to determine the rate constants reported in Table 2-2. Error bars indicate S.D.


Competitive alternative substrate kinetics of pre-tRNA MET82 and pre-tRNA METf47 processing by RNase P- As introduced above, in competitive multiple turnover reactions the relative rates for two competing pre-tRNAs are expected to be determined by their relative V/K values ($^{r}k = (V/K)/(V/K)_{reference}$) and their concentrations (46-48,51). It also follows that the presence of additional substrates will decrease the observed rates for all substrates in the reaction due to competition for free enzyme, but should not affect the $^{r}k$ value for any two substrates (85,94,95). We tested the competitive alternative substrate model for RNase P by analyzing the competitive kinetics of reactions containing both pre-tRNA MET82 and pre-tRNA METf47.

To simultaneously measure the reaction kinetics of two pre-tRNAs in the same reaction we used a reference substrate in which two additional G residues are added to the 5' end of the leader sequence. This modification allows the products of the pre-tRNAs to be distinguished by their mobility on denaturing PAGE and quantified individually. An example of the primary data from this approach for pre-tRNA MET82 (+2) and pre-tRNA METf47 is shown in Fig. 2-6A. The precursor band contains both substrates, as these species are not resolved under these gel conditions. However, the products from the two substrates are readily distinguished and quantified, allowing relative rates of product formation to be measured. To address the effect of the additional nucleotides on pre-tRNA processing, the relative rate constants for comparison of pre-tRNA MET82 to pre-tRNA MET82 (+2) and comparison of pre-tRNA METf47 to pre-tRNA METf47 (+2) were also measured (data not shown) and were observed to be Thus, the presence of the additional nucleotides required to distinguish the products from two substrates has essentially no effect on the rate of RNase P processing.

Figure 2-6: Competitive multiple turnover reactions containing both pre-tRNA METf47 and pre-tRNA MET82 (+2). A) PAGE analysis of the products of a reaction containing 5' 32P end-labeled pre-tRNA METf47 and pre-tRNA MET82 (+2). The two precursors run as a single band indicated by a bracket denoting the presence of both substrates. The two leader sequence products are indicated by lines with or without the additional guanosines that identify the product from pre-tRNA MET82 (+2). B) Plot of the observed multiple turnover rate constants ($v_{obs}$) for pre-tRNA METf47 (open symbols) and pre-tRNA MET82 (filled symbols) as a function of the relative concentrations of the two substrates. The data are fit to the log form of Equation 1 (Equation 5). C) Plot of the observed multiple turnover rate constants ($v_{obs}$) for pre-tRNA METf47 (open symbols) and pre-tRNA MET82 (filled symbols) as a function of the concentration of the third substrate pre-tRNA LEU76. The data are fit to a mechanism in which pre-tRNA LEU76 acts as a competitive inhibitor (Equation 6). The inset shows the individual $^{r}k$ values determined by dividing the observed rate for pre-tRNA METf47 by the observed rate for pre-tRNA MET82 at each of the different pre-tRNA LEU76 concentrations. The solid line and dashed lines represent the average and standard deviation, respectively, calculated from this data set.


To compare the competitive kinetics of pre-tRNA METf47 and pre-tRNA MET82, the observed rates of product formation were determined for reactions containing substrate concentration ratios (in nM) of 10:100, 100:10, 10:10 and 100:100 (pre-tRNA METf47:pre-tRNA MET82). The product ratios from at least three time points taken under steady state conditions were averaged and then corrected for the relative substrate concentrations. Additionally, the collection of data for the observed rates as a function of the relative concentrations of the two substrates was fit to the logarithmic form of Eq. 1,

\[ \log(v_2/v_1) = \log {}^{r}k + \log(S_2/S_1) \qquad \text{(Equation 5)} \]

where $v_2/v_1$ is the ratio of the observed initial rates for pre-tRNA METf47 relative to pre-tRNA MET82 (+2). Analysis of the data in this manner allows determination of $^{r}k$ from the combined data set. As shown in Fig. 2-6B, the data for both the pre-tRNA METf47/pre-tRNA MET82 (+2) and the pre-tRNA METf47 (+2)/pre-tRNA MET82 combinations of substrates fit this relationship as predicted. Fitting to Eq. 5 yields an $^{r}k$ value of 0.5 ($(V/K)_{METf47}/(V/K)_{MET82}$) for the reaction in which the pre-tRNA MET82 was modified to contain the additional two leader nucleotides. As a control, the $^{r}k$ was measured in competitive reactions in which pre-tRNA METf47 instead of pre-tRNA MET82 was lengthened in order to distinguish the products from the two substrates. A similar value of 0.6 was observed, consistent with the value measured when the pre-tRNA MET82 (+2) was used as the reference substrate.

An additional prediction of the internal competition model is that the addition of a third substrate will not affect the $^{r}k$ value for these two substrates. Accordingly, we

tested the effect of increasing concentrations of a third substrate on the observed rates of pre-tRNA MET82 (+2) and pre-tRNA METf47 product formation. In Scheme 2-1 additional substrates act as competitive inhibitors that decrease the observed rate of processing of both substrates by competing for free enzyme. In Fig. 2-6C, pre-tRNA LEU76 is added as a competitive alternative third substrate in reactions containing pre-tRNA METf47 and pre-tRNA MET82 (+2) as the reference substrate. Increasing concentrations of non-radiolabeled pre-tRNA LEU76, which binds to RNase P with similar affinity as the other two pre-tRNAs in the reaction, decrease the observed rates of pre-tRNA METf47 and pre-tRNA MET82 (+2) processing as expected. The data fit to a simple competitive inhibition model derived from Scheme 2-1,

\[ v_{obs} = \frac{V_1 E_{total}}{1 + (K_1/S_1)(1 + S_2/K_2 + S_3/K_3)} \qquad \text{(Equation 6)} \]

where $S_1$ is the concentration of the labeled substrate, and $S_2$ and $S_3$ are the concentrations of the competitive alternative, non-labeled substrates in Fig. 2-6C. Analysis of the observed rate data for pre-tRNA MET82 and pre-tRNA METf47 in the presence of 10 nM to 3000 nM pre-tRNA LEU76 allows the K value for pre-tRNA LEU76 to be estimated. A value of ca. 300 nM is obtained, which is similar to the values measured by analysis of initial rate data for reactions containing pre-tRNA MET82 or pre-tRNA METf47 alone (Table 2-1). Nonetheless, as demonstrated in the inset in Fig. 2-6C, the ratio of the observed rates, the $^{r}k$ for pre-tRNA METf47 referenced to pre-tRNA MET82 (+2), is independent of the presence and concentration of a competing substrate. Since the $^{r}k$ value for two competing substrates is insensitive to a third competing substrate, the internal competition approach could be

used to determine the $^{r}k$ values for substrates in reactions containing more complex populations.

Determination of relative rate constants for pre-tRNAs in complex substrate populations by internal competition- It follows from Scheme 2-1 and the observations documented above that the presence of additional substrates, regardless of their number or concentration, should also have no effect on the relative rates of processing of any two substrates in the population. To test this concept, we generated five pre-tRNA substrates in addition to pre-tRNA MET82, pre-tRNA METf47 and pre-tRNA LEU76. Substrates were selected to span the range of pre-tRNA structure variation encountered by E. coli RNase P in vivo, and their secondary structures are shown in Fig. Among these, similar pre-tRNA HIS and pre-tRNA SER substrates have served as substrates for analyzing the determinants of specificity adjacent to the site of 5' processing (96).

We used the same approach described in the preceding section, distinguishing between the products of two substrates, to analyze the relative rate constants of pre-tRNA MET82 (+2) and pre-tRNA METf47. Since the two substrates of interest are the only species that are radiolabeled, their products alone are detected. As shown in Fig. 2-9, the $^{r}k$ determined by this method (0.3) is within error of the value of 0.5 determined by analysis of the two substrates alone. Thus, the presence of additional competing substrates in the reaction does not have an appreciable effect on the magnitude of the relative V/K for pre-tRNA METf47 and pre-tRNA MET82 (+2).

Next, we determined the $^{r}k$ values for the remaining seven substrates using the pre-tRNA MET82 (+2) as the reference substrate. As shown in Fig. 2-7, the $^{r}k$ value for the pre-tRNA LEU76 substrate is readily determined by this approach. This substrate has an $^{r}k$ of

3.5, indicating faster processing of pre-tRNA LEU76 over the reference pre-tRNA MET82 when they compete for RNase P processing. For this particular substrate the pre-tRNA can be resolved from the unreacted reference pre-tRNA MET82 (+2). This allows the change in the relative concentrations of the residual substrates to be quantified as well. As shown in Fig. 2-7B, we took advantage of internal competition analyses typically used to measure the relative rate constants for isotope effect measurements (90). The slower reacting pre-tRNA will become progressively enriched in the residual substrate population, and the relative rate constant can be determined by analyzing the change in substrate ratio as a function of the fraction of reaction. Using the ratio of residual precursor concentrations derived from the ratio of radiolabeled precursor bands, the $^{r}k$ for pre-tRNA LEU76 was determined by fitting to

\[ \ln(R_s) = ({}^{r}k - 1)\ln(1 - f) + \ln(R_0) \qquad \text{(Equation 7)} \]

where $R_0$ is the ratio at the start of the reaction and $R_s$ is the ratio at fraction of reaction (f) of the reference substrate (Fig. 2-7C) (90,97). The fraction of reaction for pre-tRNA MET82 (+2) is determined from the intensity of its precursor and product bands. As expected, the faster rate constant for the pre-tRNA LEU76 substrate results in faster depletion of this substrate from the residual precursor population relative to the slower reacting pre-tRNA MET82 (+2). As a result, the pre-tRNA LEU76/pre-tRNA MET82 (+2) ratio becomes progressively smaller as the reaction progresses. An essentially identical $^{r}k$ value of 3.4 is obtained from the fitting of the data shown in Fig. 2-7B.
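The precursor-ratio analysis can be sketched in Python. The example assumes the relation ln(R_s) = (rk - 1) ln(1 - f) + ln(R_0), which follows from d[A]/d[B] = rk [A]/[B] for two substrates competing for the same enzyme, and estimates rk from the slope of ln(R_s) against ln(1 - f). The function name is hypothetical and the data are synthetic, generated with rk = 3.5 purely for illustration; they are not the measurements shown in Fig. 2-7.

```python
import math


def relative_rate_constant(fractions, ratios):
    """Estimate r_k from residual substrate ratios.

    fractions: fraction of reaction f of the reference substrate.
    ratios: experimental/reference precursor ratio R_s at each f.
    Fits ln(R_s) = (r_k - 1) ln(1 - f) + ln(R_0) by least squares
    and returns (r_k, R_0).
    """
    x = [math.log(1.0 - f) for f in fractions]
    y = [math.log(r) for r in ratios]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope + 1.0, math.exp(my - slope * mx)


# Synthetic, noise-free data generated with r_k = 3.5 and R_0 = 1.
f_values = [0.05, 0.1, 0.2, 0.3, 0.4]
r_values = [1.0 * (1.0 - f) ** (3.5 - 1.0) for f in f_values]

r_k, r_0 = relative_rate_constant(f_values, r_values)
print(round(r_k, 3), round(r_0, 3))
```

Because the faster substrate is depleted first, the synthetic ratios fall as f increases, matching the trend described for pre-tRNA LEU76 relative to the reference.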

Figure 2-7: Analysis of the relative rate constant for processing of pre-tRNA LEU76 by internal competition. A) PAGE analysis. The observed rates of processing were determined by quantification of both precursor and product bands in a background population of 100 nM each of the eight pre-tRNAs shown in Fig. 2-2. Note that in this case, the larger tRNA of pre-tRNA LEU76 results in sufficient separation of the two substrates such that the precursor bands can be distinguished. B) Determination of the $^{r}k$ value for pre-tRNA LEU76 by internal competition kinetic analysis of the depletion of the faster reacting substrate in the residual precursor population. The graph shows the natural log of the ratio of the two substrates plotted as a function of the total fraction of reaction. These data are fit to an integrated form of the relative rate constant equation (90,97).


Interestingly, in the course of experiments to determine the $^{r}k$ for pre-tRNA SER80 we detected two cleavage products in addition to correct RNase P cleavage at the mature tRNA 5′ end. As shown in Fig. 2-8A, the reaction of pre-tRNA MET82 (+2) yields a single product as expected, while the pre-tRNA SER80 substrate gives rise to three products (labeled P1, P2 and P3 in Fig. 2-8). The P1 product maps to the expected site for RNase P processing between N(-1) and N(+1). The P2 product is derived from miscleavage one nucleotide 5′ to the authentic site, yielding a product one nucleotide smaller. Cleavage to give the P3 product occurs five nucleotides upstream of the correct site. RNase P cleavage in the leader sequence is not expected, although several studies have demonstrated the ability of the RNase P holoenzyme to cleave unstructured RNA, but with sequence or structure specificity that is not yet well defined (73,98). Alternatively, cleavage may result from alternative RNA folding (99,100). Miscleavage at P2 occurs at essentially the same rate as cleavage at P1 (both have an $^{r}k$ of ca. 0.6). Surprisingly, miscleavage of the pre-tRNA SER80 substrate at P3 occurs with an $^{r}k$ that is significantly higher (2.2). Although the precursors of both substrates can be resolved, the fact that pre-tRNA SER80 reacts to form multiple products precludes determination of its relative rate constant by analysis of precursor ratios by Eq. 5. Nonetheless, as demonstrated in Fig. 2-8B, the relative rates of accumulation of the three products of pre-tRNA SER80 are readily distinguished. For the pre-tRNA GLY62, pre-tRNA ILE1, pre-tRNA HIS37, and pre-tRNA GLN85 substrates the $^{r}k$ values were determined relative to pre-tRNA MET82 (+2) from analyzing the initial rates of formation of the products. The $^{r}k$ values for all eight substrates, shown as the natural log to provide a

linear scale, are compared in Fig. 2-9 together with the values for the alternative products for pre-tRNA SER80. In this study we have taken a new approach to analyzing substrate recognition by RNase P by applying the perspective of alternative substrate kinetics. In addition to providing insight into the enzymatic behavior that underlies its biological function, the framework described here is useful for extracting relative rates by internal competition. With the simple competitive substrate kinetics of RNase P established, broader application of competitive kinetics may be considered. In principle, internal competition methods are applicable to very large populations of substrates so long as reaction progress and the ratios of their precursors or products can be quantified, as will be seen in the following chapter.
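For substrates analyzed through product formation rather than precursor ratios, the relative rate constant can be estimated from initial rates. The following is a minimal sketch, assuming that each product is expressed as a fraction of the initial concentration of its own substrate and that only early, linear time points are used, so that the ratio of initial slopes approximates the ratio of V/K values. The function names and all numbers are hypothetical and are not data from this work.

```python
def initial_slope(times, fraction_product):
    """Slope of a line through the origin fitted to early time points."""
    numerator = sum(t * p for t, p in zip(times, fraction_product))
    denominator = sum(t * t for t in times)
    return numerator / denominator


def relative_rate_constant(times, frac_competitor, frac_reference):
    """Ratio of initial fractional rates, competitor over reference."""
    return initial_slope(times, frac_competitor) / initial_slope(times, frac_reference)


# Hypothetical early time course (arbitrary time units, fraction converted).
times = [0.5, 1.0, 1.5, 2.0]
reference = [0.010, 0.020, 0.030, 0.040]
site_a = [0.006, 0.012, 0.018, 0.024]   # slower than the reference
site_b = [0.022, 0.044, 0.066, 0.088]   # faster than the reference

rk_a = relative_rate_constant(times, site_a, reference)
rk_b = relative_rate_constant(times, site_b, reference)
print(f"rk (site a) = {rk_a:.2f}, rk (site b) = {rk_b:.2f}")
```

The same calculation applies to each cleavage site of a substrate that yields several products, since every product is compared with the reference product formed in the same reaction.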

Figure 2-8: Determination of relative rate constants for pre-tRNA SER80 cleavage and miscleavage by internal competition. A) PAGE analysis of the precursor and products of the competitive cleavage reaction containing 5′ end-labeled pre-tRNA SER80 and pre-tRNA MET82 (+2) in a background of the eight different E. coli pre-tRNAs shown in Fig. 2. The large variable arm of pre-tRNA SER80 results in a substrate that is 15 nucleotides longer than pre-tRNA MET82 (+2), which can be resolved under these gel conditions (15% PAGE). The pre-tRNA SER80 substrate is cleaved by RNase P to give three products: the correct cleavage product resulting from cleavage 5′ to N(+1) (SER P1), miscleavage one nucleotide 5′ into the leader sequence (SER P2), and miscleavage four nucleotides 5′ to the correct cleavage site (SER P3). All of these products are resolved from the single cleavage product resulting from processing at the authentic 5′ end of pre-tRNA MET82 (MET82). B) Plot of product accumulation versus time for the products indicated in panel A showing the initial rates of product formation for the Ser P1 (open circles), Ser P2 (filled circles), and Ser P3 (open triangles) products relative to the accumulation of the product from pre-tRNA MET82 processing (squares). C) Secondary structure of pre-tRNA SER80 with arrows indicating the location of the three cleavage sites in pre-tRNA SER80 by RNase P.


Figure 2-9: Histogram of $^{r}k$ values for different pre-tRNA substrates. The individual $^{r}k$ values are presented as their natural log so that the length of the bar is linearly proportional to the difference from the reference substrate for substrates that are faster and slower than the reference pre-tRNA MET82 (+2). For the pre-tRNA METf47 substrate, the bar indicating the $^{r}k$ determined by calculation from the individually measured V/K values is indicated by an asterisk. The $^{r}k$ values for the three cleavage products of pre-tRNA SER80 are indicated by P1 to P3.

Chapter 3 - Simultaneous Determination of Processing Rate Constants for All Individual RNA Species Processed by RNase P

Introduction

Typically, RNA binding proteins bind their specific RNAs based on distinct structures, sequences or both (101,102). Despite this general property, some proteins bind RNAs in a non-specific manner, in which distinct sequence and structure motifs are not apparent from analysis of known physiological substrates. To date it is unclear how substrate affinities for non-specific RNA binding proteins are impacted by sequence or structure variation at cognate sites, aside from effects of RNA structure ( ). There are currently many detailed models that link affinities to RNA sequence or structure variation for specific RNA binding proteins. These models show the highest affinities for distinct physiological binding sites ( ). Based on this idea, it is assumed, but not proven, that an absence of specific sequence or structure for non-specific RNA binding proteins reflects an inability to discriminate between different RNA sequences. This difference between specific binding sites and non-specific binding sites for RNA binding proteins has recently become a topic of great interest due to the emerging importance of RNA binding proteins with broad specificity. Many studies have been performed on specific RNA binding proteins, but the binding modes and determinants of affinity for non-specific RNA binding proteins are not as clear. Therefore, the C5 subunit of E. coli RNase P, which functions essentially as a non-specific RNA binding protein, will be used to systematically probe substrate discrimination for this class of macromolecular associations (31).

The C5 subunit of E. coli RNase P is a useful model system for addressing mechanisms of multiple substrate recognition by RNA binding proteins. E. coli RNase P is responsible for processing the 87 genomically encoded pre-tRNA leader sequences. It has been established that the bacterial RNase P protein, C5, interacts with the leader sequence of pre-tRNA (Fig. 3-1). Previous studies have shown conserved residues in the leader sequence (47), but where the C5 protein binds there does not appear to be any specificity among genomically encoded pre-tRNAs. The interaction between C5 and the 5′ leader sequence is thought to compensate for weakened binding from sequence and structural variation in pre-tRNA sequence. To further investigate this idea, we are currently utilizing a novel high-throughput deep sequencing method (high throughput sequencing kinetics, HTS-KIN), in collaboration with the laboratory of Dr. Eckhard Jankowsky, for determining the relative rate constants of all members of a large population of RNA sequence variants. Importantly, the interpretation of HTS-KIN data takes advantage of the simple competitive substrate inhibition model developed and validated in the previous chapter. We are using this approach to identify pre-tRNA leader sequences that interact with the C5 protein. We anticipate that the results from these experiments will not only give insight into protein-RNA interactions that control catalytic efficiency, but also will optimize this method for many future experiments directed at understanding RNA binding specificity. HTS-KIN was developed based on a publication by Ferré-D'Amaré (109) in which the change in distribution of sequences of a ribozyme in a large population over the course of several rounds of in vitro selection was followed. We adapted this approach to permit the measurement of relative rate constants of different sequences in a

population simultaneously. In a mixed population of RNA with different sequences, the faster reacting (or binding) sequences will become progressively depleted in the unreacted substrate population, and the slower reacting sequences will become progressively enriched. Using this principle, the basic strategy of HTS-KIN involves:

1. Synthesis of a population of RNA (pre-tRNA leader sequence variants in this case) in which specific nucleotides comprising a potential recognition site are randomized;
2. Reaction or binding of the randomized population followed by purifying the residual, unreacted substrate from various intervals of time or concentration;
3. Determination of the distribution of each sequence in the residual pre-tRNA population by Illumina sequencing; and,
4. Quantitative analysis of the concentration or time dependence of the distribution of each sequence in the residual, unreacted population to calculate the relative rate or binding constants for all members of the population simultaneously.

My contribution has been in the validation of initial HTS-KIN experiments by analysis of individual sequence variants and continued application of HTS-KIN to further analyze the interaction between pre-tRNA sequence and processing rate.
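The quantitative analysis in step 4 can be sketched under the simple competition model. The sketch assumes that the fraction of sequence i remaining is (1 - f) X_i / X_i0, where f is the total fraction of the population that has reacted and X_i0 and X_i are the mole fractions of that sequence before reaction and in the residual pool, and that the logarithm of this quantity divided by the same quantity for the reference gives the relative rate constant. This relationship follows from the competition model; it is not presented here as the exact procedure used in the HTS-KIN analysis. The function name, sequence labels and numbers are hypothetical.

```python
import math


def relative_rate_constants(x0, xt, f_total, reference):
    """Estimate rk for every sequence from mole fractions in the substrate pool.

    x0, xt:   dicts mapping sequence -> mole fraction before reaction and in
              the residual (unreacted) pool at the sampled time point
    f_total:  fraction of the whole population that has reacted
    """
    def log_remaining(seq):
        # Fraction of this sequence still unreacted: (1 - f) * X_i / X_i0
        return math.log((1.0 - f_total) * xt[seq] / x0[seq])

    ref = log_remaining(reference)
    return {seq: log_remaining(seq) / ref for seq in x0}


# Synthetic pool generated from the competition model (not sequencing data).
true_rk = {"AAAAAG": 1.0, "variant_fast": 3.0, "variant_slow": 0.3}
x0 = {seq: 1.0 / len(true_rk) for seq in true_rk}
ref_remaining = 0.8  # 20% of the reference sequence has reacted
residual = {seq: x0[seq] * ref_remaining ** rk for seq, rk in true_rk.items()}
total = sum(residual.values())
xt = {seq: amount / total for seq, amount in residual.items()}

estimates = relative_rate_constants(x0, xt, 1.0 - total, "AAAAAG")
for seq, value in estimates.items():
    print(f"{seq}: rk = {value:.2f}")
```

In practice the mole fractions would come from normalized read counts, and the estimate becomes unreliable when the reference has barely reacted, because the denominator then approaches zero.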

Figure 3-1: The pre-tRNA 5′ leader (purple, with purple and orange spheres for the phosphorus atoms and non-bridging oxygens, respectively) was modeled as a polyphosphate chain with five phosphates (P1 to P5). The leader follows a highly conserved patch in the protein extending from the 5′ end of the mature tRNA (red) and away from the P RNA. The addition of a 5′ leader with metal (Sm3+) reveals a second metal ion (M2) (75). Figure permission License Number:

We applied the general HTS-KIN approach to understand the molecular recognition properties of the C5 protein and its contribution to RNase P processing of pre-tRNA in the following way. It has previously been determined that the leader sequence nucleotides between positions N(-3) and N(-7) are in contact with the C5 protein (6,47,58) (Fig. 3-2). Investigation of the pre-tRNA genomic sequences did not reveal any significant conservation between sequences that would be responsible for tight or weak binding. Further information obtained by determining specific leader sequences responsible for uniformity in pre-tRNA processing therefore has the potential to advance our understanding of RNase P biology in two ways. First, it will provide a more comprehensive model for substrate association, and second, it will provide a better understanding of the fundamental principles that govern enzyme specificity in general. My contribution to this collaborative project has been to further characterize the kinetics of five chosen hexamers, known henceforth as L1-L5. These results were then compared with the results obtained through the HTS-KIN analysis as a verification of the method. In addition to the initial HTS-KIN experiments, I have continued this project in order to both improve the methodology and advance the scope of our research. In continuing to characterize the five chosen hexamers past the initial experiments, we performed multiple turnover kinetics on the individual hexamers as single-species populations. Interestingly, as single uniform populations, the relative rates of the variants do not replicate the results of HTS-KIN, while in reactions where they are in competition, the results conform precisely to the expectations from the relative rate constants measured using the new method. The simplest explanation consistent with the alternative substrate kinetic model used to describe the reaction is that HTS-KIN measures effects on

V/K since substrates are in competition, whereas for reactions performed at saturating, uniform substrate concentrations the observed rate is determined by V. As described in Chapter 2, the kinetic parameters V and V/K represent different rate limiting steps in the E. coli RNase P kinetic scheme. These discoveries, and the results explained in more detail below, led to our interest in repeating the HTS-KIN experiment using single turnover kinetics in which the effects of substrate variation on V and V/K can be distinguished. A complicating factor in the initial application of this method is that the pre-tRNAs have additional nucleotides on their 5′ end that are used in subsequent RT-PCR and Illumina sequencing steps. Because of the potential for these sequences to influence the outcome of the experiment by forming alternative, inhibitory RNA structures, it is of utmost importance to further investigate the effects of substrate structure and sequence on an individual substrate basis in the course of future HTS-KIN experiments. Several factors argue that the extended 5′ leader sequences are unlikely to interfere with RNase P processing. First, pre-tRNAs can be encoded in polycistronic genes along with rRNA and mRNA, as well as in single tRNA genes. These pre-tRNAs are separated from transcripts containing mRNAs and other tRNAs primarily by RNase E (110,111). Additionally, analysis of the bacterial genome sequences and the structures of pre-tRNA precursors in vivo demonstrates that the leader sequence length can vary among precursor tRNAs. Nonetheless, the potential for formation of alternative structures is fundamental to the molecular biology of RNA and thus this issue needs to be properly evaluated. Accordingly, we analyzed the multiple turnover kinetics of the L1-L5 leader sequence variants without the extended sequences and compared their processing rates.
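The distinction drawn above between V/K in competition and V at saturation can be illustrated with the standard steady-state rate expression for substrates competing for a single enzyme, v_i = V_i ([S_i]/K_i) / (1 + sum_j [S_j]/K_j). The following is a minimal sketch; the substrate names and parameter values are hypothetical and were chosen only so that the two substrates share the same V but differ fourfold in V/K.

```python
def competitive_rates(conc, V, K):
    """Steady-state rates for substrates competing for one enzyme.

    v_i = V_i * ([S_i] / K_i) / (1 + sum_j [S_j] / K_j)
    conc, V, K: dicts keyed by substrate name (arbitrary consistent units).
    """
    denominator = 1.0 + sum(conc[s] / K[s] for s in conc)
    return {s: V[s] * (conc[s] / K[s]) / denominator for s in conc}


# Hypothetical parameters: equal V, fourfold difference in K (and so in V/K).
V = {"reference": 1.0, "variant": 1.0}
K = {"reference": 1.0, "variant": 0.25}

# Each substrate alone at a saturating concentration: rates approach V.
alone_ref = competitive_rates({"reference": 100.0}, V, K)["reference"]
alone_var = competitive_rates({"variant": 100.0}, V, K)["variant"]

# Both substrates together at equal concentration: rates scale with V/K.
mixed = competitive_rates({"reference": 100.0, "variant": 100.0}, V, K)

print(f"alone: variant/reference = {alone_var / alone_ref:.2f}")
print(f"mixed: variant/reference = {mixed['variant'] / mixed['reference']:.2f}")
```

With these assumed values the two substrates react at nearly the same rate when assayed separately at saturation, yet differ fourfold when they compete, which is the pattern expected if single-species reactions report on V and competition reactions report on V/K.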

Figure 3-2: Hydroxyl radical protection analysis of pre-tRNA binding to E. coli P RNA and RNase P holoenzyme. Compiled protection data for nucleotides -10 to +35 shown on the secondary structure for pre-tRNA. Each nucleotide is displayed as a circle. The data for protection by P RNA are indicated by blue circles, and the data for RNase P holoenzyme are indicated by red circles. Adapted from Sun et al (36).

Figure 3-3. The randomized portion of the pre-tRNA MET82 leader sequence, N(-3) to N(-8). Image courtesy of Drs. Michael Harris and Ulf-Peter Guenther.

Experimental Design

As introduced above, my contribution has been the analysis of individual substrate 5′ leader sequence variants; however, it is necessary to review how they were identified using HTS-KIN in collaboration with the laboratory of Dr. Eckhard Jankowsky. Briefly, to examine how the C5 protein from E. coli RNase P discriminates between all possible sequence variants of its binding site, a population of pre-tRNA MET82 was synthesized in which the nucleotides that interact with the C5 protein were randomized (Fig. 3-3). An initial test of the randomized pool consisted of simple multiple turnover kinetics in which the processing of the random RNA was compared with the kinetics of the reference pre-tRNA, pre-tRNA MET82, alone (Fig. 3-4). This experiment showed a difference in processing rates between the randomized pool and a uniform population of pre-tRNA MET82. The conclusion from this result is that there must be sequences that react faster or slower than pre-tRNA MET82, and thus the HTS-KIN method is expected to reveal the identity of sequences with reaction kinetics distinct from the genomically encoded sequence. The processing rates of this population of over 4,000 sequences were then measured under multiple turnover reaction conditions. Because the sequence variants under these conditions are in competition and therefore are limited by each other's V/K values, we were able to directly observe the discrimination by C5 for certain sequence variants in the ongoing reaction as further described below. Multiple turnover kinetics with the randomized population of pre-tRNAs were performed and relative rate constants determined by quantifying the change in the number of specific sequence variants over time by analysis of the distribution of variants using high-throughput sequencing (112,113). Results are presented as $^{r}k$, defined above as the ratio of the V/K values. In this case the $^{r}k$ is computed for each sequence variant

relative to the encoded leader sequence for pre-tRNA MET82 (AAAAAG). Thus, an $^{r}k$ of 1 means that a particular sequence has the same V/K as the reference sequence, variants with an $^{r}k$ > 1 have a larger V/K, and variants with an $^{r}k$ < 1 have a lower V/K. Preliminary results (Fig. 3-5) were displayed in a histogram relative to the native sequence, pre-tRNA MET82. It is apparent from these results that a significant number of sequence variants (~1/3) reacted faster than the pre-tRNA MET82 leader sequence ($^{r}k$ > 1). This indicates that the physiological, genomically encoded leader sequence is not an optimal sequence for the C5 protein subunit. In the C5 protein distribution there is a clear sequence logo from the fastest sequences and a less clear sequence logo from the slowest sequences (Fig. 3-6). What is interesting from the standpoint of in vivo RNase P function is that none of the genomic sequences for pre-tRNA fall into the fast sequences; instead they all fall near the median of $^{r}k$ values. These observations are similar to those seen in specific RNA binding proteins, showing similarities between classically defined non-specific RNA binding proteins and specific RNA binding proteins, as pointed out by our collaborators in the Jankowsky laboratory. The clear difference seems to be that the high affinity sequences for the specific RNA binding proteins are physiological whereas in the RNase P non-specific example, they are not (114,115). From these data, five sequences were selected for further study as shown in Table 3-1 and Fig. 3-5. My contribution has been to analyze these individual sequence variants with different $^{r}k$ values and examine their processing kinetics directly in order to validate the quantitative HTS-KIN analysis.
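The size of the randomized population and the interpretation of relative rate constants above can be made concrete with a short sketch. Randomizing six positions with four nucleotides gives 4^6 = 4096 variants, consistent with the population of over 4,000 sequences. The helper function and the example values are hypothetical and serve only to show how values are classified against the reference; they are not measured data.

```python
from itertools import product
import math

REFERENCE = "AAAAAG"  # encoded pre-tRNA MET82 leader hexamer (from the text)

# All variants of six randomized positions: 4**6 = 4096 sequences.
variants = ["".join(p) for p in product("ACGT", repeat=6)]


def summarize(rk_by_sequence):
    """Return ln(rk) per sequence and the fraction of variants with rk > 1."""
    ln_rk = {seq: math.log(rk) for seq, rk in rk_by_sequence.items()}
    faster = sum(1 for rk in rk_by_sequence.values() if rk > 1.0)
    return ln_rk, faster / len(rk_by_sequence)


# Hypothetical rk values for illustration only (not measured data).
example = {REFERENCE: 1.0, "CCCCCC": 2.0, "GGGGGG": 0.5, "ACGTAC": 0.8}
ln_rk, fraction_faster = summarize(example)

print(len(variants), REFERENCE in variants)
print(f"fraction with rk > 1: {fraction_faster:.2f}")
```

Taking the natural log places the reference at zero, so faster and slower variants appear on opposite sides of the origin in a histogram.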

Figure 3-4. Difference between random pool and single substrate multiple turnover kinetics. Figure adapted from Guenther, Yandek, Niland, Campbell, Anderson, Anderson, Harris and Jankowsky. Hidden rules govern discrimination by a non-specific RNA binding protein.

Table 3-1. Observed $^{r}k$ values for hexamers selected for further kinetic study. *Reference sequence is AAAAAG.
Columns: Hexamer (L); Timepoint 1 (Exp. 1, Exp. 2); Timepoint 2 (Exp. 1, Exp. 2)
Hexamers: TTATAT (L1), TCAGAC (L2), ATTCAA (L3), CGTCAG (L4), CTCCTG (L5)

Figure 3-5. Histogram of distribution of random population in comparison to pre-tRNA MET82. The boxed sequences are the five hexamers selected for further evaluation. Adapted figure courtesy of Drs. Ulf-Peter Guenther and Eckhard Jankowsky.

Figure 3-6: The sequence logo of the fastest 1% of all sequences analyzed through the HTS-KIN method. Adapted figure courtesy of Drs. Ulf-Peter Guenther and Eckhard Jankowsky.