Supplementary Information for

Conformational landscapes of DNA polymerase I and mutator derivatives establish fidelity checkpoints for nucleotide insertion

Hohlbein et al.

Supplementary Figure S1. Wt KF forms an intermediate-FRET species at 1 mM dTTP. (a) E* histograms for the binary complex (row 1) and the ternary complex (1 mM dTTP, A-dTTP), fitted with either a fixed (E* = 0.45, row 2) or a floating (row 3) peak position of the low-FRET Gaussian (data from Figure 2A). In row 2, the residuals show systematic deviations along the E* axis, diagnostic of a poor fit for the fixed Gaussian. Specifically, the fit overestimates the number of molecules in the open conformation. Fitting the low-FRET Gaussian of the ternary complex without fixing the peak position resulted in E*₁ = 0.51 (row 3), improving the fit and removing the systematic deviations in the residuals. (b) PDA for the ternary complex of panel a, row 2. Assuming a dynamic equilibrium between two states results in a poor fit (red line) with χ² = 4.6 for a fixed peak position of the low-FRET Gaussian, E*₁ = 0.45 (E*₂ = 0.66, σ₁ = 0.21, σ₂ = 0.22, k₁ = 180 s⁻¹, and k₋₁ = 26 s⁻¹). (c) PDA for the ternary complex of panel a, row 3. PDA using a fixed peak position of E*₁ = 0.51 results in an improved fit (red line) compared to panel b (χ² = 1.5 for E*₂ = 0.66, σ₁ = 0.22, σ₂ = 0.22, k₁ = 150 s⁻¹, and k₋₁ = 25 s⁻¹).

Supplementary Figure S2. Examples of nucleotide titrations. As described in Figure 2, all FRET histograms were fitted to a double-Gaussian function (black lines, sum of Gaussians; grey lines, individual Gaussians). The DNA concentration was 100 nM for all experiments with binary and ternary complexes. (a) Titration of the Pol-DNA binary complex (DNA1, templating base A, Fig. 1d) with the complementary nucleotide, dTTP. As the concentration of the correct nucleotide is increased, there is a gradual increase in occupancy of the high-FRET species and a concomitant decrease in the occupancy of the low-FRET species. The two dotted vertical lines mark the mean E* values of the FRET species in the binary complex (open and closed). The titration was completed in a single day. (b) Titration of the Pol-DNA binary complex (DNA1, templating base A, Figure 1D) with the mismatched nucleotide, dGTP. As the concentration of the incorrect nucleotide is increased, there is a peak shift of the low-FRET species. Because the titrations in panels a and b were performed on the same day, rows 1 and 2 are identical in (a) and (b). The dashed line marks the mean E* value of the intermediate-FRET species as populated in the ternary complexes.

(c) Titration of the Pol-DNA binary complex (DNA2, templating base G, Fig. 1d) with the complementary nucleotide, dCTP. We plotted the fraction of the molecules in the high-FRET species (black circles) and the peak shift of the mean E* of the low-FRET species (blue triangles) as a function of nucleotide concentration. The peak shift was normalized relative to the E* difference between the means of the open and closed conformations. The data were globally fitted with the four-state model as described (Fig. 3c), which allows for simultaneous fitting of the two FRET observables. (d) Titration of the Pol-DNA binary complex (DNA1, templating base A, Fig. 1d) with the mispaired ribonucleotide, rGTP. (e) Titration of the unliganded polymerase with dGTP and rGTP. For fitting the data, we used a four-state model as described (Fig. 3c), but without the presence of DNA.

Supplementary Figure S3. Nucleotide titrations for E710Q and the Y766 derivatives. Ternary complexes were formed after addition of various nucleotides to E710Q (a), Y766A (b), and Y766F (c) in the presence of DNA1. Titrations and style as in Figures 3a, 3b and Supplementary Figures S2c, S2d, S2e, but with KF mutants instead of wt KF. Data points with error bars are represented as mean ± s.e.m., derived from three independent measurements.

Supplementary Figure S4. PDA analysis for the binary and ternary complex of wt KF. (a) E* histogram and PDA-generated fit for the binary complex of wt KF and DNA1. PDA analysis and figure style as in Supplementary Figures S1b and S1c. The best fit (χ² < 2) is obtained for rates of k₁ = 15 s⁻¹ (transition from the open to the closed conformation) and k₋₁ = 85 s⁻¹ (transition from the closed to the open conformation). Further model parameters: E*₁ = 0.45, E*₂ = 0.68, σ₁ = 0.22, σ₂ = [value missing]. (b) E* histogram and PDA-generated fit for the mismatched ternary complex of wt KF, DNA1, and dGTP. For the fit, we combined the experimental data for 1 mM and 2 mM A-dGTP. The best fit is obtained for k₁ = 3 s⁻¹ and k₋₁ = 45 s⁻¹. Further parameters used: E*₁ = 0.48, E*₂ = 0.66, σ₁ = 0.22, σ₂ = [value missing].
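As a worked illustration of what the quoted PDA rates imply, the sketch below converts a pair of interconversion rates into equilibrium populations and a relaxation rate. It assumes a simple reversible two-state scheme (open ⇌ closed); the function name `two_state_populations` is hypothetical and is not taken from the authors' analysis software.

```python
def two_state_populations(k_open_to_closed, k_closed_to_open):
    """Equilibrium populations for a reversible two-state scheme O <-> C.

    Rates are in s^-1. Returns (fraction_open, fraction_closed, relaxation_rate).
    Assumption: simple two-state exchange, as in the dynamic PDA model quoted above.
    """
    total = k_open_to_closed + k_closed_to_open
    fraction_closed = k_open_to_closed / total
    fraction_open = k_closed_to_open / total
    # For a two-state system the relaxation rate is the sum of both rates.
    return fraction_open, fraction_closed, total


# Binary complex rates from panel (a): k1 = 15 s^-1 (O -> C), k-1 = 85 s^-1 (C -> O)
f_open, f_closed, k_relax = two_state_populations(15.0, 85.0)
print(f"open: {f_open:.2f}, closed: {f_closed:.2f}, relaxation: {k_relax:.0f} s^-1")
# prints "open: 0.85, closed: 0.15, relaxation: 100 s^-1"
```

With the panel (b) rates (3 s⁻¹ and 45 s⁻¹) the same function gives a closed fraction of about 0.06, in line with the mismatched complex rarely reaching the closed conformation.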

Supplementary Figure S5. E* width analysis of the low-FRET distribution. (a) At nucleotide concentrations much lower or much higher than K_d1 for the interaction of an incorrect nucleotide with the binary complex, the width of the low-FRET Gaussian resembles that of either the open conformation (upper panel) or the partially-closed conformation (lower panel). At nucleotide concentrations equal to K_d1, however, both O and IN states (see Fig. 3c) are equally populated and the FRET distribution for the low-FRET species is a convolution of both states (grey distribution in the middle panel). (b) Even though we cannot resolve the individual FRET states at K_d1, we expect a broadening of the FRET species depending on the timescale of dynamics between the two states: for interconversions slower than the corresponding diffusion time through the confocal focus (~3 ms), we expect an increase in the FRET width with a maximum around K_d1, at which both states (O and IN) are equally populated. (c) The FRET width of the low-FRET species at different nucleotide concentrations is normalized versus the width of the corresponding binary complex. Each data point is calculated from three independent measurements; error bars represent the s.e.m. for the fits of three independent measurements. The calculated K_d1 values and their s.e.m. (Table 1), shown as shaded areas, indicate the concentration range where any maximum in the FRET width for the convolved low-FRET species would be expected. The results show a broadening of the low-FRET species around the K_d1 for most, if not all, nucleotides. (d) FRET width analysis of simulated data as a function of interconversion rates. The simulations represent the four-state model, using four fixed rate constants (k_O→C, k_C→O, k_IN→CN, and k_CN→IN) and varying k_O→IN and k_IN→O. The FRET width for each simulation is normalized versus the width of the corresponding binary complex (simulated using [N] k_O→IN << k_IN→O). To account for FRET broadening beyond the shot-noise limit, we added noise to each burst, a technique similar to the dithering technique described previously (ref. 19); this additional broadening does not change the position of the individual species. The width of the low-FRET species at a nucleotide concentration equal to K_d1 remains constant for any k_IN→O ≥ 300 s⁻¹, independent of k_O→IN. For k_IN→O < 300 s⁻¹, there is a significant increase of the width around K_d1 (e.g., see the maximum at an O-to-IN rate of 10 s⁻¹ for the top panel, for which k_IN→O was set to 10 s⁻¹) compared to the width of the binary complex or the partially-closed ternary complex (simulated using [N] k_O→IN >> k_IN→O). The maximum FRET broadening seen in the simulation is ~10%, similar to what is observed in the analysis of the experimental data in panel c.

Supplementary Figure S6. Error rates for mispair formation by wt KF and derivatives. The error rate for each mispair corresponds to the number of errors per detectable nucleotide incorporation. The data for wt KF, E710A, and E710Q were taken from ref. 13, and the data for Y766A from ref. 11.

Supplementary Table S1. Fluorophore labelling of proteins used in this study.

Columns: Pol I(KF) (a); Protein (µM); Cy3B (µM); ATTO647N (µM); 550R-744G (%) (b)
Rows: WT; E710A; E710Q; Y766A; Y766F (numerical entries missing)

(a) All proteins have the genotype N-His₆, D424A, K550C, L744C, C907S in addition to the listed mutation.
(b) 550R-744G as a percentage of the molecules carrying one donor and one acceptor dye label.

Supplementary Table S2. Single-turnover kinetic data for complementary dNTP incorporation (a).

Pol I(KF) (b)   Reaction (c)   K_d(dNTP) (µM)   k_pol (s⁻¹)
WT              A-dTTP         17 ± 1           40 ± 2
E710A           A-dTTP         110 ± …          … ± 0.01
E710Q           A-dTTP         320 ± …          … ± 0.01
Y766A           A-dTTP         150 ± …          … ± 0.6
Y766F           A-dTTP         41 ± 4           50 ± 3
WT              G-dCTP         …                …
Y766A           G-dCTP         …                …

(a) Data reported as mean ± s.e.m. are average values from at least 2 experiments; the others are single measurements. The data for wild-type Pol I(KF) are in good agreement with previous measurements (ref. 23).
(b) All the proteins had the genotype N-His₆, D424A, L744C, C907S in addition to the listed mutations.
(c) A-dTTP incorporation was measured using a ³²P-labeled DNA substrate. G-dCTP incorporation was measured using a 5′-Cy5-labeled substrate, which gives ~2-fold lower k_pol compared with the corresponding ³²P-labeled DNA.
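To show how the two tabulated parameters are typically used together, the sketch below evaluates the observed single-turnover rate at a given dNTP concentration. It assumes the standard hyperbolic dependence k_obs = k_pol [dNTP] / (K_d + [dNTP]); the table itself reports only K_d and k_pol, and the function name `observed_rate` is hypothetical.

```python
def observed_rate(dntp_uM, kd_uM, kpol_per_s):
    """Observed single-turnover rate (s^-1) at a given dNTP concentration (uM).

    Assumption: hyperbolic dependence k_obs = k_pol * [dNTP] / (K_d + [dNTP]).
    """
    return kpol_per_s * dntp_uM / (kd_uM + dntp_uM)


# Wild-type A-dTTP values from the table: K_d = 17 uM, k_pol = 40 s^-1
WT_KD_UM = 17.0
WT_KPOL = 40.0

for conc in (1.0, 17.0, 100.0, 1000.0):
    rate = observed_rate(conc, WT_KD_UM, WT_KPOL)
    print(f"[dTTP] = {conc:7.1f} uM -> k_obs = {rate:5.2f} s^-1")
```

At [dTTP] = K_d the observed rate is half of k_pol (20 s⁻¹ for wild-type), and at 1 mM dTTP it is within about 2% of k_pol.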

Supplementary Methods.

Five-state model for data analysis. The five-state model, which includes the presence of the partially-closed state in the binary complex, is defined by reaction scheme [S1], in which the five states are represented as C_B (closed binary), PC_B (partially-closed binary), O_B (open binary), PC_T (partially-closed ternary) and C_T (closed ternary), with N representing dNTP. The equations describing the two FRET observables are then given by:

Normalized peak shift of the low-FRET Gaussian:

$$\frac{E^*_{\mathrm{obs}} - E^*_{\mathrm{initial}}}{E^*_{\mathrm{closed}} - E^*_{\mathrm{initial}}} = \frac{y_{\max}\,[N]}{(1 + K_0')\,K_{d1} + [N]} \tag{S2}$$

Fraction of high-FRET species:

$$\frac{[C] + [CN]}{[\mathrm{Total}]} = \frac{K_0' K_3 + K_1 K_2 [N]}{1 + K_0' + K_0' K_3 + K_1 [N] + K_1 K_2 [N]} \tag{S3}$$

where K₀′ represents the equilibrium constant between the open binary and partially-closed binary states, and K₃ represents the equilibrium constant between the partially-closed binary and closed binary states.

Four-state, off-pathway model for data analysis. The data analysis using the four-state model assumes that the intermediate-FRET species, IN, is on-pathway between the open and closed conformations (Fig. 3c), as seems most reasonable. The alternative off-pathway model leads to a set of equations identical to the above, except that [CN] = K₂[O][N] and, as a consequence, K_d2 = 1/K₂.
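The two observables of the five-state model (equations S2 and S3 above) can be evaluated numerically as in the following minimal sketch. The function names and all parameter values are hypothetical placeholders chosen for illustration, not fitted values from the study; [N] and K_d1 are assumed to share the same concentration unit, with K₁ in the reciprocal unit.

```python
def normalized_peak_shift(n, k0_prime, kd1, y_max=1.0):
    """Equation S2: normalized peak shift of the low-FRET Gaussian."""
    return y_max * n / ((1.0 + k0_prime) * kd1 + n)


def high_fret_fraction(n, k0_prime, k1, k2, k3):
    """Equation S3: fraction of molecules in the high-FRET species."""
    numerator = k0_prime * k3 + k1 * k2 * n
    denominator = 1.0 + k0_prime + k0_prime * k3 + k1 * n + k1 * k2 * n
    return numerator / denominator


# Hypothetical parameters for illustration only (not from the paper).
K0_PRIME = 0.1   # open binary <-> partially-closed binary
K3 = 1.5         # partially-closed binary <-> closed binary
K1 = 0.02        # nucleotide binding, in uM^-1
K2 = 5.0         # partially-closed ternary <-> closed ternary
KD1 = 50.0       # uM

for conc in (0.0, 10.0, 100.0, 1000.0):
    shift = normalized_peak_shift(conc, K0_PRIME, KD1)
    frac = high_fret_fraction(conc, K0_PRIME, K1, K2, K3)
    print(f"[N] = {conc:6.1f} uM  peak shift = {shift:.3f}  high-FRET = {frac:.3f}")
```

Two limits are a useful sanity check on any implementation: the peak shift reaches y_max/2 at [N] = (1 + K₀′)K_d1, and the high-FRET fraction tends to K₂/(1 + K₂) at saturating nucleotide.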

DNA preparation. DNA oligonucleotides (hairpin DNAs, shown in Fig. 1d) were synthesized by the Keck Biotechnology Resource Laboratory at Yale Medical School and purified using denaturing polyacrylamide gel electrophoresis as described (ref. 15). The 3′ terminus was a dideoxynucleotide, allowing formation of ternary complexes with incoming nucleotides but preventing phosphoryl transfer, thereby restricting our observations to pre-chemistry species. DNA molecules of this type bind to wt KF with a K_d < 1 nM (ref. 33).

Protein expression and purification. KF derivatives were prepared using either our previously described expression plasmid, with transcriptional and translational signals from bacteriophage λ (refs 23, 34), or a pET-derived construct in which the protein is expressed from a bacteriophage T7 promoter (ref. 35). Wild-type and mutant constructs were based on a KF genotype of N-His₆, D424A, K550C, L744C, C907S to provide for double-labelling with fluorophores, as described below. The listed changes had a negligible effect on polymerase activity (ref. 7). For simplicity, the N-His₆, D424A, K550C, L744C, C907S protein is referred to as wild-type (wt KF). The expressed proteins were purified by affinity chromatography on Ni-NTA agarose (Qiagen) (ref. 23).

Site-specific labelling of KF. Double-labelling of wild-type and mutant KF proteins using maleimides of Cy3B (GE Healthcare) and ATTO647N (ATTO-TEC) was modified from our previous labelling procedure (ref. 7): by including DNA and dNTP substrates during labelling, we improved the labelling bias to ~99% 550-ATTO647N,744-Cy3B for all the proteins in this study (Supplementary Table S1). Before labelling, the double-Cys KF derivative was reduced in 5 mM DTT and dialyzed into the non-sulfhydryl reducing agent TCEP (Invitrogen). An equimolar amount of a duplex DNA primer-template (having C as the next templating base) and the complementary dGTP (1 mM final concentration) were then added to the dialyzed protein in 50 mM Tris-HCl, pH 7.5, 120 µM TCEP.
To promote substrate binding but prevent nucleotide addition, the final mix also contained 1 mM EDTA and 5 mM CaCl₂. The protein was labelled by sequential addition of the two maleimides: first, ATTO647N maleimide was added at 1.2-fold molar excess relative to the protein and allowed to react for 1 h at 22 °C; then, Cy3B maleimide was added at 3.4-fold molar excess, and the mixture was incubated for a further 16 h at 4 °C. The reaction was stopped by addition of 1 mM DTT, and the labelled protein was purified from excess dyes, DNA and dGTP using chromatography on heparin-agarose (Sigma-Aldrich). The reaction mix was loaded onto the column, which was washed extensively with 20 mM Tris-HCl, pH 7.5, 1 mM EDTA, 2% (vol/vol) glycerol, 1 mM 2-mercaptoethanol, followed by the same buffer containing 50 mM NaCl. The bound protein was then eluted with the same buffer containing 0.4 M NaCl. Labelled proteins were stored at −20 °C in 50 mM Tris-HCl, pH 7.5, 1 mM DTT, 40% (vol/vol) glycerol.

The molarities of protein and dye labels were calculated from absorbance spectra. The relative amounts of Cy3B and ATTO647N at positions 550 and 744 were measured using partial digestion with chymotrypsin, as described (ref. 7). After fractionation by SDS-PAGE, the Cy3B and ATTO647N fluorescence in appropriate peptides was quantitated and used to estimate the fraction of the donor-acceptor molecules that had the 550R,744G labelling pattern (Supplementary Table S1).

Two assays were carried out on the labelled proteins to assess the effectiveness of removal of DNA and dGTP by the heparin column. Contamination by DNA was measured using T4 polynucleotide kinase and [γ-³²P]ATP. Addition of a primer-template with C as the templating base was used to detect residual dGTP by extension of the ³²P-labeled primer. Quantitation of the labelled products in these assays indicated that DNA contamination of the labelled proteins was 5% and dGTP contamination 1% on a molar basis.
The enzymatic activity of KF derivatives used in this study (Supplementary Table S2) was assessed by measuring the single-turnover rate of nucleotide addition to a DNA primer terminus by chemical quench methods (refs 23, 36); stopped-flow fluorescence studies of these proteins will be published elsewhere (O.B., N.D.F.G., C.M.J., unpublished).

Probability Distribution Analysis (PDA) fitting parameters. PDA fitting parameters for Figure 5:

wt KF, static model fit: E*₁ = 0.49, E*₂ = 0.68, σ₁ = 0.27, σ₂ = 0.27, N₁ = 0.58, N₂ = 0.42; dynamic model fit: E*₁ = 0.46, E*₂ = 0.72, σ₁ = 0.17, σ₂ = 0.16, k₁ = 150 s⁻¹, and k₋₁ = 320 s⁻¹.

Y766A, static model fit: E*₁ = 0.51, E*₂ = 0.68, σ₁ = 0.23, σ₂ = 0.24, N₁ = 0.84, N₂ = 0.16; dynamic model fit: E*₁ = 0.49, E*₂ = 0.71, σ₁ = 0.19, σ₂ = 0.20, k₁ = 9 s⁻¹, and k₋₁ = 128 s⁻¹.

Y766F, static model fit: E*₁ = 0.50, E*₂ = 0.69, σ₁ = 0.28, σ₂ = 0.27, N₁ = 0.25, N₂ = 0.75; dynamic model fit: E*₁ = 0.49, E*₂ = 0.71, σ₁ = 0.23, σ₂ = 0.23, k₁ = 435 s⁻¹, and k₋₁ = 108 s⁻¹.

Supplementary references

33. Turner, R. M., Grindley, N. D. F. & Joyce, C. M. Interaction of DNA polymerase I (Klenow fragment) with the single-stranded template beyond the site of synthesis. Biochemistry 42, (2003).
34. Joyce, C. M. & Derbyshire, V. [1] Purification of Escherichia coli DNA polymerase I and Klenow fragment. Methods Enzymol. 262, 3–13 (1995).
35. Studier, F. W., Rosenberg, A. H., Dunn, J. J. & Dubendorff, J. W. Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185, (1990).
36. Johnson, K. A. Rapid quench kinetic-analysis of polymerases, adenosine-triphosphatases, and enzyme intermediates. Methods Enzymol. 249, (1995).