Supporting Information. Multi-strand Structure Prediction of Nucleic Acid Assemblies and Design of RNA Switches

Eckart Bindewald 1#, Kirill A. Afonin 2,3#, Mathias Viard 1, Paul Zakrevsky 2, Taejin Kim 2, Bruce A. Shapiro 2*
1 Basic Science Program, Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, USA
2 Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, USA
3 Department of Chemistry, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, North Carolina 28223, USA

Figure S1. Flow chart of the algorithm for computing an RNA silencing switch. Inputs to the algorithm are the siRNA sense and antisense strand sequences as well as the trigger strand sequence. Adjacent to each text box, a visual depiction of the two-stranded RNA switch is shown; the regions that are modified at each step are highlighted with green ellipsoids and arrows.

Figure S2. Experimental results showing the relative amount of reassociated RNA/DNA hybrids for DNA toeholds of various lengths (0, 2, 4, 6, 8 and 10 nucleotides) and nanomolar strand concentrations (depicted as different colors). (a) Toeholds with ~60% G+C content. (b) Toeholds with ~25% G+C content.

Figure S3. Predicted structure of the CTGF-eGFP switch in (a) absence and (c) presence of the trigger strand. Grey: anti-target strand; blue: anti-trigger strand; orange: trigger strand (fragment of eGFP mRNA). (a) Predicted structure in absence of the trigger strand. (b) Arc representation of the structure shown in (a). (c) The predicted structure in presence of the trigger strand exhibits an active shRNA-like conformation.

Figure S4. GFP (121) as a trigger, Polo-Like Kinase 1 (PLK1) as a target. Predicted structure of the GFP-PLK1 two-stranded switch in presence and absence of a trigger strand. Grey: anti-target strand; blue: anti-trigger strand; orange: trigger strand (fragment). (a) Predicted structure in absence of the trigger strand. (b) The predicted structure in presence of the trigger strand exhibits the anti-target strand in an inactive conformation (grey, left) as well as an active shRNA-like conformation with higher free energy (grey, right).

Figure S5. GFP (238) as a trigger, Polo-Like Kinase 1 (PLK1) as a target. Predicted structure of the GFP-PLK1 two-stranded switch in presence and absence of a trigger strand. Grey: anti-target strand; blue: anti-trigger strand; orange: trigger strand (fragment). (a) Predicted structure in absence of the trigger strand. (b) The predicted structure in presence of the trigger strand exhibits the anti-target strand in an inactive conformation (grey, left) as well as an active shRNA-like conformation with higher free energy (grey, right).

Figure S6. GFP (633) as a trigger, Polo-Like Kinase 1 (PLK1) as a target. Predicted structure of the GFP-PLK1 two-stranded switch in presence and absence of a trigger strand. Grey: anti-target strand; blue: anti-trigger strand; orange: trigger strand (fragment). (a) Predicted structure in absence of the trigger strand. (b) The predicted structure in presence of the trigger strand exhibits the anti-target strand in an inactive conformation (grey, left) as well as an active shRNA-like conformation with higher free energy (grey, right).

Figure S7. Post-transcriptional processing of switch strands with RNase H. Schematic representation and 8 M urea gels demonstrating the removal of the RNA promoter starting sequence with RNase H.

Figure S8. Incubation of RNA switches at various temperatures. The performance of the RNA switch was studied in vitro at various incubation temperatures. Native PAGE analysis was used to assess the ability of the RNA switch to be triggered by an mRNA fragment at various temperatures (2nd, 3rd and 4th lanes from the right), as well as its ability to properly reassemble after thermal denaturation (far right lane). The released DS RNA anti-target strand is boxed in red. Native PAGE was visualized using SYBR Gold total RNA staining.

Figure S9. Experimental verification of RNA switch formation and function. Native PAGE experiments with total SYBR Gold staining. Results demonstrate the formation of two-stranded switches (left lanes in all gels). However, the interaction with the corresponding mRNA fragments leads to the release of DS RNA containing shRNA-like structures only in the case of the GFP (633) mRNA trigger. The cognate switches and mRNA fragments are shown in red.

Figure S10. Fluorescence microscopy images of cells transfected with anti-CTGF Dicer substrate RNA (DS RNA) prior to transfection with the CTGF-eGFP switch. Cells were initially transfected with either 5 or 50 nM anti-CTGF DS RNA. Two days later, cells were transfected with 50 nM of either the CTGF-eGFP switch or the anti-target strand. For each transfection, Lipofectamine 2000-only samples were used as a negative control. Microscopy was performed 3 days following the second transfection.

Figure S11. eGFP silencing for the CTGF-eGFP switch compared to a no-toehold control. Shown is the mean eGFP fluorescence signal (measured 1 (top), 2 (middle) or 3 (bottom) days after transfection) as obtained via flow cytometry for a functional switch (black) and for a control switch (red) in which the toehold region of the anti-target strand has been removed. The standard error (shown as two horizontal bars of the same color) cannot be resolved at the chosen line width. Fluorescence signal values are in each case plotted for the range of to as a function of the concentration of the utilized RNA complexes. The reduced difference between the RNA switch and the no-toehold control on day 3 may be interpreted as evidence for degradation of the switch construct. In all cases, the no-toehold switch leads to a higher fluorescence signal, indicating less eGFP silencing. Even though the differences are small in some cases, they are statistically significant in all cases (two-sample t-test, P < 0.008, n = 10000).

Figure S12. Predicted structure of the partially degraded CTGF-eGFP switch contains an active eGFP-DS helix. For this prediction, the anti-trigger strand is assumed to have been cleaved by a nuclease. The simulated RNA sequences are the anti-target strand (gray) and the hypothetical 5′ and 3′ cleavage products of the anti-trigger strand (shown in blue and orange, respectively). The shorter cleavage product (shown in orange) is predicted to be dissociated from the switch complex. Because the 5′ and 3′ ends of the anti-target strand are predicted to form a long eGFP Dicer-substrate helix, the predicted structure of the partially degraded switch corresponds to an active conformation that can potentially lead to the knockdown of eGFP even without the CTGF trigger RNA.

Estimation of Entropy Contributions of Nucleic Acid Complex Formation

Let there be a set of nucleic acid strands, each having the concentration $c$. We approximate the entropy change $\Delta S_{i,j}$ that occurs due to the formation of a base pair between residues $i$ and $j$. The entropy change is proportional to the logarithm of the reduction in the number of available states, which is expressed through the probability $p_{i,j}$ that this smaller number of states is attained:

$$\Delta S_{i,j} = R \ln p_{i,j}$$

The probability $p_{i,j}$ of the next folding event involving the formation of a base pair between residues $i$ and $j$ that have an estimated maximum distance $r_{i,j}$ is estimated as

$$p_{i,j} = a \, r_{i,j}^{-\kappa}$$

with the exponent $\kappa$ being set to $\kappa = 3$ for residues that belong to two different strands and $\kappa = 1.8$ for two nucleotides that belong to the same strand. This choice of exponent is based on the reported observation that rates for folding events involving loops of size $n$ decrease with $n^{-1.8}$ (see [2]). The proportionality factor $a$ is fitted such that HyperFold is able to reproduce the reported melting temperatures of several RNA duplexes [1].

The maximum distance $r_{i,j}$ between two residues $i$, $j$ within a single-stranded region is assumed to be

$$r_{i,j} = r_{\mathrm{adj}} \, |j - i|$$

The constant for the maximum distance of adjacent residues, $r_{\mathrm{adj}} = 6.52$ Å, has been obtained by analyzing maximum residue-residue distances within RNA 3D structures consisting of one RNA strand. The maximum distance $r_{i,j}$ between two residues $i$, $j$ that belong to two different strands not yet connected via base pairing is estimated from the strand concentration $c$ of each strand (the explicit expression is garbled in the source).
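The per-base-pair entropy estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HyperFold implementation: the function names are hypothetical, the value used for the fitted factor `a` is a placeholder (the fitted value is not given here), the gas constant is expressed in kcal/(mol K) as an assumed unit choice, and only the same-strand distance rule is implemented because the inter-strand distance expression is not available in this text.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K); the unit choice is an assumption
R_ADJ = 6.52        # maximum distance of adjacent residues in Angstrom (from the text)
KAPPA_INTER = 3.0   # exponent for residues on two different strands (from the text)
KAPPA_INTRA = 1.8   # exponent for residues on the same strand (from the text)
A_PLACEHOLDER = 1.0  # hypothetical stand-in for the fitted proportionality factor a


def max_distance_same_strand(i: int, j: int, r_adj: float = R_ADJ) -> float:
    """Maximum distance r_ij = r_adj * |j - i| within a single-stranded region."""
    return r_adj * abs(j - i)


def pair_probability(r: float, same_strand: bool, a: float = A_PLACEHOLDER) -> float:
    """Probability estimate p_ij = a * r_ij ** (-kappa)."""
    kappa = KAPPA_INTRA if same_strand else KAPPA_INTER
    return a * r ** (-kappa)


def pair_entropy(r: float, same_strand: bool, a: float = A_PLACEHOLDER) -> float:
    """Entropy change dS_ij = R * ln(p_ij) for forming one base pair."""
    return R_KCAL * math.log(pair_probability(r, same_strand, a))


if __name__ == "__main__":
    # Residues 10 and 15 on the same strand: five adjacent-residue steps apart.
    r = max_distance_same_strand(10, 15)
    print(r, pair_entropy(r, same_strand=True))
```

With any fixed `a`, the entropy change becomes more negative as the maximum distance grows, which reflects the larger loss of conformational freedom when distant residues pair.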

The distance between two residues that form a base pair is set to 14.98 Å. For all other cases, the maximum distances between nucleotide pairs are determined recursively such that

$$|r_{i,j} - r_{i,j+1}| < r_{\mathrm{adj}}, \quad |r_{i,j} - r_{i,j-1}| < r_{\mathrm{adj}}, \quad |r_{i,j} - r_{i+1,j}| < r_{\mathrm{adj}}, \quad |r_{i,j} - r_{i-1,j}| < r_{\mathrm{adj}}$$

The total entropy change of a structure with $n$ base pairs $(i_1, j_1), (i_2, j_2), \ldots, (i_n, j_n)$ is computed as

$$\Delta S = \sum_{k=1}^{n} \Delta S_{i_k, j_k}$$

The free energy of folding is computed as

$$\Delta G = \Delta G_0 - T \Delta S$$

where $\Delta G_0$ is the free energy of folding estimate based on nearest-neighbor energy parameters as described in the main manuscript.

Supplementary References

[1] Freier, Susan M., Ryszard Kierzek, John A. Jaeger, Naoki Sugimoto, Marvin H. Caruthers, Thomas Neilson, and Douglas H. Turner. "Improved free-energy parameters for predictions of RNA duplex stability." Proceedings of the National Academy of Sciences 83, no. 24 (1986).
[2] Zhang, Wenbing, and Shi-Jie Chen. "Exploring the complex folding kinetics of RNA hairpins: II. Effect of sequence, length, and misfolded states." Biophysical Journal 90, no. 3 (2006).
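The summation of base-pair entropies and the free-energy correction given before the Supplementary References can be illustrated with a short Python sketch. The function names and all numeric inputs are hypothetical and chosen only to show the arithmetic; units are assumed to be kcal/mol for free energies, kcal/(mol K) for entropies and kelvin for temperature.

```python
import math
from typing import Iterable


def total_entropy(pair_entropies: Iterable[float]) -> float:
    """Total entropy change: sum of the per-base-pair entropy changes."""
    return sum(pair_entropies)


def folding_free_energy(delta_g0: float, temperature_k: float, delta_s: float) -> float:
    """Free energy of folding: dG = dG0 - T * dS.

    delta_g0 stands for the nearest-neighbor free energy estimate.
    """
    return delta_g0 - temperature_k * delta_s


if __name__ == "__main__":
    # Hypothetical entropy changes for three base pairs, in kcal/(mol*K).
    ds = total_entropy([-0.010, -0.012, -0.008])
    # Hypothetical nearest-neighbor estimate of -12 kcal/mol at 37 degrees C.
    dg = folding_free_energy(-12.0, 310.15, ds)
    print(ds, dg)
    assert math.isclose(ds, -0.030)
```

Because each base-pair entropy change is negative in this model, the correction term raises the free energy relative to the nearest-neighbor estimate alone.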

Sequences

Connective Tissue Growth Factor (CTGF) as a trigger, enhanced Green Fluorescent Protein (eGFP) as a target:

Asymmetric Dicer substrate (DS) siRNA duplex designed against eGFP
sense: 5′-pACCCUGAAGUUCAUCUGCACCACCG-3′
antisense: 5′-CGGUGGUGCAGAUGAACUUCAGGGUCA-3′

Asymmetric Dicer substrate (DS) siRNA duplex designed against CTGF
sense: 5′-pCCCAGACCCAACUAUGAUUAGAGCC-3′
antisense: 5′-GGCUCUAAUCAUAGUUGGGUCUGGGCC-3′

Anti-Target
5′-ACCCUGAAGUUUAUUUGUAUCAUUGCAAACAACUGUCCCGGAGACAAUUAAACUUCAGGGUAAUUAUUCUGGUGGUGCAGAUGAACUUCAGGGUAA-3′

Anti-Trigger (toehold is underlined)
5′-UCCUGUAGUACAGCGAUUCAAAGAUGUCAUUGUCUCCGAAAGGACAGUUGAAAUAAUGGCAGGGCCAUUAAAAGCAAUGAUACAAA-3′

Anti-Trigger without toehold
5′-AUUGUCUCCGAAAGGACAGUUGAAAUAAUGGCAGGGCCAUUAAAAGCAAUGAUACAAA-3′

Anti-Trigger with dummy toehold (toehold is underlined)
5′-UCCAUAACCUCCAAACACAACCCACAACAUUGUCUCCGAAAGGACAGUUGAAAUAAUGGCAGGGCCAUUAAAAGCAAUGAUACAAA-3′

Trigger mRNA fragment (sequence complementary to toehold is underlined)
5′-AAGACCUGUGCCUGCCAUUACAACUGUCCCGGAGACAAUGACAUCUUUGAAUCGCUGUACUACAGGAAGAUGUACGG-3′

The chosen CTGF mRNA region is based on shRNA D in Benwith et al., Cancer Res February 1, 2009 vol. 69 no : 5′-UGACAUCUUUGAAUCGCUG-3′ (target site on CTGF mRNA, pos with respect to sequence NM_ ).

GFP (121) as a trigger, Polo-Like Kinase 1 (PLK1) as a target:

Anti-Target
5′-CCAUUAAUGAGUUGCUUAAUGAUCAAAUGAAGUUCAGCACCACCGGAACUCAUUAAUGGCUCACCAUUAACGAGUCAUUAAGCAGCUCGUUAAUGGUG-3′

Anti-Trigger
5′-GGGCACGGGCAGCUUGCCGGUGGUGCAGAUGAACUUCAGGGUCAGCUUGCCAAGCUGAAAAGAUCAUUAAGC-3′

Trigger GFP mRNA fragment
5′-GGCAAGCUGACCCUGAAGUUCAUCUGCACCACCGGCAAGCUGCCCGUGCCCUGGCCCACCCUCGUGACCAC-3′

GFP (238) as a trigger, Polo-Like Kinase 1 (PLK1) as a target:

Anti-Target
5′-CCAUUAAUGAGUUGCUUAAUGAUCAAAUCUUCAAGUCCAUGCCCGAAACUCAUUAAUGGCUCACCAUUAACGAGUCAUUAAGCAGCUCGUUAAUGGUG-3′

Anti-Trigger
5′-GGGAACUCCUGGACGUAGCCUUCGGGCAUGGCGGACUUGAAGAAGUCGUGCUGCUUCAGCACGAAAGAUCAUUAAGC-3′

Trigger GFP mRNA fragment
5′-AAGCAGCACGACUUCUUCAAGUCCGCCAUGCCCGAAGGCUACGUCCAGGAGCGCACCAUCUUCUUCAAGGA-3′

GFP (633) as a trigger, Polo-Like Kinase 1 (PLK1) as a target:

Anti-Target
5′-CCAUUAAUGAGUUGCUUAAUGAUGCAAACGCGAUCACGUCCUGCUGGAACUCAUUAAUGGCUCACCAUUAACGAUGUCAUUAAGCAGCUCGUUAAUGGUG-3′

Anti-Trigger
5′-GCGGCGGUCACGAACUCCAGCAGGACCAUGUGAUCGCGCUUCUCGUUGGGGCAACGAGAAAGCAUCAUUAAGC-3′

Trigger GFP mRNA fragment
5′-CCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGG-3′

Sequences used in re-association studies:

Fluorescently labeled Dicer substrate RNA (DS RNA) selected against eGFP
RNA sense, 3′-end labeled with Alexa488: pACCCUGAAGUUCAUCUGCACCACCG-Alexa488
RNA antisense, 5′-end labeled with Alexa546: Alexa546-CGGUGGUGCAGAUGAACUUCAGGGUCA

DNA strands designed to form hybrids with 60 % GC in toeholds (toeholds are underlined):

DNAs for sense
10 nts toehold: tgtggataggcggtggtgcagatgaacttcagggt
8 nts toehold: tggataggcggtggtgcagatgaacttcagggt
6 nts toehold: gataggcggtggtgcagatgaacttcagggt
4 nts toehold: taggcggtggtgcagatgaacttcagggt
2 nts toehold: ggcggtggtgcagatgaacttcagggt
0 nts toehold: cggtggtgcagatgaacttcagggt

DNAs for antisense
10 nts toehold: accctgaagttcatctgcaccaccgcctatccaca
8 nts toehold: accctgaagttcatctgcaccaccgcctatcca
6 nts toehold: accctgaagttcatctgcaccaccgcctatc
4 nts toehold: accctgaagttcatctgcaccaccgccta
2 nts toehold: accctgaagttcatctgcaccaccgcc
0 nts toehold: accctgaagttcatctgcaccaccg

DNA strands designed to form hybrids with 25 % GC in toeholds (toeholds are underlined):

DNAs for sense
10 nts toehold: 5′-tatgtaaactcggtggtgcagatgaacttcagggt
8 nts toehold: 5′-tgtaaactcggtggtgcagatgaacttcagggt
6 nts toehold: taaactcggtggtgcagatgaacttcagggt
4 nts toehold: aactcggtggtgcagatgaacttcagggt
2 nts toehold: ctcggtggtgcagatgaacttcagggt

DNAs for antisense
10 nts toehold: accctgaagttcatctgcaccaccgagtttacata
8 nts toehold: accctgaagttcatctgcaccaccgagtttaca
6 nts toehold: accctgaagttcatctgcaccaccgagttta
4 nts toehold: accctgaagttcatctgcaccaccgagtt
2 nts toehold: accctgaagttcatctgcaccaccgag
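The G+C content of individual toeholds in the DNA lists can be checked with a small Python sketch. The underlining that marks the toeholds is not preserved in this text, so the sketch assumes (from comparison with the 0 nts strands) that the toehold is the 5′-terminal extension of the "DNAs for sense" strands and the 3′-terminal extension of the "DNAs for antisense" strands. The function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.lower()
    return sum(base in "gc" for base in seq) / len(seq)


def toehold(strand: str, length: int, at_5prime_end: bool) -> str:
    """Return the assumed toehold: the first or last `length` bases of the strand."""
    if length == 0:
        return ""
    return strand[:length] if at_5prime_end else strand[-length:]


if __name__ == "__main__":
    # 10 nts toehold strands copied from the lists above.
    sense_60 = "tgtggataggcggtggtgcagatgaacttcagggt"      # "DNAs for sense", 60 % set
    antisense_25 = "accctgaagttcatctgcaccaccgagtttacata"  # "DNAs for antisense", 25 % set

    for name, strand, five_prime in [
        ("sense, 60 % set", sense_60, True),
        ("antisense, 25 % set", antisense_25, False),
    ]:
        t = toehold(strand, 10, five_prime)
        print(name, t, gc_fraction(t))
```

The nominal 60 % and 25 % values are approximate (Figure S2 states them with "~"), so the fraction computed for a single toehold need not match the nominal value exactly.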