Supplementary Figure 1.


[Panels a and c: bar charts. Y-axes: percentage of targeted mutagenesis (TM) in a, percentage of targeted gene insertion (TGI) in c. X-axis groups: controls and Mn17038 with sctrex2 in a; controls and Mn17038 with donor matrix in c.]

b
WT: ggtggggtgttttacgttgtacgacgtctagcagccttt
3:  ggtggggtgttttacgttg---gacgtctagcagccttt 27%
3:  ggtggggtgttttacgtt---cgacgtctagcagccttt 11%
4:  ggtggggtgttttacgtt----gacgtctagcagccttt 33%

Supplementary Figure 1. Targeted genome modifications induced by engineered meganuclease Mn17038.
(a) Targeted mutagenesis (TM) frequencies induced by meganuclease Mn17038 in combination with the DNA processing enzyme sctrex2 (colony 4). Events were detected by amplicon sequencing using primers flanking the nuclease target site. All colonies were obtained by co-transformation with the plasmid conferring resistance to the antibiotic nourseothricin. The baseline controls correspond to colonies transformed with a plasmid carrying the antibiotic resistance gene alone (colonies 1 to 3).
(b) Nature of mutagenic events induced by Mn17038 in the mosaic colony. The underlined sequence corresponds to the recognition site of Mn17038. The percentage of each event is indicated on the right.
(c) Targeted gene insertion (TGI) frequencies induced by meganuclease Mn17038 in the presence of the donor matrix (colonies 3 and 4), as measured by deep sequencing. The baseline controls correspond to colonies transformed with a plasmid carrying the antibiotic resistance gene in the presence of the donor matrix (colonies 1 and 2).
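The alignment in panel (b) can be read mechanically. The following is a minimal sketch, not the authors' analysis pipeline, that takes the three gapped reads exactly as printed and reports where each deletion starts, how many bases it removes, and which wild-type bases are lost. The function name `describe_deletion` and the 0-based coordinates are illustrative assumptions.

```python
# Wild-type sequence and gapped reads copied from panel (b); '-' marks a deleted base.
WT = "ggtggggtgttttacgttgtacgacgtctagcagccttt"
EVENTS = [
    ("ggtggggtgttttacgttg---gacgtctagcagccttt", 27),
    ("ggtggggtgttttacgtt---cgacgtctagcagccttt", 11),
    ("ggtggggtgttttacgtt----gacgtctagcagccttt", 33),
]


def describe_deletion(wild_type, aligned):
    """Return (start, size, deleted_bases) for one gapped read aligned to wild_type.

    Coordinates are 0-based (an assumption of this sketch). The read must
    contain exactly one contiguous run of '-' and otherwise match the wild type.
    """
    if len(wild_type) != len(aligned):
        raise ValueError("aligned read and wild type must have equal length")
    gaps = [i for i, base in enumerate(aligned) if base == "-"]
    if not gaps:
        raise ValueError("read contains no deletion")
    start, size = gaps[0], len(gaps)
    if gaps != list(range(start, start + size)):
        raise ValueError("deletion is not one contiguous run")
    for i, base in enumerate(aligned):
        if base != "-" and base != wild_type[i]:
            raise ValueError(f"mismatch outside the deletion at position {i}")
    return start, size, wild_type[start:start + size]


deletions = [describe_deletion(WT, read) for read, _ in EVENTS]
for (start, size, bases), (_, percent) in zip(deletions, EVENTS):
    print(f"{size} bp deletion at position {start} ({bases}), {percent}% of reads")
```

The deletion sizes computed this way agree with the numbers printed to the left of each read in the panel.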

[Diagram. Round 1, Round 2, Rounds 3 to (N-1): Immobilization, Ligation, Wash, Release. Round N: Immobilization, Ligation, Release, Wash, Cloning.]

Supplementary Figure 2. Graphical representation of a novel high-throughput assembly method developed to assemble TALE repeats. Each block (except the last one, which contains the half repeat and is marked with an *) was attached to a solid support (magnetic beads coated with streptavidin). The first step consisted of coupling the last block (containing the terminal half repeat) to a pre-immobilized block coding for the previous repeats (considering an N- to C-terminal orientation of the array). The iterative elongation was performed by release (restriction) of the nascent chain and coupling (ligation) of the released product to the next immobilized block. When the desired array length was reached, depending on block size and the number of iterative rounds, the product was released from the array and ligated into the destination plasmid. Enzymatic steps are underlined.
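The order of operations in this reverse assembly can be sketched as a toy simulation. It tracks only the order in which blocks are joined. It assumes that each round adds exactly one immobilized block upstream of the nascent chain; the block names `B1` to `B4*` are hypothetical and no enzymatic detail is modeled.

```python
def reverse_assemble(blocks):
    """Simulate elongation from the C-terminal block toward the N-terminal one.

    `blocks` is listed in N- to C-terminal order; the last entry is the free
    block carrying the terminal half repeat. Returns the final array and a
    per-round history of the released product.
    """
    if len(blocks) < 2:
        raise ValueError("need at least one immobilized block plus the last block")
    *immobilized, last = blocks
    nascent = [last]  # the only block that is never attached to beads
    history = []
    for round_number, bead_block in enumerate(reversed(immobilized), start=1):
        # Ligation: the free nascent chain is coupled to the immobilized block.
        nascent = [bead_block] + nascent
        # Wash, then release (restriction) frees the product for the next round.
        history.append((round_number, "-".join(nascent)))
    return nascent, history


# Hypothetical four-block array; '*' marks the block with the half repeat.
final_array, rounds = reverse_assemble(["B1", "B2", "B3", "B4*"])
for round_number, product in rounds:
    print(f"round {round_number}: released {product}")
```

In this model an array of N blocks needs N-1 coupling rounds, after which the released product would be cloned into the destination plasmid.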

a.
ATTTGCGTTCCTTCAGGCAAAAATGGAAGCCGGAGGCTGTGCTCCATCGGCTATTGCCGCCTTCGAGTCAAGCGCCACGCTTC
CCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCTTGGGTGCACG
AGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTCTTCATATCCTGCAGGGTACGTTTAAACGTAT
TAATTAAGACCTAGCATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAATTTCCGGAATGATTTTGGAAGACTCTATTGCGCCCG
TCCCCCAG

b.
GGCGCGGACCTCCCTCGCGGCCTGGAATCTCTTTCTGGCCCTCTTTTCCCTCGTCGGCATACTGTTATTCCTGACTGTGAAAC
CAAAGCGGAGTGCCAGGCCCACGGCTTGACCCCGGAGCAGGTGGTGGCCATCGCCAGCAACTTTCCCCAGCTTGTACACAACC
TCGCGACGCTCACGCTCCGGGAAAATCTCTGCGCCAAT

c.
GTTTTCTGTATCCATTTTCCACAACACTGTTAATGCCTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATC
GTTTTCGTTGCGCATACCGAGTACCCACAGGACTGTTTTCCTTCCGCAAGTGACGTTGAATGCCAAAAGATGGATGGTAGCAA
CGGGGGTAGAAACAAACGCTGCTGTGGCAACTCCAGAAAATGACGAA

Supplementary Figure 3. Characterization of insertion sequences found in the Tn19745_1, Tn23158_1 and Tn19746_1 strains. Sequences correspond to fragments of plasmids used for transformation.
(a) The sequence of the 228 bp insertion and the sequence targeted by the TALEN are in green and red letters, respectively. The in-frame stop codon is underlined in blue. The sequences highlighted in grey correspond to a fragment of the ampr gene. The sequences highlighted in yellow and light blue correspond to fragments of the ColE1-derived plasmid replication origin.
(b) The sequence of the 83 bp insertion and the sequence targeted by the TALEN are in green and red letters, respectively. The in-frame stop codon is underlined in blue. The sequence highlighted in yellow corresponds to the P. tricornutum fcpa, fcpb, fcpc, and fcpd genes, complete CDS. The sequence highlighted in grey corresponds to the TALEN coding sequence.
(c) The sequence of the 47 bp insertion, which corresponds to a fragment of the transformation plasmid, is in green letters. The sequence targeted by the TALEN is in red letters. The in-frame stop codon is underlined in blue.

[Gel image. Size ladder labels: 500 bp, 400 bp, 300 bp, 200 bp, 150 bp, 100 bp, 50 bp.]

Supplementary Figure 4. Molecular characterization of colonies transformed with TALEN. Example of mutagenic events detected by the occurrence of a cleavage product induced by the T7 endonuclease. Colonies presenting the highest targeted mutagenesis frequency after transformation with their respective TALEN are presented in this panel, except for colonies Tn19746 and Tn23157, which harbor 6% and 12% targeted mutagenesis, respectively. A weak T7 endonuclease signal was observed for these two mutants, and the cleavage products are indicated with black arrows. For each locus, the controls correspond to colonies transformed with the plasmid for nourseothricin resistance alone (Ct_1 to Ct_7).

Supplementary Figure 5.

Locus targeted by Mn17038: Control (816); Colony with 6.9% TM (2674)

Locus targeted by Mn17181: Control (7146); Colony with 4.8% TM (6456); Colony with 1.2% TM (12738); Colony with 8.3% TM (8025); Colony with 2.5% TM (3742); Colony with 15% TM (4404)

Locus targeted by Mn17038 (TGI experiment): Control (6581 sequences analyzed); Colony with 0.06% TGI and 0.278% TM (5028 sequences analyzed); Colony with 0.197% TGI and 0.09% TM (5574 sequences analyzed)

Locus targeted by Mn17181 (TGI experiment): Control (9512 sequences analyzed); Colony with 0.213% TGI and 0.075% TM (9377 sequences analyzed); Colony with 0.079% TGI and 0.295% TM (9377 sequences analyzed); Colony with 0.949% TGI and 1.347% TM (12325 sequences analyzed); Colony with 1.042% TGI and 1.347% TM (11134 sequences analyzed); Colony with 2.277% TGI and 0.934% TM (12738 sequences analyzed)

Putative palmitoyl-acp thioesterase; Enoyl-ACP reductase: Control (1820); Control (2891); Colony with 22% TM (1708); Colony with 12% TM (3148)

Omega-3 fatty acid desaturase: Control with 0.06% TM (1586 sequences analyzed); Colony with 93% TM (520 sequences analyzed); Colony with 1.4% TM (1350 sequences analyzed); Colony with 14% TM (709 sequences analyzed)

Glycerol-3-phosphate dehydrogenase: Control (2757); Colony with 48% TM (1486); Colony with 1.56% TM (2953); Colony with 59.6% TM (7782); Colony with 34% TM (1187)

Delta 12-fatty acid desaturase; Long chain acyl-coa elongase: Control (2958); Control with 0.01% TM (7854); Colony with 100% TM (2450); Colony with 6% TM (10712); Colony with 21% TM (8714)

UDP Glucose pyrophosphorylase: Control (16326); Colony with 85% TM (2459); Colony with 75% TM (444); Colony with 85% TM (19748)

Supplementary Figure 5. Nature of mutagenic events detected by deep sequencing. Colonies positive in the T7 endonuclease assay (positive for targeted mutagenic events) were analyzed by deep sequencing. A PCR amplicon encompassing the target site was generated using primers carrying the specific adaptors required for high-throughput sequencing (HTS). The number of sequence reads is indicated in parentheses. The sequence of the target is underlined in purple. The white and black boxes indicate deletion and insertion events, respectively.

[Panels, each shown before filtering and after filtering: Control (7146); Mn17181 colony with 15% TM (4404).]

Supplementary Figure 6. Example of deep sequencing analysis before and after filtering. Results obtained from PCR on the colony showing 15% targeted mutagenesis and its corresponding control. Deep sequencing analysis was performed before and after filtering as described in Methods.

Supplementary Table 1. Targeted mutagenesis frequency induced by meganucleases. Percentage of NAT-resistant colonies harboring targeted mutagenic events identified by deep sequencing.

Name | Number of NAT-resistant colonies | Number of NAT-resistant colonies tested | Number of colonies with TM events identified by deep sequencing | Percentage of colonies with TM events
Mn | | | | %
Mn | | | | %

Supplementary Table 2. Targeted gene insertion frequency induced by meganucleases. Percentage of NAT-resistant colonies harboring targeted gene insertion events identified by specific PCR screening.

Name | Number of NAT-resistant colonies | Number of NAT-resistant colonies tested | Number of colonies with TGI events identified by specific PCR screening | Percentage of colonies with TGI events
Mn | | | | % (6/22)
Mn | | | | % (2/9)

Supplementary Table 3. Targeted gene insertion and targeted mutagenesis frequencies induced by meganucleases in TGI-positive colonies. TGI and TM frequencies were quantified by deep sequencing.

Name | Colony # | Percentage of TGI events | Percentage of TM events
Mn | | % | 0.295%
 | | % | 0.075%
 | | % | 0.714%
 | | % | 1.347%
 | | % | 1.347%
 | | % | 0.934%
Mn | | % | 0.278%
 | | % | 0.090%

Supplementary Table 4. Information related to sequences targeted by engineered nucleases. The target genome location, the sequences targeted by the engineered nucleases and the putative function of the targeted gene are indicated.

Name | Genome location | Sequences targeted by engineered nucleases | Putative function
Mn17181 | Chr2_ | TTTTGACGTCGTACGGTGTCTCCG | Putative ammonium transporter
Mn17038 | Chr28_ | GTTTTACGTTGTACGACGTCTAGC | Unknown
Tn19745 | Chr30_80100 | TGCCGCCTTCGAGTCGACCTATGGTAGTCTCGTCTCGGGTGATTCCGGA | UDP Glucose pyrophosphorylase
Tn23158 | Chr31_ | TTTTCCACAACACTGTTAATGCCTTTTCGTTGCGCATACCGAGTACCCA | Omega-3 fatty acid desaturase
Tn19746 | Chr25_ | TCTTTTCCCTCGTCGGCATGCTCCGGACCTTTCCCCAGCTTGTACACAA | Long chain acyl-coa elongase
Tn23159 | Chr11_ | TCTGACCAACTCGATAAAGTATGCATCATCGGTAGCGGTAACTGGGGAA | Glycerol-3-phosphate dehydrogenase
Tn23157 | Chr2_ | TTCCACTGGTTACGGCTGGGCGATCGCCAAAGCTTTGGCCGAAGCAGGA | Enoyl-ACP reductase
Tn19744 | Chr3_ | TGGTCTTTGCCCATGGGATGGGAGATTCGTGCTTTAATTCTGGCATGCA | Putative palmitoyl-acp thioesterase
Tn19743 | Chr3_ | TAGCTCCCAAGAGTGCCACCAGCTCTACTGGCAGTGCTACCCTTAGCCA | Delta 12-fatty acid desaturase

Supplementary Table 5. Synthesis yield of 8 TALE arrays (4 TALENs) using the reverse synthesis method. TALE arrays are typically assembled and screened in less than 1 week using this methodology.

TALEN name | Monomer | Number of clones screened | Number of clones positive in size | Positive (%)
Tn19743 | Left | | |
Tn19743 | Right | | |
Tn19744 | Left | | |
Tn19744 | Right | | |
Tn19745 | Left | | |
Tn19745 | Right | | |
Tn19746 | Left | | |
Tn19746 | Right | | |

Supplementary Table 6. Targeted mutagenesis frequency induced by TALEN. Percentage of NAT-resistant colonies with targeted mutagenic events identified either directly by PCR screening (a) or by T7 endonuclease assay (b).

Name | Number of NAT-resistant colonies | Number of NAT-resistant colonies tested | Number of colonies with TM events | Percentage of colonies with mutagenic events
Tn | | | a + 3 b | 35% (7/20)
Tn | | | a + 4 b | 56% (5/9)
Tn | | | a + 2 b | 11% (3/27)
Tn | | | b | 31% (4/13)
Tn | | | b | 7% (1/14)
Tn | | | b | 14% (1/7)
Tn | | | b | 7% (1/15)
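The percentages in the last column follow directly from the fractions printed next to them. As a quick consistency check, the sketch below recomputes each percentage from its (positive/tested) pair. Rounding to the nearest whole percent is an assumption about how the values were reported, and the helper name `percent_positive` is illustrative.

```python
from fractions import Fraction

# (reported percentage, colonies with TM events, colonies tested), as printed in the table.
REPORTED = [
    (35, 7, 20),
    (56, 5, 9),
    (11, 3, 27),
    (31, 4, 13),
    (7, 1, 14),
    (14, 1, 7),
    (7, 1, 15),
]


def percent_positive(positive, tested):
    """Percentage of tested colonies carrying an event, rounded to a whole percent."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return round(Fraction(100 * positive, tested))


recomputed = [percent_positive(positive, tested) for _, positive, tested in REPORTED]
for (reported, positive, tested), value in zip(REPORTED, recomputed):
    status = "ok" if reported == value else "differs"
    print(f"{positive}/{tested}: reported {reported}%, recomputed {value}% ({status})")
```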

Supplementary Table 7. Nuclease sequences. The Mn17181 and Mn17038 meganucleases used in this study were derived from I-CreI and were used in single-chain form.

Meganuclease | Mutations potentially involved in DNA recognition (Monomer A_Monomer B)
Mn17181 | E33C40A96E132V_8K19S30R32T44K61R68Y70S75Y77Q132V
Mn | 7E33C38A54L70S75H77Y96E132V153G161T_8K19S33T40W61R77H132V

I-CreI sequence:
MANTKYNKEFLLYLAGFVDGDGSIIAQIKPNQSYKFKHQLSLTFQVTQKTQRRWFLDKLVDE
IGVGYVRDRGSVSDYILSEIKPLHNFLTQLQPFLKLKQKQANLVLKIIEQLPSAKESPDKFLEVC
TWVDQIAALNDSKTRKTTSETVRAVLDSLSEKKKSSPAAD

Monomers A and B are joined by the linker GDSSVSNSEHIAPLSLPSSPPSVGS. The 19S mutation is added in monomer B.

Supplementary Table 8. Sequences of the TALEN scaffolds used in this study, derived from AvrBs3: N-terminal domain (full length) and C-terminal domain (+C40) including the FokI catalytic head.

TALEN Left, Nter sequence:
MGDPKKKRKVIDYPYDVPDYAIDIADPIRSRTPSPARELLPGPQPDGVQPTADRGVSPPA
GGPLDGLPARRTMSRTRLPSPPAPSPAFSAGSFSDLLRQFDPSLFNTSLFDSLPPFGAHH
TEAATGEWDEVQSGLRAADAPPPTMRVAVTAARPPRAKPAPRRRAAQPSDASPAAQVDLR
TLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAA
LPEATHEAIVGVGKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVH
AWRNALTGAPLN

TALEN Left, Cter sequence:
SIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLGDPISRSQLVKSELEEKKSE
LRHKLKYVPHEYIELIEIARNSTQDRILEMKVMEFFMKVYGYRGKHLGGSRKPDGAIYTV
GSPIDYGVIVDTKAYSGGYNLPIGQADEMQRYVEENQTRNKHINPNEWWKVYPSSVTEFK
FLFVSGHFKGNYKAQLTRLNHITNCNGAVLSVEELLIGGEMIKAGTLTLEEVRRKFNNGE
INFAAA

TALEN Right, Nter sequence:
MGDPKKKRKVIDKETAAAKFERQHMDSIDIADPIRSRTPSPARELLPGPQPDGVQPTADR
GVSPPAGGPLDGLPARRTMSRTRLPSPPAPSPAFSAGSFSDLLRQFDPSLFNTSLFDSLP
PFGAHHTEAATGEWDEVQSGLRAADAPPPTMRVAVTAARPPRAKPAPRRRAAQPSDASPA
AQVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKY
QDMIAALPEATHEAIVGVGKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVT
AVEAVHAWRNALTGAPLN

TALEN Right, Cter sequence:
SIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLGDPISRSQLVKSELEEKKSE
LRHKLKYVPHEYIELIEIARNSTQDRILEMKVMEFFMKVYGYRGKHLGGSRKPDGAIYTV
GSPIDYGVIVDTKAYSGGYNLPIGQADEMQRYVEENQTRNKHINPNEWWKVYPSSVTEFK
FLFVSGHFKGNYKAQLTRLNHITNCNGAVLSVEELLIGGEMIKAGTLTLEEVRRKFNNGE
INFAAA

RVD sequences used in this study.
TALEN name | Monomer | RVD
Tn19743 | Left | NI-NN-HD-NG-HD-HD-HD-NI-NI-NN-NI-NN-NG-NN-HD-NG#
Tn19743 | Right | NN-NN-HD-NG-NI-NI-NN-NN-NN-NG-NI-NN-HD-NI-HD-NG#
Tn19744 | Left | NN-NN-NG-HD-NG-NG-NG-NN-HD-HD-HD-NI-NG-NN-NN-NG#
Tn19744 | Right | NN-HD-NI-NG-NN-HD-HD-NI-NN-NI-NI-NG-NG-NI-NI-NG#
Tn19745 | Left | NN-HD-HD-NN-HD-HD-NG-NG-HD-NN-NI-NN-NG-HD-NN-NG#
Tn19745 | Right | HD-HD-NN-NN-NI-NI-NG-HD-NI-HD-HD-HD-NN-NI-NN-NG#
Tn19746 | Left | HD-NG-NG-NG-NG-HD-HD-HD-NG-HD-NN-NG-HD-NN-NN-NG#
Tn19746 | Right | NG-NN-NG-NN-NG-NI-HD-NI-NI-NN-HD-NG-NN-NN-NN-NG#
Tn23157 | Left | NG-HD-HD-NI-HD-NG-NN-NN-NG-NG-NI-HD-NN-NN-HD-NG#
Tn23157 | Right | HD-HD-NG-NN-HD-NG-NG-HD-NN-NN-HD-HD-NI-NI-NI-NG#
Tn23158 | Left | NG-NG-NG-HD-HD-NI-HD-NI-NI-HD-NI-HD-NG-NN-NG-NG#
Tn23158 | Right | NN-NN-NN-NG-NI-HD-NG-HD-NN-NN-NG-NI-NG-NN-HD-NG#
Tn23159 | Left | HD-NG-NN-NI-HD-HD-NI-NI-HD-NG-HD-NN-NI-NG-NI-NG#
Tn23159 | Right | NG-HD-HD-HD-HD-NI-NN-NG-NG-NI-HD-HD-NN-HD-NG-NG#

Amino acid sequences of the repeat units (RVDs) used in this study.

RVD name | Targeted base | RVD sequence
NI | A | LTPEQVVAIASNIGGKQALETVQALLPVLCQAHG
HD | C | LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG
NN | G | LTPQQVVAIASNNGGKQALETVQRLLPVLCQAHG
NG | T | LTPQQVVAIASNGGGKQALETVQRLLPVLCQAHG
NG* | Any | LTPQQVVAIASNGGGRPALE
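The RVD-to-base code in the table above (NI to A, HD to C, NN to G, NG to T) lets an RVD array be translated back into the DNA it should recognize. The sketch below decodes the two Tn19745 arrays and compares them with the Tn19745 target listed in Supplementary Table 4. It rests on three assumptions that are not stated in the tables: the first base of each half-site (a T) is not encoded by an RVD, the final half repeat (marked `#` in the arrays, listed as targeting any base) is left out of the comparison, and the right monomer reads the opposite strand. The function names are illustrative.

```python
RVD_TO_BASE = {"NI": "A", "HD": "C", "NN": "G", "NG": "T"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def decode_rvds(array):
    """Translate 'NN-HD-...-NG#' into bases, skipping the terminal half repeat ('#')."""
    full_repeats = [rvd for rvd in array.split("-") if not rvd.endswith("#")]
    return "".join(RVD_TO_BASE[rvd] for rvd in full_repeats)


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


# Tn19745 target (Supplementary Table 4) and RVD arrays (table above).
TARGET = "TGCCGCCTTCGAGTCGACCTATGGTAGTCTCGTCTCGGGTGATTCCGGA"
LEFT = "NN-HD-HD-NN-HD-HD-NG-NG-HD-NN-NI-NN-NG-HD-NN-NG#"
RIGHT = "HD-HD-NN-NN-NI-NI-NG-HD-NI-HD-HD-HD-NN-NI-NN-NG#"

left_site = decode_rvds(LEFT)
right_site = decode_rvds(RIGHT)

# Assumption: position 0 of each half-site is a T that no RVD encodes.
left_match = TARGET[1:1 + len(left_site)] == left_site
right_match = reverse_complement(TARGET)[1:1 + len(right_site)] == right_site

print("left half-site :", left_site, "matches target:", left_match)
print("right half-site:", right_site, "matches target:", right_match)
```

Under these assumptions both decoded half-sites agree with the ends of the 49 bp target, which leaves a 15 bp spacer between two 17 bp half-sites.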

Supplementary Table 9. Primer sequence information.

Use: Monitoring of TM by deep sequencing
Nuclease | Locus | Forward | Reverse
Mn17181 | Putative ammonium transporter | TCAGCTCCATTGGAATGTTGGC | CCCTCCGACCAGGGAACTTACTC
Mn17038 | Unknown | CGGTTGTCATGGATAGCGGAGC | CCCCAGACGATTCGAAGTCGTCC
Tn19745 | UDP Glucose pyrophosphorylase | GTTGAATCGGAATCGCTAACTCG | GACTTGTTTGGCGGTCAAATCC
Tn23158 | Omega-3 fatty acid desaturase | GCGTGTGCTCACCTGTTGTCC | AAGCATGCGCTTCACTTCGCTC
Tn19746 | Long chain acyl-coa elongase | AAGCGCATCCGTTGGTTCC | TCAATGAGTTCACTGGAAAGGG
Tn23159 | Glycerol-3-phosphate dehydrogenase | TCTGCTACTGCTCATCCGCACC | TCGCGACAGGCTTCTGCTAGATC
Tn23157 | Enoyl-ACP reductase | GGACTGTTTCGCTACGGTACATC | GAAATGGTGTATCCGTCCAATCC
Tn19744 | Putative palmitoyl-acp thioesterase | GAAGAACAGTCGCACCTGGTGC | TCCGCCCTAACACCTTCCGC
Tn19743 | Delta 12-fatty acid desaturase | CTCGTCGGTGGTCCGTATTGG | TGGCGAGATCGCGCATCAGG

Use: Monitoring of TGI by PCR screening (Screen Left)
Nuclease | Locus | Forward | Reverse
Mn17181 | Putative ammonium transporter | CCGGCCAGAGTCGAATTGGCCACGTGG | AATTGCGGCCGCGGTCCGGCGC
Mn17038 | Unknown | GCAGCGTACGCAGCCATAGTCCGGAACG | AATTGCGGCCGCGGTCCGGCGC

Use: Monitoring of TGI by PCR screening (Screen Right)
Nuclease | Locus | Forward | Reverse
Mn17181 | Putative ammonium transporter | TTAAGGCGCGCCGGACCGCGGC | GACGACGACGAAAACGTCTTGCGTCCG
Mn17038 | Unknown | TGTTTTACGTTGTTTAAGGCGCGCCG | CCGCATCTCAATCACGTCTTGTTGAAGC

Use: Monitoring of TGI by deep sequencing (Locus specific)
Nuclease | Locus | Forward | Reverse
Mn17181 | Putative ammonium transporter | CCGGCCAGAGTCGAATTGGCCACGTGG | GACGACGACGAAAACGTCTTGCGTCCG
Mn17038 | Unknown | GCAGCGTACGCAGCCATAGTCCGGAACG | CCGCATCTCAATCACGTCTTGTTGAAGC

Sequence of the 29 bp insertion
Matrix | TTAAGGCGCGCCGGACCGCGGCCGCAATT

Use: Screen for the presence of the meganuclease
Nuclease | Forward | Reverse
Mn17181 or Mn17038 | TTAACAATTGAATCTCGCCTATTCATGGTG | TAGCGCTCGAGTTACTAAGGAGAGGACTTTTTCTT

Use: Screen for the presence of the sctrex2
Nuclease | Forward | Reverse
sctrex2 | AATCTCGCCTATTCATGGTG | CCAGACCGGTCTGTGGAGGAG
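The PCR screening for TGI relies on one primer of each pair annealing inside the inserted sequence. As an illustration, the sketch below locates two of the listed primers on the 29 bp insertion matrix, either as written or as a reverse complement. Reading `AATTGCGGCCGCGGTCCGGCGC` as the Screen Left reverse primer and `TTAAGGCGCGCCGGACCGCGGC` as the Screen Right forward primer is an interpretation of the flattened table, and `anneal_site` is an illustrative helper rather than part of any library.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences copied from Supplementary Table 9.
MATRIX = "TTAAGGCGCGCCGGACCGCGGCCGCAATT"
SCREEN_LEFT_PRIMER = "AATTGCGGCCGCGGTCCGGCGC"   # assumed to be the Screen Left reverse primer
SCREEN_RIGHT_PRIMER = "TTAAGGCGCGCCGGACCGCGGC"  # assumed to be the Screen Right forward primer


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def anneal_site(template, primer):
    """Locate a primer on a template.

    Returns ("+", start) if the primer equals a stretch of the template as
    written, ("-", start) if its reverse complement does, and None otherwise.
    Start positions are 0-based on the template as written.
    """
    start = template.find(primer)
    if start != -1:
        return ("+", start)
    start = template.find(reverse_complement(primer))
    if start != -1:
        return ("-", start)
    return None


left_hit = anneal_site(MATRIX, SCREEN_LEFT_PRIMER)
right_hit = anneal_site(MATRIX, SCREEN_RIGHT_PRIMER)
print("Screen Left primer on matrix :", left_hit)
print("Screen Right primer on matrix:", right_hit)
```

Both primers fall entirely within the matrix and point outward from it, which is consistent with each being paired with a locus-specific primer so that a product forms only when the matrix has been inserted at the target.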