Supplementary Information for
Engineering Escherichia coli for production of functionalized terpenoids using plant P450s
Michelle C. Y. Chang, Rachel A. Eachus, William Trieu, Dae-Kyun Ro, and Jay D. Keasling*

Materials and methods
Construction of CAH plasmids
Isolation and characterization of cadinene and 8-hydroxycadinene
Construction of AMO plasmids
Figure S1. ¹H-NMR spectra of cadinene and 8-hydroxycadinene
Table S1. N-terminal sequences of CAH variants
Table S2. N-terminal sequences of AMO variants
Table S3. N-terminal sequence of aacpr ct
Figure S2. In vivo production of cadinene and 8-hydroxycadinene in E. coli with wtcah
Figure S3. Time courses for in vivo production of 8-hydroxycadinene in E. coli
Figure S4. In vivo production of artemisinic acid in E. coli with A13AMO
Figure S5. Mass spectra of amorphadiene and derivatized artemisinic acid produced in E. coli compared to authentic standards
Materials and methods. All synthetic genes were obtained from DNA 2.0 (Menlo Park, CA). pETDuet-1 was purchased from Novagen (San Diego, CA). δ-Cadinene was purchased from Fluka Chemicals (St. Louis, MO). Caryophyllene, cis-nerolidol, L-arabinose, δ-aminolevulinic acid (ALA), Terrific Broth (TB), carbenicillin, tetracycline, chloramphenicol, and dodecane were purchased from Sigma-Aldrich (St. Louis, MO). Silica gel P60 was purchased from SiliCycle (Quebec City, Canada). IPTG, ethyl acetate, glycerol, and agarose were obtained from EM Sciences (Gibbstown, NJ). Phusion polymerase, Taq polymerase, and all restriction enzymes were purchased from New England Biolabs (Ipswich, MA). Shrimp alkaline phosphatase was obtained from Roche Applied Sciences (Indianapolis, IN). ¹H-NMR and ¹³C-NMR spectra were collected in CDCl3 (Cambridge Isotope Laboratories; Cambridge, MA) at 25 °C on a Bruker AV-500 or AV-400 spectrometer at the College of Chemistry NMR Facility at the University of California, Berkeley. All chemical shifts are reported in the standard δ notation of parts per million. Analytical thin-layer chromatography was performed using SiliCycle 60 F254 silica gel (precoated sheets, 0.25 mm thick) with 5% (v/v) phosphomolybdic acid stain (Acros Organics; Morris Plains, NJ) to visualize terpenoids.

Construction of CAH plasmids. The native CAS gene was amplified from ptrc99a-cas and inserted into the KpnI-XbaI sites of pBAD33 using standard protocols. The primers used to amplify the CAS gene were: 5′-GGG GTA CCA AGG AGA TAT ACC ATG GCT TCA CAA GTT TCT CAA ATG CCT TCT TCA TCA CCC-3′ and 5′-GCT CTA GAT CAA AGT GCA ATT GGT TCA ATG AGC AAT GAA GTG ATT CCA CCC-3′. The pbad33-cas construct was sequenced using the following primers: 5′-AAT TGC TTC ATC CAA TAT ATC TTC C-3′, 5′-ATG GCT TCA CAA GTT TCT CAA ATG CC-3′, 5′-TCT TTC AGT ATA CCA AGA TAT TGA GTC CC-3′, 5′-AGG CTA ATG CAT TGC CAA CTT GTG G-3′.
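As a quick sanity check on the cloning design described above, the following Python sketch locates the KpnI and XbaI recognition sites in the two CAS amplification primers. The recognition sequences (GGTACC for KpnI, TCTAGA for XbaI) are the standard ones; the function name `site_position` and the dictionary layout are illustrative assumptions and are not part of the original methods.

```python
# Standard recognition sequences for the two enzymes named in the text.
# Both sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "KpnI": "GGTACC",
    "XbaI": "TCTAGA",
}

# CAS amplification primers as listed in the text, written 5' to 3'.
CAS_FORWARD = "GGG GTA CCA AGG AGA TAT ACC ATG GCT TCA CAA GTT TCT CAA ATG CCT TCT TCA TCA CCC"
CAS_REVERSE = "GCT CTA GAT CAA AGT GCA ATT GGT TCA ATG AGC AAT GAA GTG ATT CCA CCC"


def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    sequence = primer.replace(" ", "").upper()
    return sequence.find(RECOGNITION_SITES[enzyme])


if __name__ == "__main__":
    for label, primer in (("forward", CAS_FORWARD), ("reverse", CAS_REVERSE)):
        for enzyme in RECOGNITION_SITES:
            print(label, enzyme, site_position(primer, enzyme))
```

Each primer should carry exactly one of the two sites near its 5′ end, preceded by a short overhang, which is consistent with directional insertion into the KpnI-XbaI sites.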
Synthetic genes were obtained from DNA 2.0 with the CAH variant (scah, trcah, Bov-CAH, A13-CAH, A17-CAH, PD1-CAH, ompa-a17-cah, mistic-cah) inserted in the NdeI-AvrII sites of pETDuet-1 containing ctcpr inserted in the NcoI-AflII sites. The native CAH was amplified from pbks-cah and inserted into the NdeI-AvrII sites of petduet-ctcpr containing a synthetic codon-optimized CPR from C. tropicalis (ctcpr) inserted in the NcoI-AflII sites using standard protocols. The primers used to amplify the CAH gene were: 5′-ATC ATA TGT TGC AAA TAG CTT TCA GCT CG-3′ and 5′-TAT TCC TAG GTT ACT TCA TAT AGT GCT GGA GAT TTG ATG G-3′. The pduet-ctcpr-ncah construct was sequenced using the following primers: 5′-GCA CTT GTT CCC TCC TTA GAC C-3′, 5′-ACA GTA GTC CAG ATT GGA GAA TGA AG-3′, 5′-GCC GTC GAT GAG ACT CAC TTG-3′.

The aacpr ct gene was obtained as a synthetic codon-optimized gene from DNA 2.0, with the native N-terminal sequence replaced with the N-terminal leader from ctcpr. Digestion of the pj4-aacpr ct plasmid with BsaI yielded the aacpr ct gene with ends compatible for replacement of ctcpr in the NcoI-AflII site of petduet-ctcpr-bovcah. The petduet-aacpr ct-BovCAH construct was sequenced using the following primers: 5′-ACC GTG TTC TTC GGC ACC C-3′, 5′-GAT GAT ACC TCC GTG GCA ACC-3′, 5′-GTG CTC TGG TGT ATG AGC AGA CT-3′, 5′-TCG CCA TCA CCG TAC GTA GCC-3′.
The BovCAH gene was obtained as the NdeI-XbaI fragment from digestion of petduet-ctcpr-BovCAH and inserted into the NdeI-XbaI site of pCWori. The ctcpr gene was amplified from petduet-ctcpr-bovcah and inserted into the XbaI-SalI site of the pcwori-bovcah construct using standard protocols. The primers used to amplify the ctcpr gene were: 5′-GCT CTA GAA AGG AGA TAT ACC ATG GCG CTG GAT AAA CTG GAT CTG TAT GTG ATC-3′ and 5′-TAT TGT CGA CTT ACC ATA CAT CTT CCT GGT AGC GAT TCT GAA CTT TCC-3′. The pcwori-bovcah-ctcpr construct was sequenced using the following primers for Bov-CAH: 5′-ATC CGT AAA CCG AAG AAA GAT ATC GC-3′, 5′-TTA ATG GAG CTG TGG GAC AGC ATC-3′, 5′-CTG GCT CGC TTC GAC ATC CA-3′. The ctcpr gene was sequenced using the following primers: 5′-ATG GCG CTG GAT AAA CTG GAT CTG T-3′, 5′-CTG AAG GTA TTG ACC TGA CCA AGG GT-3′, 5′-TGC CGG TTC ACG TTC GTC GCA GC-3′, 5′-ATA TTC CGC GAA ACG ATC ACC ACC C-3′.

The CAS gene was amplified from ptrc99a-cas and inserted into the AflII site in the pduet-ctcpr-BovCAH construct using standard protocols. The primers used to amplify the CAS gene were: 5′-AAT CTT AAG AAG GAG ATA TAC CAT GGC TTC ACA AGT TTC TCA AAT GCC TTC TTC ATC ACC C-3′ and 5′-AAT CTT AAG TCA AAG TGC AAT TGG TTC AAT GAG CAA TGA AGT GAT TCC ACC C-3′. The transformants were screened by colony PCR for directionality before sequencing. The pduet-ctcpr-cas-bovcah construct was sequenced with the same primers used for pbad33-cas.

Isolation and characterization of cadinene (3) and 8-hydroxycadinene (4). 500 ml of TB with 2% glycerol containing carbenicillin (50 µg/l), chloramphenicol (25 µg/l), and tetracycline (5 µg/l) in a 2-L baffled shake flask was inoculated with 10 ml of an overnight LB culture of BL21(DE3)/pMBIS freshly transformed with pmevt and pduet-ctcpr-cas-bovcah. The cultures were grown at 30 °C at 150 rpm to OD600 = 0.25-0.30 before inducing with IPTG (1 mM). At this time, ALA (65 mg/l) was added to the culture and the temperature was dropped to 25 °C for 5 d.
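The induction conditions above translate into absolute amounts per flask with simple arithmetic. The Python sketch below works this out for the 500 ml culture, including the 5% (v/v) dodecane overlay used for cultures grown for isolation of 4. The molar mass of IPTG (about 238.3 g/mol) is an assumption supplied here and is not stated in the text; the function name `culture_additions` is likewise illustrative.

```python
# Values taken from the culture description in the text.
CULTURE_VOLUME_ML = 500.0    # TB culture volume per flask
INOCULUM_ML = 10.0           # overnight LB culture used to inoculate
ALA_MG_PER_L = 65.0          # delta-aminolevulinic acid added at induction
IPTG_MM = 1.0                # inducer concentration (mmol/L)
DODECANE_PERCENT_VV = 5.0    # overlay for cultures grown for isolation of 4

# Assumption (not stated in the text): molar mass of IPTG in g/mol.
IPTG_G_PER_MOL = 238.3


def culture_additions(volume_ml: float = CULTURE_VOLUME_ML) -> dict:
    """Return per-flask amounts implied by the stated concentrations."""
    volume_l = volume_ml / 1000.0
    return {
        "ala_mg": ALA_MG_PER_L * volume_l,
        # mmol/L * L * (g/mol == mg/mmol) gives mg
        "iptg_mg": IPTG_MM * volume_l * IPTG_G_PER_MOL,
        "dodecane_ml": DODECANE_PERCENT_VV * volume_ml / 100.0,
        "inoculum_percent": 100.0 * INOCULUM_ML / volume_ml,
    }


if __name__ == "__main__":
    for name, value in culture_additions().items():
        print(f"{name}: {value:.2f}")
```

Passing a different `volume_ml` rescales all amounts, which may be convenient when the same conditions are applied to smaller screening cultures.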
Dodecane (5% v/v) was added upon induction to cultures grown for isolation of 4.

δ-Cadinene (3). Cultures grown without a dodecane overlay were extracted with 2 × 200 ml ethyl acetate. The organic layer was dried over sodium sulfate and evaporated to dryness. The resulting brown solid was dissolved in a minimal volume of dichloromethane. 3 was isolated by flash chromatography (silica, hexanes). ¹H-NMR (500 MHz, CDCl3, 25 °C): δ 5.47 (s, 1H), 2.73 (ddd, J1 = 13.0 Hz, J2 = 5.0 Hz, J3 = 2.5 Hz, 1H), 2.53 (br d, J = 10 Hz, 1H), 2.07 (m, 1H), 2.02 (m, 2H), 1.97 (m, 2H), 1.91 (m, 1H), 1.69 (s, 3H), 1.68 (s, 3H), 1.63 (dq, J1 = 12.5 Hz, J2 = 2.5 Hz, 1H), 1.17 (ddd, J1 = 24.5 Hz, J2 = 12.5 Hz, J3 = 6.0 Hz, 1H), 1.07 (tt, J1 = 10 Hz, J2 = 2.5 Hz, 1H), 0.97 (d, J = 7.0 Hz, 3H), 0.80 (d, J = 7.0 Hz, 3H). ¹³C-NMR (400 MHz, CDCl3, 25 °C): δ 134.21, 129.92, 124.68, 124.49, 45.34, 39.42, 32.33, 31.97, 26.77, 26.70, 23.59, 21.76, 21.18, 18.53, 15.67. For assignments, see Davis et al., Mag. Res. Chem. 1996, 34, 156-161.

8-Hydroxy-δ-cadinene (4). Cultures grown with a dodecane overlay were centrifuged to separate the aqueous and organic phases. 4 was isolated by flash chromatography (silica) from the dodecane layer using a step gradient starting with hexanes and 5:95 ethyl acetate:hexanes washes, followed by elution with 30:70 ethyl acetate:hexanes. ¹H-NMR (500 MHz, CDCl3, 25 °C): δ 5.57 (d, J = 1.0 Hz, 1H), 4.10 (br s, 1H, OH), 3.11 (dd, J1 = 12.0 Hz, J2 = 6.0 Hz, 1H), 2.56 (br d, J = 10.0 Hz, 1H), 2.05 (qd, J1 = 7.0
Hz, J2 = 2.5 Hz, 1H), 1.96 (m, 2H), 1.88 (t, J = 11.5 Hz, 1H), 1.79 (s, 3H), 1.68 (s, 3H), 1.64 (dd, J1 = 6.5 Hz, J2 = 4.0 Hz, 1H), 1.47 (d, J = 6.0 Hz, 1H), 1.17 (m, 1H), 1.04 (m, 1H), 0.98 (d, J = 7.0 Hz, 3H), 0.81 (d, J = 7.0 Hz, 3H). ¹³C-NMR (400 MHz, CDCl3, 25 °C): δ 136.43, 128.00, 127.71, 126.99, 71.54, 45.30, 40.12, 37.68, 32.36, 27.08, 21.94, 21.16, 19.17, 18.81, 15.90.

Construction of AMO plasmids. The native AMO gene was amplified from pescura-cyp71va1-aacpr native (Ro et al., Nature 2006, 440, 940-943) and inserted into the NdeI-AvrII sites of pETDuet-1 containing a synthetic CPR from C. tropicalis (ctcpr) inserted in the NcoI-AflII sites using standard protocols. The primers used to amplify the namo gene were: 5′-GGA ATT CCA TAT GAA GAG TAT ACT AAA AGC AAT GGC ACT CTC ACT GAC CAC-3′ and 5′-TCT TCC TAG GCT AGA AAC TTG GAA CGA GTA ACA ACT CAG TCT TTC TTT GCA T-3′. The pduet-ctcpr-namo construct was sequenced using the following primers: 5′-ATG CAT CAC TTG ATT GGT ACA ACG CC-3′, 5′-ATC GAT AAC CTT GTA GCT GAG CAT AC-3′, 5′-CCA ACA CTC CTC TTC ACG AAG TGA CTG-3′.

Synthetic genes were obtained from DNA 2.0 with the AMO variant (samo, A13-AMO, A17-AMO, and Bov-AMO) inserted in the NdeI-AvrII sites of pETDuet-1 containing ctcpr inserted in the NcoI-AflII sites. The tramo gene was amplified from pduet-ctcpr-samo and inserted into the NdeI-AvrII sites of pETDuet-1 containing a synthetic CPR from C. tropicalis (ctcpr) inserted in the NcoI-AflII sites using standard protocols. The primers used to amplify the tramo gene were: 5′-AAT GGA ATT CCA TAT GGC GAC CCG TTC TAA AAG CAC TAA GAA ATC TCT GCC GG-3′ and 5′-TCT TCC TAG GTT AAA AGG ACG GAA CCA GCA GCA GTT CGG TTT TAC GC-3′. The pduet-ctcpr-tramo construct was sequenced using the following primers: 5′-CCA ACC GTC CGG AAA CTC TGA CC-3′, 5′-TCG ATA ATC TGG TCG CCG AAC-3′, 5′-GAT CTC TTG TAC CAG GTT CCA GC-3′.
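The tramo primers above can be checked against the truncated N-terminus listed in Table S2 (M-ATRSKS...) by translating them in silico. The Python sketch below is one way to do this under stated assumptions: it uses the standard genetic code, takes the ATG inside the NdeI site (CATATG) as the start codon, and reads the reverse primer as its reverse complement up to the AvrII site (CCTAGG). The helper names (`n_terminus`, `c_terminus`) are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# tramo amplification primers as listed in the text, written 5' to 3'.
TRAMO_FORWARD = "AAT GGA ATT CCA TAT GGC GAC CCG TTC TAA AAG CAC TAA GAA ATC TCT GCC GG"
TRAMO_REVERSE = "TCT TCC TAG GTT AAA AGG ACG GAA CCA GCA GCA GTT CGG TTT TAC GC"


def clean(primer: str) -> str:
    """Remove spacing and normalise case."""
    return primer.replace(" ", "").upper()


def translate(sequence: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    return "".join(
        CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence) - 2, 3)
    )


def n_terminus(forward_primer: str) -> str:
    """Translate the forward primer from the ATG embedded in the NdeI site."""
    sequence = clean(forward_primer)
    site = sequence.find("CATATG")
    if site < 0:
        raise ValueError("no NdeI site found")
    return translate(sequence[site + 3:])


def c_terminus(reverse_primer: str) -> str:
    """Translate the coding strand that ends just before the AvrII site."""
    coding = clean(reverse_primer).translate(COMPLEMENT)[::-1]
    site = coding.find("CCTAGG")
    if site < 0:
        raise ValueError("no AvrII site found")
    upstream = coding[:site]
    # Keep the reading frame that ends exactly at the restriction site.
    return translate(upstream[len(upstream) % 3:])


if __name__ == "__main__":
    print(n_terminus(TRAMO_FORWARD))
    print(c_terminus(TRAMO_REVERSE))
```

Searching for the full NdeI site rather than the first ATG matters here, because the 5′ overhang of the forward primer already contains an out-of-frame ATG.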
The A13AMO gene was amplified from the pduet-ctcpr-a13amo plasmid and inserted into the NdeI-XbaI sites of pCWori using standard protocols. The primers used to amplify the A13AMO gene were: 5′-AAT GGA ATT CCA TAT GAC CGT ACA CGA CAT CAT CGC AAC GTA CTT CAC-3′ and 5′-GCT CTA GAT TAA AAG GAC GGA ACC AGC AGC AGT TCG GTT TTA CGC-3′. Next, the aacpr ct gene was amplified from pj4-aacpr ct and inserted into the XbaI-SalI sites of pcwori-A13AMO using standard protocols. The primers used to amplify the aacpr ct gene were: 5′-GCT CTA GAA GGA GAT ATA CCA TGG CGC TGG ATA AAC TGG ATC TGT ATG TGA TCA TTA CGC-3′ and 5′-TAT TGT CGA CTT ACC AGA CAT CAC GCA GGT AGC GGC C-3′. The pcwori-A13AMO-aaCPR ct construct was sequenced using the primer sets for tramo (see pduet-ctcpr-tramo) and aacpr ct (see petduet-aacpr ct-BovCAH).
Figure S1. ¹H-NMR spectra (500 MHz) of 3 (a) and 4 (b).
Table S1. N-terminal sequences of CAH variants. The predicted TM sequence of wtcah is underlined.

Construct                        N-terminal sequence
wtcah                            MLQIAFSSYSWLLTASNQKDGMLFPVALSFLVAILGISLWHVWT-IRKPKK...
Truncated (tr)                   MA-IRKPKK...
P450 17α (Bos taurus, Bov)       MALLLAVFLGLSCLLLLSLW-IRKPKK...
CYP52A13 (C. tropicalis, A13)    MTVHDIIATYFTKWYVIVPLALIAYRVLDYFY-IRKPKK...
CYP52A17 (C. tropicalis, A17)    MIEQLLEYWYVVVPVLYIIKQLLAYTK-IRKPKK...
OmpA-A17                         MKKTAIAIAVALAGFATVAQA-MIEQLLEYWYVVVPVLYIIKQLLAYTK-IRKPKK...
Peptitergent (PD1)               EELLKQALQQAQQLLQQAQELAKK-IRKPKK...
mistic                           [MFCT...mistic...GEKE]-IRKPKK...

Table S2. N-terminal sequences of AMO variants. The predicted TM sequence of wtamo is underlined.

Construct                        N-terminal sequence
wtamo                            MKSILKAMALSLTTSIALATILLFVYKF-ATRSKS...
Truncated (tr)                   M-ATRSKS...
P450 17α (Bos taurus, Bov)       MALLLAVFLGLSCLLLLSLW-ATRSKS...
CYP52A13 (C. tropicalis, A13)    MTVHDIIATYFTKWYVIVPLALIAYRVLDYFY-ATRSKS...
CYP52A17 (C. tropicalis, A17)    MIEQLLEYWYVVVPVLYIIKQLLAYTK-ATRSKS...

Table S3. N-terminal sequence of aacpr ct. The predicted TM sequence of wt-aacpr is underlined.

Construct                        N-terminal sequence
wt-aacpr                         MQSTTSVKLSPFDLMTALLNGKVSFDTSNTSDTNIPLAVFMENRELLMILTTSVAVLIGCVVVLVW-RRSS...
aacpr ct                         MALDKLDLYVIITLVVAVAAYFAKN-RRSS...
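The junction convention in Tables S1 and S2 (leader sequence, hyphen, first residues of the truncated P450) lends itself to a simple consistency check. The Python sketch below, offered as an illustration only, splits each entry at its last hyphen and confirms that every CAH variant shares one core and every AMO variant shares another, and that the Bov, A13, and A17 leaders are identical in both tables. The mistic entry is omitted because its sequence is elided in the table; the dictionary keys and the function names are hypothetical labels.

```python
# Entries copied from Tables S1 and S2 without the trailing "...".
# The mistic variant is left out because its leader is elided in the table.
CAH_VARIANTS = {
    "wt": "MLQIAFSSYSWLLTASNQKDGMLFPVALSFLVAILGISLWHVWT-IRKPKK",
    "tr": "MA-IRKPKK",
    "Bov": "MALLLAVFLGLSCLLLLSLW-IRKPKK",
    "A13": "MTVHDIIATYFTKWYVIVPLALIAYRVLDYFY-IRKPKK",
    "A17": "MIEQLLEYWYVVVPVLYIIKQLLAYTK-IRKPKK",
    "OmpA-A17": "MKKTAIAIAVALAGFATVAQA-MIEQLLEYWYVVVPVLYIIKQLLAYTK-IRKPKK",
    "PD1": "EELLKQALQQAQQLLQQAQELAKK-IRKPKK",
}
AMO_VARIANTS = {
    "wt": "MKSILKAMALSLTTSIALATILLFVYKF-ATRSKS",
    "tr": "M-ATRSKS",
    "Bov": "MALLLAVFLGLSCLLLLSLW-ATRSKS",
    "A13": "MTVHDIIATYFTKWYVIVPLALIAYRVLDYFY-ATRSKS",
    "A17": "MIEQLLEYWYVVVPVLYIIKQLLAYTK-ATRSKS",
}


def split_junction(entry: str) -> tuple:
    """Split an entry at its last hyphen into (leader, core)."""
    leader, _, core = entry.rpartition("-")
    return leader, core


def shared_cores(variants: dict) -> set:
    """Collect the distinct core sequences that follow the junction."""
    return {split_junction(entry)[1] for entry in variants.values()}


def common_leaders(first: dict, second: dict) -> list:
    """List construct labels whose leader is identical in both tables."""
    return sorted(
        name
        for name in first.keys() & second.keys()
        if split_junction(first[name])[0] == split_junction(second[name])[0]
    )


if __name__ == "__main__":
    print(shared_cores(CAH_VARIANTS))
    print(shared_cores(AMO_VARIANTS))
    print(common_leaders(CAH_VARIANTS, AMO_VARIANTS))
```

Splitting at the last hyphen keeps the OmpA-A17 entry intact, since that leader itself contains a junction between the OmpA signal sequence and the A17 sequence.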
Figure S2. In vivo production of cadinene (3) and 8-hydroxycadinene (4) in E. coli with wtcah. (a) Full-scan GC-MS trace of dodecane overlay from a negative-control strain with no P450 (CPR only) and strains 1 and 2 from Figure 1a. IS, caryophyllene internal standard. (b) EI mass spectrum of 3. (c) EI mass spectrum of 4.

Figure S3. Time courses for in vivo production of 8-hydroxycadinene (4) in E. coli. (a) Production of 4 with N-terminal CAH variants. (b) In vivo production of 4 with BovCAH construct in E. coli containing the full pathway from atoB to CAH.
Figure S4. In vivo production of artemisinic acid (6) in E. coli with A13AMO. (a) Oxidation of amorphadiene (5) catalyzed by AMO to generate artemisinic alcohol (16), artemisinic aldehyde (17), and artemisinic acid (6). (b) Full-scan GC-MS trace of media extract from E. coli DH1 containing the pam92 and pcwori-a13amo-aacpr ct plasmids, when cultured with no dodecane layer (bar 10 from Figure 1b) compared to authentic standards of 6, 16, and 17. 6 is detected as the methyl ester derivative. IS, caryophyllene internal standard.
Figure S5. EI mass spectra of 5 (a) and derivatized 6 (b) produced in E. coli compared to authentic standards.