
Supporting Online Material for

Genome Transplantation in Bacteria: Changing One Species to Another

Carole Lartigue, John I. Glass,* Nina Alperovich, Rembert Pieper, Prashanth P. Parmar, Clyde A. Hutchison III, Hamilton O. Smith, J. Craig Venter

*To whom correspondence should be addressed. Jglass@jcvi.org

This PDF file includes:
Materials and Methods
SOM Text
Fig. S1
Table S1 and Table S2 legend
References

Published 28 June 2007 on Science Express
DOI: /science

Other Supporting Online Material for this manuscript includes the following: (available at Table S2 as zipped XLS file: Mass Spectrometry Data

Materials and Methods

Mass spectrometry analysis of donor DNA containing agarose plugs for residual proteins.

Agarose plugs containing M. mycoides LC cells were evaluated after exhaustive treatment with detergent and proteinase K according to the Bio-Rad CHEF Mammalian Genomic DNA Plug Kit protocol. Four plugs that had been electrophoresed in pulsed-field gels (PFGE) as described in the text and four plugs that were not exposed to PFGE were analyzed by the Proteomics Facility at the University of California at Davis. The protocol was a modification of the standard methods used to analyze protein spots excised from acrylamide gels; modifications were needed because these samples were in agarose rather than acrylamide. Proteins were reduced by reaction with dithiothreitol at 37°C so as not to melt the agarose. Samples were then treated with trypsin to fragment any proteins present before mass spectrometry analysis.

Table S1 lists the M. mycoides LC peptides found associated with the plugs. Only trace amounts of these peptides were found, and no single M. mycoides LC peptide was found in more than one sample. No M. mycoides LC encoded peptides were found in the samples not run on PFGE; for reasons we do not understand, M. mycoides LC peptides were found only in the samples that were exposed to PFGE. None of the five M. mycoides LC proteins we identified would be expected to be involved in DNA metabolism.

Table S1. Mass spectrometry analysis of donor DNA containing agarose plugs for residual proteins.

Sample    Peptide sequence*    Gene annotation    Protein pI    Protein molecular mass (kDa)
LDGDAR 255        Signal recognition particle M54 protein
LSNPIYLTK 34      Hypothetical protein
MRPDLFKK 111      Triacylglycerol lipase
SPFELSGGQK 170    ABC transporter, ATP-binding component (vitamin B12?)
MYIDPQKR 520      Endopeptidase O

*The numbers give the locations of the detected peptides in the M. mycoides LC proteins.
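As an illustrative aside that is not part of the original analysis, the peptides in Table S1 can be checked against the expected behavior of trypsin, which cleaves on the C-terminal side of lysine (K) and arginine (R). The minimal Python sketch below assumes the commonly used simplified rule that trypsin does not cleave before proline (P). The function names are hypothetical, and the sequences are copied from Table S1.

```python
# Peptide sequences copied from Table S1.
TABLE_S1_PEPTIDES = ["LDGDAR", "LSNPIYLTK", "MRPDLFKK", "SPFELSGGQK", "MYIDPQKR"]


def ends_with_tryptic_residue(peptide: str) -> bool:
    """Return True if the peptide ends in K or R, as expected after trypsin digestion."""
    return peptide[-1] in "KR"


def count_missed_cleavages(peptide: str) -> int:
    """Count internal K/R residues that trypsin could have cut but did not.

    Assumption: simplified rule in which a K or R followed by P is not a cleavage site.
    """
    count = 0
    # Pair each residue with the one that follows it; the C-terminal residue
    # is never the first member of a pair, so it is not counted.
    for residue, next_residue in zip(peptide, peptide[1:]):
        if residue in "KR" and next_residue != "P":
            count += 1
    return count


for peptide in TABLE_S1_PEPTIDES:
    print(peptide, ends_with_tryptic_residue(peptide), count_missed_cleavages(peptide))
```

Under this simplified rule, all five peptides end in K or R, and MRPDLFKK and MYIDPQKR each contain one missed cleavage site. A sketch of this kind is only a plausibility check on the reported sequences; it says nothing about which protein a peptide came from.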
Mass spectrometry analysis of 96 two-dimensional electrophoresis gel spots from the transplant cell lysates.

Proteins in two-dimensional electrophoresis (2-DE) gel spots derived from cells of Mycoplasma mycoides LC, the transplant clone, and Mycoplasma capricolum were excised, digested with sequencing-grade trypsin, spotted on matrix-assisted laser desorption/ionization (MALDI) targets, and analyzed by MALDI time-of-flight (MALDI-TOF) mass spectrometry (MS) in a MALDI-TOF/TOF instrument (Applied Biosystems 4700 Proteomics Analyzer). Details of the MS methods are provided elsewhere (S1). After conversion of the mass/charge ratio (m/z) values detected in peptide mass fingerprints (PMF) into peak lists (.txt files), the data were searched in the PMF mode of the search engine Mascot (Matrix Science), which employs a probability-based search algorithm. The databases searched were derived from a draft protein database for M. mycoides LC (the gi accession numbers correspond to those of Mycoplasma mycoides subsp. mycoides SC strain PG1) and a protein database for M. capricolum subsp. capricolum California kid ATCC in National Center for Biotechnology Information (NCBI) (MCAP_ accession numbers). The mass
tolerances were set at ±100 ppm for MS ion searches (+1 charge) and at ±0.2 daltons for tandem mass spectrometry (MS/MS) fragment ion searches. Mascot protein scores are probability-based Mowse scores (scores >75 are significant, P < 0.05). Peptide ion scores were obtained in the MS/MS analysis mode of Mascot by selecting the 10 most abundant mass peaks for collision-induced dissociation (CID) in an MS data-dependent mode. Mascot scores were frequently obtained in the MS and MS/MS modes from searches of both databases, a result of the high amino acid sequence identities between homologous proteins in M. capricolum and M. mycoides LC. Some peptides featured the same sequence but a 16 mass unit difference, which is related to methionine oxidation. As a Mascot search parameter, oxidation of methionine was set as a fixed modification.

Table S2. MALDI-TOF mass spectrometry data derived from an analysis of 96 2-DE gel spots from the transplant cell lysate (Fig. 5C). [Table S2 is attached as an Excel file.]

Worksheet 1. A) the MS sample numbers correspond to MS raw file names (.t2d files); B) Mascot protein score; C) predicted isoelectric point (pI); D) predicted molecular mass; E) peptide sequence coverage in %, based on peptide sequences associated with m/z values versus the amino acid sequence of the entire protein.

Worksheets 2, 3, and 4. MS data (m/z values for peptides) and associated MS/MS data of (2) the transplant clone, (3) M. mycoides LC, and (4) M. capricolum. F) observed m/z values in the PMF data file; G) sequences in red: peptide unique to the M. capricolum protein (database); in BOLD red: peptide unique to the M. capricolum protein that also had a peptide fragment ion score (MS/MS mode); *: peptide of a transplant clone protein that was matched only to the sequence of the M. capricolum protein (according to the database), but the same peptide originating from a M. mycoides protein was also matched only to the sequence of the M.
capricolum protein; blue: peptide unique to the M. mycoides LC protein (database); H) location of the peptide sequence in the protein, counting from the predicted N terminus; I) observations related to differences in MS and MS/MS sequences and scores when comparing proteins in the databases for M. capricolum and M. mycoides LC.

Additional Comments:

For most of the samples from the transplant clone, M. capricolum, and M. mycoides, Mascot protein scores were obtained for equivalent proteins when searching both the M. capricolum and the M. mycoides databases. This was not surprising given the high sequence identities between the proteins of these related species. When proteins of the transplant clone and M. mycoides were analyzed, Mascot protein scores were frequently higher for proteins annotated in the M. mycoides LC database than for those in the M. capricolum database. When proteins of M. capricolum were analyzed, Mascot protein scores were always higher for proteins annotated in the M. capricolum database (see column D, worksheets 2 to 4). This observation suggested that, occasionally, open reading frames (ORFs) were not accurately annotated in the draft genome sequence of M. mycoides LC, resulting in (i) lack of a protein in the associated protein database, (ii) wrong prediction of the translation start site, or (iii) sequencing errors translated into incorrect peptide sequences. Such peptide/protein sequence prediction errors resulted in lowered Mascot scores and mismatched peptides after MS analysis of 2-DE gel spots from the transplant clone and M. mycoides. Lack of annotation of an ORF resulted in identification of the protein by MS only from the M. capricolum database. This was observed for three proteins,
ArcB and a glycosyl hydrolase (MCAP_0191) in the transplant clone and M. mycoides, and a conserved hypothetical cytosolic protein (MCAP_0598) in M. mycoides.

The MS/MS data confirmed that we correctly interpreted the Mascot protein score differences for the transplant clone and M. mycoides versus M. capricolum. Not a single peptide sequence derived from an analyzed M. capricolum protein gave a significant MS/MS score for a peptide from the M. mycoides protein database, unless the peptide was identical in sequence, or indistinguishable by mass spectrometry, in M. mycoides and M. capricolum. In the analysis of transplant protein samples, not a single peptide with an MS/MS score greater than 10 was obtained for a M. capricolum protein in the database, unless a similar MS/MS peptide score was also observed for the same protein from the M. mycoides LC strain. The reasons for this were mentioned in the paragraph above. For corresponding 2-DE gel spot proteins of the transplant clone and M. mycoides, the peptide identifications and associated MS/MS scores correlated well. For the corresponding proteins of M. capricolum, the MS/MS data were invariably different, unless high amino acid sequence identity was observed for a given protein in M. capricolum and M. mycoides LC (e.g., elongation factor Tu). Overall, the ratios of peptides with MS/MS scores greater than 10 that were unique to the protein sequence in the M. capricolum database versus the M. mycoides database were as follows: M. capricolum samples, 191/1; M. mycoides, 44/167; transplant clone, 23/91. The ratios for M. mycoides and the transplant clone were nearly the same (44/167 ≈ 0.26 and 23/91 ≈ 0.25).

In conclusion, the MS and MS/MS data obtained by MALDI-TOF analysis are consistent with the notion that all the analyzed Mycoplasma transplant clone proteins are identical to those of M. mycoides LC. There was no case in which the MS data for a protein from the transplant clone were indicative of a protein resembling the M.
capricolum protein sequence more than the corresponding M. mycoides protein sequence, as long as MS data for such a protein were available from both the M. mycoides LC strain and the transplant protein samples for comparison. The combined data from the positional locations of 2-DE gel spots and their mass spectrometric analyses (>75 features) present strong evidence that there are no differences between the transplant clone proteins and the corresponding M. mycoides proteins. The differences from the proteins in the M. capricolum samples were clear, provided that the amino acid sequences of the M. capricolum and M. mycoides proteins were not identical or nearly identical.

SOM Text

Genome transplantation approaches that did not work

Because mycoplasmas are similar to mammalian cells with respect to their lack of a cell wall, we experimented with a series of approaches that are effective for transferring large DNA molecules into eukaryotic cells. These included cationic detergent-mediated transfection, electroporation, and compaction of the donor genomes using various cationic agents. None of those approaches proved effective for whole genome transplantation.

In search of a method to transplant genomes, we explored many approaches. We hypothesized that the mycoplasmas' lack of cell walls might make them amenable to some of the approaches used to transfer DNA into mammalian cells. We also decided to first optimize methods for the transfer of oriC plasmids into M. capricolum, because this is an established technique that we felt had room for optimization (S2). We tested a series of cationic detergents used for transfection of mammalian cells. Those reagents included Lipofectin (Invitrogen), DMRIE-C reagent (Invitrogen), Lipofectamine 2000 (Invitrogen), Lipofectamine reagent (Invitrogen), FuGENE 6 (Roche), and DOSPER (Roche). Some of these reagents include cholesterol, which may be necessary for effective fusion of liposomes with mycoplasma membranes (S3).
None of these methods used either as directed by
the manufacturers yielded any genome transplants. We also explored electroporation under a variety of conditions. Although this method does work, inefficiently, for plasmid transformation of M. capricolum, it did not work for genome transplantation. We were able to improve the efficiency of our polyethylene glycol (PEG)-based plasmid transformation approximately 100-fold, and it was this set of experiments that eventually led to successful genome transplantation. Other ancillary methods we tested included adding polyethylenimine (PEI) over a wide range of concentrations to the donor genomic DNA to compact the chromosomes and make them easier to manipulate (S4). This proved to be of no value.

References

S1. C. L. Gatlin et al., Proteomics 6, 1530 (2006).
S2. K. W. King, K. Dybvig, Plasmid 26, 108 (1991).
S3. M. Tarshis, M. Salman, S. Rottem, Biophys J 64, 709 (1993).
S4. P. Marschall, N. Malik, Z. Larin, Gene Ther 6, 1634 (1999).

Fig. S1. M. capricolum-specific and M. mycoides LC-specific PCR amplification of both wild-type strains and a transplant (11.1). The M. capricolum-specific primers, ADI-MccF (GTAATTGATAATGAATTAAACCAAAGGA) and ADI-MccR (AACCATACCAAACTCACTTTTAAAA), amplified a region of the arginine deiminase gene. The M. mycoides LC-specific primers, IS1296P1F (AAGCGTTTAGAATAGAAGGGCTA) and IS1296P1R (CTGAATTGTACAGGAGACAATCC), amplified a region of the IS1296 insertion element.
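For readers who wish to check the primer sequences given in the Fig. S1 legend, the following Python sketch computes the length, GC fraction, and reverse complement of each primer. It is an illustration only and not part of the original methods. The helper names are hypothetical, the sequences are copied from the legend and assumed to be written 5' to 3', and no melting temperature or specificity prediction is attempted.

```python
# Primer sequences copied from the Fig. S1 legend (assumed to be written 5' to 3').
PRIMERS = {
    "ADI-MccF": "GTAATTGATAATGAATTAAACCAAAGGA",
    "ADI-MccR": "AACCATACCAAACTCACTTTTAAAA",
    "IS1296P1F": "AAGCGTTTAGAATAGAAGGGCTA",
    "IS1296P1R": "CTGAATTGTACAGGAGACAATCC",
}

# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}, "
          f"reverse complement {reverse_complement(seq)}")
```

The reverse complement is the form in which a reverse primer would appear when searching the forward strand of a genome sequence.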
