Supporting Information

Identification of Microprotein-Protein Interactions via APEX Tagging

Qian Chu, Annie Rathore, Jolene K. Diedrich, Cynthia J. Donaldson, John R. Yates III, and Alan Saghatelian*

The Salk Institute for Biological Studies, Clayton Foundation Laboratories for Peptide Biology, 10010 N. Torrey Pines Rd, La Jolla, CA 92037, USA
Division of Biological Sciences, University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
Department of Chemical Physiology, The Scripps Research Institute, 10550 N. Torrey Pines Rd, La Jolla, CA 92037, USA

* To whom correspondence should be addressed. Email: asaghatelian@salk.edu (A.S.)
Experimental Section

Mass Spectrometry and Data Analysis

The digested samples were analyzed on a Q Exactive mass spectrometer (Thermo). The digest was injected directly onto a 30 cm, 75 µm ID column packed with BEH 1.7 µm C18 resin (Waters). Samples were separated at a flow rate of 200 nL/min on an nLC 1000 (Thermo). Buffers A and B were 0.1% formic acid in water and in acetonitrile, respectively. The gradient ran from 5 to 40% B over 110 min, increased to 50% B over 10 min, increased to 90% B over another 10 min, and was held at 90% B for a final 10 min of washing, for a total run time of 140 min. The column was re-equilibrated with 20 µL of buffer A prior to sample injection. Peptides were eluted directly from the tip of the column and nanosprayed into the mass spectrometer by applying 2.5 kV at the back of the column.

The Q Exactive was operated in data-dependent mode. Full MS1 scans were collected in the Orbitrap at 70K resolution with a mass range of 400 to 1800 m/z and an AGC target of 5e6. The ten most abundant ions per scan were selected for MS/MS analysis with HCD fragmentation at 25 NCE, an AGC target of 5e6 and a minimum intensity of 4e3. Maximum fill times were set to 60 ms and 120 ms for MS and MS/MS scans, respectively. A quadrupole isolation window of 2.0 m/z was used, dynamic exclusion was set to 15 s, and unassigned charge states were excluded.

Protein and peptide identification was performed with the Integrated Proteomics Pipeline IP2 (Integrated Proteomics Applications). Tandem mass spectra were extracted from raw files using RawConverter (1) and searched with ProLuCID (2) against the human UniProt database appended with microprotein sequences. The search space included all fully tryptic and half-tryptic peptide candidates with a maximum of two missed cleavages. Carbamidomethylation of cysteine was set as a static modification. Data were searched with 50 ppm precursor ion tolerance and 50 ppm fragment ion tolerance.
Data were filtered to 10 ppm precursor ion tolerance post-search. Identified proteins were filtered using DTASelect (3) with a target-decoy database search strategy to control the false discovery rate at 1% at the protein level.
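The two post-search filters described above (a 10 ppm precursor mass tolerance and a 1% target-decoy false discovery rate) can be illustrated with a minimal Python sketch. This is not the IP2 or DTASelect implementation. The function names are hypothetical, and the FDR estimator shown (decoy hits divided by target hits) is one common convention assumed here for illustration; the text does not state which formula was used.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the precursor mass error is within +/- tol_ppm.

    The 10 ppm default mirrors the post-search filter in the text.
    """
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def decoy_fdr(n_target: int, n_decoy: int) -> float:
    """Estimate the false discovery rate from target and decoy counts.

    Assumption: FDR is approximated as decoys / targets, a common
    target-decoy estimator; other conventions exist.
    """
    if n_target == 0:
        return 0.0
    return n_decoy / n_target
```

For example, a precursor observed at 1000.005 m/z against a theoretical 1000.0 m/z has an error of about 5 ppm and would pass the 10 ppm filter, and a protein list with 200 target and 2 decoy entries would sit at the 1% threshold under this estimator.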
Materials

Primary antibodies: rabbit anti-myc tag (#2272, Cell Signaling Technology), mouse anti-FLAG (F1804, Sigma), rabbit anti-FLAG (#2368, Cell Signaling Technology), anti-biotin, HRP-linked antibody (#7075, Cell Signaling Technology), rabbit anti-Ku70 (#4104, Cell Signaling Technology), rabbit anti-Ku80 (#2180, Cell Signaling Technology) and mouse anti-HA (H9658, Sigma).

Secondary antibodies: goat anti-rabbit IRDye 800CW (926-32211, LI-COR), goat anti-mouse IRDye 800CW (926-32210, LI-COR), goat anti-mouse Alexa Fluor 647 (A21235, Life Technologies) and goat anti-rabbit Alexa Fluor 488 (A11008, Life Technologies).

Plasmids

pcDNA3-MRI-1 was generated as previously described (4). pcDNA3-APEX-NES was a gift from Alice Ting (Addgene plasmid #49386). pcDNA3-FLAG-APEX was generated by inserting a stop codon before the NES sequence using the QuikChange II kit (Agilent Technologies). The APEX coding sequence was subcloned into the pcDNA3.1(+) vector with an N-terminal myc tag to generate the pcDNA3.1-myc-APEX construct. The C11orf98 microprotein coding sequence was obtained by PCR amplification from a HEK293T cDNA pool. FLAG-MRI-APEX, C11orf98-APEX-myc and myc-APEX-C11orf98 were amplified by overlap extension PCR and subcloned into the pcDNA3.1(+) vector. NPM1 cDNA (BC050628) was obtained from the MGC cDNA library at the Salk Institute and subcloned into the pcDNA3.1(+) vector with a C-terminal HA tag.
Figure S1. Loading control of APEX and MRI-APEX FLAG immunoprecipitation (Figure 2B). Anti-FLAG immunoprecipitation of HEK293T cells expressing APEX or MRI-APEX. Eluted proteins were separated by SDS-PAGE and visualized by Coomassie staining. (WCL = whole cell lysate)
Figure S2. Biotinylated proteomes of untransfected, APEX-transfected and MRI-APEX-transfected cells. HEK293T cells were transfected with the FLAG-MRI-APEX or APEX control construct. At 24 hours post-transfection, the cell culture medium was changed to fresh growth medium containing 500 µM biotin-tyramide. After incubation for 30 min, H2O2 was added to a final concentration of 1 mM and cells were treated for 1 min. Cells were then lysed, and biotinylated proteins were enriched with streptavidin beads and analyzed by Western blotting using anti-biotin antibody. (A) Short exposure. (B) Long exposure. The biotin ladder was purchased from Cell Signaling Technology (#7727).
Figure S3. Loading control of MRI-APEX biotinylation (Figure 2C and S2). HEK293T cells were transfected with the FLAG-MRI-APEX or APEX control construct. At 24 hours post-transfection, the cell culture medium was changed to fresh growth medium containing 500 µM biotin-tyramide. After incubation for 30 min, H2O2 was added to a final concentration of 1 mM and cells were treated for 1 min. Cells were then lysed, and biotinylated proteins were enriched with streptavidin beads and analyzed by Coomassie staining.
Figure S4. Schematic illustration of APEX control and C11orf98-APEX constructs.
Figure S5. The majority of proteins enriched by APEX-C11orf98 and C11orf98-APEX are reported to have a nuclear localization.
Figure S6. Loading control of C11orf98-APEX and APEX-C11orf98 biotinylation (Figure 4A). HEK293T cells were transfected with the C11orf98-APEX, APEX-C11orf98 or APEX control construct. At 24 hours post-transfection, the cell culture medium was changed to fresh growth medium containing 500 µM biotin-tyramide. After incubation for 30 min, H2O2 was added to a final concentration of 1 mM and cells were treated for 1 min. Cells were then lysed, and biotinylated proteins were enriched with streptavidin beads and analyzed by Coomassie staining.
Figure S7. Validation of C11orf98 interaction with NPM1 and NCL by FLAG immunoprecipitation. HEK293T cells were transfected with C11orf98-FLAG or the pcDNA3.1(+) empty vector. At 48 hours post-transfection, cells were harvested and cell lysates were subjected to FLAG immunoprecipitation with anti-FLAG M2 affinity gel. Bound proteins were eluted by incubating the beads with 3× FLAG peptide for 1 hour and analyzed by Western blotting with the indicated antibodies.
Figure S8. Loading control of C11orf98-FLAG FLAG immunoprecipitation (Figure S7). Anti-FLAG immunoprecipitation of HEK293T cells expressing C11orf98-FLAG or the pcDNA3.1 control vector. Eluted proteins were separated by SDS-PAGE and visualized by Coomassie staining. (WCL = whole cell lysate)
Figure S9. Loading control of reciprocal anti-HA immunoprecipitation (Figure 4B). Reciprocal anti-HA immunoprecipitation of NPM1-HA from HEK293T cells co-expressing C11orf98-FLAG, with cells expressing C11orf98-FLAG alone as a control. Eluted proteins were analyzed by Coomassie staining.
Figure S10. Validation of C11orf98 interaction with NPM1 and NCL by reciprocal immunoprecipitation. HEK293T cells were co-transfected with C11orf98-FLAG and NPM1-HA. At 48 hours post-transfection, cell lysates were subjected to immunoprecipitation using anti-HA agarose beads (mouse IgG beads as control). Bound proteins were eluted and separated by SDS-PAGE. NPM1 and C11orf98 proteins were visualized by Western blotting using anti-HA and anti-FLAG antibodies.
Figure S11. Loading control of reciprocal anti-HA immunoprecipitation (Figure S10). HEK293T cells were co-transfected with C11orf98-FLAG and NPM1-HA. At 48 hours post-transfection, cell lysates were subjected to immunoprecipitation using anti-HA agarose beads (mouse IgG beads as control). Bound proteins were eluted and separated by SDS-PAGE. NPM1 and C11orf98 proteins were visualized by Coomassie staining.
Figure S12. The C11orf98 microprotein was identified in public protein interaction databases. Searching IP-MS raw data for NPM1 (from Mann's dataset and Gygi's BioPlex) and NCL (from Gygi's BioPlex) revealed multiple (A) spectral counts and (B) peptide fragments of the C11orf98 microprotein.
References:
(1) He, L.; Diedrich, J.; Chu, Y. Y. and Yates, J. R., 3rd (2015) Extracting Accurate Precursor Information for Tandem Mass Spectra by RawConverter. Anal. Chem. 87, 11361-11367.
(2) Xu, T.; Park, S. K.; Venable, J. D.; Wohlschlegel, J. A.; Diedrich, J. K.; Cociorva, D.; Lu, B.; Liao, L.; Hewel, J.; Han, X.; Wong, C. C.; Fonslow, B.; Delahunty, C.; Gao, Y.; Shah, H. and Yates, J. R., 3rd (2015) ProLuCID: An improved SEQUEST-like algorithm with enhanced sensitivity and specificity. J. Proteomics 129, 16-24.
(3) Tabb, D. L.; McDonald, W. H. and Yates, J. R., 3rd (2002) DTASelect and Contrast: tools for assembling and comparing protein identifications from shotgun proteomics. J. Proteome Res. 1, 21-26.
(4) Slavoff, S. A.; Heo, J.; Budnik, B. A.; Hanakahi, L. A. and Saghatelian, A. (2014) A human short open reading frame (sORF)-encoded polypeptide that stimulates DNA end joining. J. Biol. Chem. 289, 10950-10957.