Complete Sequence of the Rous Sarcoma Virus env Gene: Identification of Structural and Functional Regions of Its

Size: px
Start display at page:

Download "Complete Sequence of the Rous Sarcoma Virus env Gene: Identification of Structural and Functional Regions of Its"

Transcription

1 JOURNAL OF VIROLOGY, June 1983, p X/83/692-17$2./ Copyright 1983, American Society for Microbiology Vol. 46, No. 3 Complete Sequence of the Rous Sarcoma Virus env Gene: Identification of Structural and Functional Regions of Its Product ERIC HUNTER,'* EDGAR HILL,1 MARIE HARDWICK,' AJIT BHOWN,2 DENNIS E. SCHWARTZ,3 AND RICHARD TIZARD3 Departments of Microbiology1 and Medicine,2 University of Alabama in Birmingham, University Station, Birmingham, Alabama 35294, and Department ofbiochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts Received 1 November 1982/Accepted 1 March 1983 The amino-terminal amino acid sequences of gp85 and gp37, the envelope glycoproteins of Rous sarcoma virus (RSV), were determined. Alignment of these sequences with the amino acid sequence predicted from the complete nucleotide sequence of the Prague strain of RSV, subgroup C (PR-C), has allowed us to delineate the env gene-coding region of this virus. The coding sequences for gp85 and gp37 have been placed in an open reading frame that extends from nucleotide 545 to nucleotide 6862 and predict sizes of 341 amino acids (36,962 molecular weight) for gp85 and 198 amino acids (21,566 molecular weight) for gp37. Carbohydrate makes a significant contribution to the observed molecular weights of these polypeptides-the amino acid sequence contains 14 potential glycosylation sites (Asn-X-Ser/Thr) in gp85 and two in gp37. Experiments aimed at estimating the number of carbohydrate side chains yielded results consistent with most or all of these sites being occupied. Although an initiation codon is located early (codon 4) in the open reading frame, it is likely that splicing yields an mrna on which translation initiates at the same AUG as that of the gag gene to produce a nascent polypeptide in which gp85 is preceded by a 62-amino-acid-long leader peptide. This leader contains the hydrophobic sequence (signal sequence) necessary for translocation across the endoplasmic reticulum and is completely removed from the env gene product during translation. The polyprotein precursor, pr95env is cleaved to gp85 and gp37 at the carboxyl side of the basic sequence:- Arg-Arg-Lys-Arg-. gp85 is attached through a disulphide linkage to gp37, and although the positions of the cysteines involved in this linkage are not known, the presence of a 27-amino-acid-long hydrophobic region at the carboxy-terminus of gp37 is consistent with its role as a membrane anchor for the viral glycoprotein complex. The location of host range variable regions with respect to the possible tertiary structure of the complex is discussed. The env gene of Rous sarcoma virus (RSV) codes for two glycosylated polypeptides, gp85 and gp37, that are inserted into the lipid bilayer of the virion. The two glycoproteins are linked via a disulphide bond to yield the so-called VGP complex of the virion (5), which in electron micrographs can be observed as a knobbed spike; gp85 represents the knob and gp37 the spike (3, 9, 1). Both polypeptides are susceptible to protease digestion which generates noninfectious, "bald" particles (68). The viral glycoproteins function in viral attachment to and penetration of the host cell where they interact with specific surface receptors coded for by the target cell (16, 6, 82). The sequence encoding the host range determinants appears to reside within gp85 (33, 38), the molecule to which neutralizing antibodies are primarily elicited in animals (81). The mrna for the envelope glycoproteins is a spliced molecule that retains the 5' terminal sequences of the genomic RNA but which has lost over 4, nucleotides of the gag and pol sequences (32, 54, 86). This RNA is translated on membrane-bound polyribosomes (51, 66) to yield a glycosylated polyprotein precursor, Pr95env, that is later cleaved proteolytically to the two virion structural proteins (14, 22, 31, 56). Pactamycin mapping experiments indicate the order NH2-gp85-gp37-COOH on Pr95env (41, 74). This precursor is highly glycosylated, since in the presence of the glycosylation inhibitor tunicamycin (TM), the in vivo translation product of the env gene is a molecule of approximate- 92

2 VOL. 46, 1983 ly 57, molecular weight, suggesting that up to 38, daltons of carbohydrate may be present (2, 76). Furthermore, this susceptibility to TM indicates that the addition of oligomannosyl cores to the polypeptide chain occurs via a lipidlinked dolichol intermediate onto asparagine residues (78, 79). In the mature viral glycoproteins, carbohydrate chains of both the complex acidic type (containing sialic acid and fucose) and the mannose-rich neutral type have been detected, although the former are three times more abundant than the latter (35, 45). The complex processing of these functionally important viral proteins suggests that the primary translation product of the viral env gene must contain regions of amino acid sequence or tertiary structure that mediate (i) translocation of the protein across the endoplasmic reticulum (8, 88), (ii) addition and processing of oligosaccharide chains, (iii) specific cleavages to yield the mature virion proteins, and (iv) movement of the polypeptide from its site of synthesis to the plasma membrane. The location of at least some of these sequences has been determined by a combination of protein sequence and DNA sequence studies which have allowed the placement of the env gene-specific sequences on the complete nucleic acid sequence of RSV that has been described elsewhere (71a). MATERIALS AND METHODS Chemicals and radiosotopes. Sequencer grade chemicals were purchased from Spinco Division, Beckman Instruments, Inc. (Palo Alto, Calif.). Methanol and propanol for high-pressure liquid chromatography (HPLC) were the products of Mallinckrodt (Paris, Ky.). All other chemicals were of highest purity grade and were obtained from Pierce Chemicals (Rockford, Ill.) or Fisher Scientific Products (Fairlawn, N.J.). Chromatographic solvents were filtered through Unipore polycarbonate membranes (Bio-Rad Laboratories, Richmond, Calif.) before use. TM was a generous gift from R. L. Hamill, Lilly Research Laboratories, Indianapolis, Ind. Radiochemicals were purchased from the following suppliers: [3Hlglucosamine (6.3 Ci/mmol), ICN Pharmaceuticals, Inc., Irvine, Calif; [3H]leucine (6 Ci/mmol) and [35S]methionine (1, Ci/mmol), Amersham Corp., Arlington Heights, Ill. Cells and viruses. Chicken embryo fibroblasts of the C/O phenotype were prepared from fertilized eggs obtained from Hyline International, Dallas Center, Iowa. Fertilized eggs with the C/E phenotype were purchased from SPAFAS Inc., Norwich, Conn. Only cells negative for chicken helper factor (chj) were used. The cell culture techniques and the chf tests used have been described previously (36). The culture medium employed consisted of Hams F1 medium supplemented with 1% tryptose phosphate broth, 5% calf serum, 6.6 mm sodium bicarbonate, 1 U of penicillin per ml, and 1 p.g of streptomycin per ml. Polybrene (Aldrich Chemical Co., Milwaukee, Wis.) (2,gIml) was used to enhance virus adsorption, and RSV env GENE SEQUENCE 921 1% dimethyl sulfoxide was added for transformed cell cultures. Stocks of the Prague strain of subgroups A, B, C, and E (PR-A, PR-B, PR-C, PR-E), the Schmidt-Ruppin strain of RSV, subgroups A, D, and E (SR-A, SR- D, SR-E), and the Carr-Zilber strain of RSV, subgroup D (CZV-D), were focus cloned before use. Rousassociated viruses (RAV) of subgroup A (RAV-1, RAV-3), subgroup B (RAV-2, RAV-6), subgroup C (RAV-49), and subgroup D (RAV-5) were propagated in C/O chicken embryo cells transformed by the replication-defective Bryan high-titer strain of RSV (3) to avoid the differences in glycosylation of viral envelope proteins observed in transformed versus untransformed cells (48). Milligram amounts of PR-B, PR-C, and SR-A for glycoprotein purification were prepared from cloned virus grown in chicken embryo cells in roller bottle culture. Supernatant fluids from these cultures were concentrated 1-fold by hollow-fiber ultrafiltration (molecular weight cut-off, 1,; Amicon Corp., Lexington, Mass.) before virus purification by discontinuous and continuous gradient ultracentrifugation as described previously (3). Viral protein was quantitated with the Bio-Rad protein assay. Purification of gp85 and gp37 for amino-terminal sequencing. (i) Lectin affinity chromatography. A total of 15 to 2 mg of virus (PR-B, PR-C, or SR-A) suspended at 5 to 1 mg/ml in 2 mm Tris-hydrochloride (ph to 8.5) was mixed with a small amount of radiolabeled virus (1 x 15 to 2 x 15 cpm, [3H]glucosamine or [3H]leucine) and disrupted by the addition of 1/1 volume of 1o sodium deoxycholate (ph 8.5). After 3 min at 37 C, the mixure was diluted in an equal volume of distilled water and applied to a 2-mi column of Lens culnaris lectin sepharose (Pharmacia, Uppsala, Sweden) equilibrated in loading buffer (.5% deoxycholate, 1 mm Tris-hydrochloride, ph 8.5). Fractions of 1. ml were collected. After the column had been washed thoroughly with 1 column volumes of loading buffer, bound glycosylated proteins were eluted with 1 column volumes of loading buffer containing.2 M a-methylmannoside. Samples of each fraction were added to an aqueous scintillation fluid and counted. The unbound and eluted peak fractions were pooled and precipitated by the addition of 4 volumes of ethanol, and the composition of the peaks was determined by polyacrylamide gel electrophoresis. In some cases, gp85 was isolated by preparative polyacrylamide gel electrophoresis as described below. (ii) HPLC on gel permeation columns. The ethanol-precipitated fractions that had bound to the lectin column were washed twice with 8% ethanol to remove salt and detergent. Protein components in the precipitate were fractionated by hplc utilizing four I- 125 gel permeation columns (.78 by 3 cm; Waters Associates, Milford, Mass.) attached in series. A mixture of acetic acid-propanol-highly purified water (2:15:65) was employed as a solvent at a flow rate of.2 ml/min (4). The sample was dissolved in a minimum volume of solvent, centrifuged for 1 to 15 min in a Beckman Microfuge and injected in a volume not exceeding 1,u. The effluent was monitored spectrophotometrically at 28 nm, using a Waters UV detector (model 44). The peaks were collected manually, lyophilized, and assessed for purity by sodium dodecyl

3 922 HUNTER ET AL. sulfate (SDS)-polyacrylamide gradient (5 to 2%) slab gel electrophoresis. Polypeptides purified in this way could be used directly for amino-terminal sequencing studies. Amino-terminal amino acid sequence determination. Proteins (2 to 5 nm) were subjected to sequential Edman degradation in the presence of Polybrene (2 mg) in the Beckman 89C automated sequencer equipped with a cold trap and a modified vacuum system (5). Quadrol (BASF-Wyandotte Corp., Parsippany, N.J.) (.5 M) was employed as a coupling buffer. Aminoethylaminopropyl glass beads were placed in the Quadrol reservoir to remove undesirable derivatives such as aldehydes. A double-couple and doublecleavage program was adopted for the first cycle. Anilinothiozolinone derivatives of amino acids were converted to their respective phenylthiohydantoin (Pth) derivatives by incubating the dried residue under nitrogen at 8 C for 1 min with 1. N hydrochloric acid containing dithiothreitol (15 mg/liter). The soluble Pth derivatives were extracted twice with 1.-ml portions of ethyl acetate. The organic and aqueous layers were dried under nitrogen. The Pth derivatives of amino acids were identified by HPLC using an Altex 5 Ultrasphere ODS column (.46 by 15 cm; Beckman Instruments, Inc.) and a 5- min linear gradient of 8%o A (.4 M sodium acetate, ph 3.8, containing 5 of acetone per liter) to 5% B (methanol containing 25 p.1 of acetic acid per liter). Samples (aqueous and organic layers) containing the Pth derivatives were dissolved in 2 of methanol, and 2 to 1 p.1 was injected for identification (4). RNA extraction. The RNA extraction procedure was adapted with minor modifications from procedures previously described by Hayward (32). For extraction of virion RNA, purified virus (trace labeled with [3H]uridine) from culture supernatants harvested every 2 h was incubated for 3 min at 37 C in the presence of 2 p.g of proteinase K (E. M. Laboratories, Inc., Elmsford, N.Y.) per ml and.5% SDS and then extracted twice with a mixture of phenol-chloroform-isoamyl alcohol (1:1:.1 [vol/vol]) containing.5% (wt/vol) 8-hydroxyquinoline and once with chloroform containing 1% isoamyl alcohol. The aqueous phase was adjusted to.5 M ammonium acetate, and the RNA was precipitated by addition of 2 volumes of ice-cold ethanol, using 5 p.g of purified yeast trna (Sigma Chemical Co., St. Louis, Mo.) per ml as the carrier. For extraction of cellular RNA, the procedure described by Hayward (32) was followed, with the exception that viral RNA was not added as a marker. Both viral and cellular RNA preparations were applied to an oligodeoxythymidylic acid-cellulose (P-L Biochemicals, Inc., Milwaukee, Wis.) column according to the procedure of Lai and Duesberg (47) to select polyadenylated RNA species. Bound RNAs that were eluted from the column were pooled and precipitated with 2 volumes of ethanol after adjusting the salt concentration to.2 M NaCI and adding 5 pg of trna carrier per ml. In vitro translation of viral RNA. Translation of polyadenylated viral and cellular RNA was carried out in vitro in a micrococcal nuclease-treated rabbit reticulocyte lysate system obtained commercially from Amersham Corp. Each translation contained.5 to 2. p.g of polyadenylated RNA (1 mg/ml in water), 25 p.ci J. VIROL. p.ci of [355]methio- of [3H]leucine (6 Ci/mmol) or 5 nine (1, Ci/mmol), and 2,ul of prepared lysate. After incubation at 3 C for 2 h, a 1-,ul sample was taken for determination of total incorporation of radioactivity into hot trichloracetic acid-insoluble material. The remainder was adjusted to 1 mg of bovine serum albumin per ml, 1.%1 sodium deoxycholate, 1.% Triton X-1,.1% SDS, 25 mm Tris-hydrochloride (ph 8.), and 15 mm NaCI for immunoprecipitation studies. Immunoprecipitation of viral polypeptides. (i) Immunoprecipitation from pulse-labeled ceil lysates. RSVinfected chicken embryo cells in 35-mm culture dishes were treated for 2 h with 1. p.g of TM per ml or retained as untreated controls. Both sets of cells were then incubated with or without TM in Hams F1 medium minus leucine for 1 h to deplete the intracellular pools of this amino acid and were then pulse labeled in 25 p.1 of leucine-free F1 containing [3H]leucine (1 mci/ml) for 1 min. After the labeling period and after the medium was removed,.5 ml of lysis buffer A (1% Triton X-1, 1% sodium deoxycholate, 25 mm Tris-hydrochloride [ph 8.], 15 mm NaCI) was added to each dish, which was left on ice for 15 min. The lysed cells were removed from the dish by pipetting and then centrifuged in a Beckman Microfuge for 5 min to pellet the nuclei and debris. The clarified supematants were removed, and SDS was added to.1%. Samples (2 p1) were incubated at 37 C for 1 h with 4,ul of antiserum (an empirically determined excess of antibody). Rabbit antiserum to gp85 was prepared as previously described by Hunter et al. (37) by multiple injections of affinity columnpolyacrylamide gel-purified RSV PR-B gp85. This monospecific antiserum was specific for gp85 and showed no reactivity to gp37 or the non-glycosylated proteins (data not shown). Then, 1 p.l of a 1%o suspension of Formalin-fixed, heat-killed Staphylococcus aureus (17) was added, and the mixture was incubated at room temperature for 3 min. The immune complexes were washed twice in lysis buffer B (1% Triton X-1, 1% sodium deoxycholate,.1% SDS, 5 mm Tris-hydrochloride [ph 8.], 15 mm NaCl) and once with 2 mm Tris-hydrochloride (ph 8.) to remove salt. The immunoadsorbent was dissociated from the antibody-antigen complex by adding SDS and P-mercaptoethanol to final concentrations of 2% and.2 M, respectively. The preparation was then boiled for 2 min, and the immunoadsorbent was removed by centrifugation with a Beckman Microfuge for 5 min. (ii) Immunoprecipitation from in vitro translation experiments. Immunoprecipitation of in vitro-translated proteins was carried out essentially as described above, except that it was found necessary to reduce the amount of antiserum added to.5 p.l and to increase the time of incubation with S. aureus to 2 h to ensure the completeness of the reaction. For those experiments in which sequential precipitations were used, the antibody-lysate mixture was added to pelleted S. aureus (1 p.1 of a 19o suspension) to prevent an increase in volume. Polyacrylamide gels. Purified viral polypeptides and immunoprecipitated viral polypeptides were electrophoresed on SDS-Tris-glycine slab gels (thickness, 1.5 mm) containing 12% acrylamide or 5 to 2% acrylamide gradients, as described previously (37). Molecu-

4 VOL. 46, 1983 lar weight standards (Bio-Rad Laboratories) were located by Coomassie blue staining. Virus-specific polypeptides were located either by slicing and counting in Omnifluor (New England Nuclear Corp., Boston, Mass.)-toluene containing 9%o Protosol (New England Nuclear Corp.) and 1% water or by fluorography. Preparative gel electrophoresis was carried out on 3- mm-thick 12% acrylamide-tris-glycine slab gels in SDS. Viral proteins were located by one of two methods. In the first, the polypeptides were trace labeled with leucine or glucosamine, the gel was sliced into 1-mm sections perpendicular to the direction of migration, and the polypeptides were eluted overnight from each slice in 1 ml of 1 mm Tris-hydrochloride (ph 8.)-.1% SDS. Samples (25 to 5,ul) were counted in aqueous scintillant, and the slices containing protein peaks were pooled for electroelution and concentration in an ISCO electrophoretic concentrator (ISCO, Lincoln, Neb.) as described previously (6). In the second method, a small fraction (5 to 1%) of the unlabeled polypeptide in 1 p.l of.1% SDS was fluorescamine labeled by the addition of an equal volume of dimethyl sulphoxide containing 1 mg of Fluram (Roche Diagnostics, Div. Hoffman-La Roche, Inc., Nutley, N.J.). The reaction was allowed to proceed for 5 min at room temperature, after which the proteins were precipitated with 4 volumes of ethanol. Polypeptide bands were located on a preparative polyacrylamide gel with a long-wavelength UV lamp, excised, and electroeluted and concentrated as described above. Nucleic acid sequencing. Nucleic acid sequencing strategies and procedures are described elsewhere (71a). RESULTS AND DISCUSSION Amino-terminal amino acid sequencing. To place the env gene sequences on the nucleic acid sequence of RSV, we determined the aminoterminal amino acid sequence of gp85 and gp37 from three host range variants of RSV: PR-B, PR-C, and SR-A. Since these three viruses represent three different host ranges of RSV, it was reasoned that subgroup-specific differences in amino acid sequence at the amino-termini, if any existed, could be determined from these isolates. Furthermore, nucleic acid sequence data was available for both PR-C (71a) and SR-A (18). Glycoprotein preparations for amino-acid sequence analysis were obtained by lectin-affinity chromatography of detergent-disrupted virus (31). A single pass through such a column was sufficient to free the viral glycoproteins from contaminating non-glycosylated proteins. The results of a typical experiment are shown in Fig. 1A. In initial experiments, the individual viral glycoproteins were isolated by preparative gel electrophoresis and electrophoretic concentration after ethanol precipitation of the eluted fraction. Although this procedure reproducibly yielded amino-terminal amino acid sequence information for gp85 (see below), for reasons not gp85_ gp37, "-' U' 2 RSV env GENE SEQUENCE ~~~~Tm ''m m A, p27 3D : B.memrnin) FIG. 1. Purification of gp85 and gp37 for aminoterminal sequencing. (A) A Coomassie blue-stained SDS-polyacrylamide gel of the bound (lane 1) and unbound (lane 2) fractions pooled from an L. culnaris lectin-sepharose column as described in the text. Lane 1, l.o of the total bound glycosylated protein from 18 mg of RSV PR-B; lane 2,.5% of the total pool of unbound material from the same amount of virus. Lane 3, Molecular weight markers (Bio-Rad Laboratories). (B) Elution profile from an HPLC gel permeation column of L. culnaris lectin binding fraction. Glycoprotein fractions eluted from the column by.2 M a-methylmannoside-.5% sodium deoxycholate-1 mm Tris-hydrochloride (ph 8.5) were precipitated with 4 volumes of ethanol, washed twice with 8%o ethanol, and then redissolved in a mixture of acetic acid-propanol-highly purified water (2:15:65) after lyophilization. The effluent from four gel permeation columns attached in series was monitored spectrophotometically at 28 nm, and peaks were collected manually for analysis. O.D., Optical density. clear at the present time it has not been possible to obtain sequence information on gp37 isolated in this way. All sequence information on gp37 has therefore been obtained by subtracting the amino-terminal amino acid sequence of purified 5

5 924 HUNTER ET AL. J. VIROL PR-B NH2 - Asp - Val - His - Leu - Leu - Glu - Gln - Pro - 61y - Asn - Leu - Trp. PR-C NH2 - Asp - Val - X - Leu - Leu - Glu - X - Pro - Gly - X - Leu - Trp. SR-A NH2 - Asp - Val - X - Leu - Pro - Glu - Glu - Pro - Gly - X - Leu - Trp. gp PR-B NH2 - X - Val - X - X - Leu - Asp - Asp - Thr - X - Ala - Asp. PR-C NH2 - X - Val - X - His - Leu - Asp - Asp - Thr - X - X - Asp. FIG. 2. Amino-terminal amino acid sequences of gp85 and gp37. Polypeptides purified by a combination of L. culnaris lectin affinity chromatography and preparative SDS-polyacrylamide gel electrophoresis or HPLC were applied in nanomole amounts (2 to 5 nmol) to a Beckman 89C sequencer and subjected to sequential Edman degradation. Pth derivatives of released amino acids were identified by HPLC as described in the text. Residues for which no firm identification could be made are denoted by an X. gp85 from that of a mixture of the two. Since a minor, Coomassie blue-staining polypeptide band of a size consistent with that of the L. culnaris lectin monomer (17, molecular weight) was consistently observed in our glycoprotein preparations from the lectin column (Fig. 1), it was necessary to include an additional purification step in the isolation procedure. HPLC of the eluted glycoprotein on a highresolution gel permeation column (6) yielded two major protein peaks (Fig. 1B). SDS-polyacrylamide gel electrophoresis of the proteins in the individual peaks showed that pool A was an equimolar mixture of gp37 and gp85, whereas pool B contained only gp85 (data not shown). Since no peak of gp37 alone has been observed in these experiments, it is possible that this lessglycosylated, more-hydrophobic polypeptide (see below) either binds poorly to lentil lectin in the absence of a disulphide-linked gp85 molecule or is lost during the ethanol precipitation steps before loading on the HPLC column. The column fractions in pools A and B were rechromatographed on the gel-permeation column before their application to the amino acid sequencer. The amino-terminal sequences of gp85 and the gp85-gp37 mixture were determined in a Beckman model 89C spinning cup sequencer, using a modified Quadrol program with repetitive yields of 95% or better. The results obtained for the first 11 cycles of gp85 from PR-C, PR-B, and SR-A and for the first 11 cycles of gp37 from PR- C and PR-B are presented in Fig. 2. The aminoterminal sequence of gp85 for PR-B was determined from both electroeluted and HPLCpurified protein. The amino-terminal amino acid sequences of gp85 from PR-B, PR-C, and SR-A appear to be identical, with the exception of a proline rather than a leucine in the fifth position of SR-A. Although no comparisons can be made for the remainder of the molecule, the amino-termini are highly conserved. The gp37 amino-terminal sequences are less complete, since at the level of sensitivity used in these determinations, serine is not detected and histidine is only poorly detected. Nevertheless, in general, conservation is observed for gp37 when PR-B and PR-C are compared. Since subgroup B and C viruses have significantly different host ranges and antigenically distinct envelope polypeptides, the amino-terminal conservation observed in gp85 and gp37 in consistent with host range differences being located within a relatively small region of gp85 (15). Indeed, since the amino-terminus of gp37 represents a site for cleavage of Pr95env into the two mature polypeptides, such conservation might be expected. Coding sequence for env gene of RSV. The determination of the complete nucleotide sequence for RSV PR-C as described elsewhere (71a), together with the amino-terminal amino acid sequences described above, has allowed the coding sequences for gp85 and gp37 to be placed in an open reading frame that extends from nucleotide 545 to nucleotide 6862 on the Rous sarcoma genome. It should be noted that, as with most glycoproteins that require an aminoterminal signal peptide, the translation of gp85 does not initiate at its amino-terminus, an aspartic acid, but rather at some point upstream. A

6 *** leu arg lys met arg arg ala leu phe leu gln ala phe leu thr gly tyr pro gly lys thr ser lys lys asp ser lys glu lys TGA CTA AGA AAG ATG AGG CGA GCC CTC TTT TTG CAGAGCA TTT CTG ACT GGA TAC CCT GGG AAG ACG AGC AAG AAG GAC TCC AAG GAG AAA 55 TSPLICE pro leu ala thr ser lys lys asp pro glu lys thr pro leu leu pro thr arg val asn tyr ile leu ile ile gly val leu val leu CCG CTA GCA ACA AGC AAG AAA GAC CCG GAG AAG ACA CCC TTG CTG CCA ACG AGA GTT MT TAT ATT CTC ATT ATT GGT GTC CTG GTC TTG 514 1NH2 gp85 # l 8 CHO cys glu val thr gly val arg alavasp val his leu leu glu gln pro gly asn leu trp ile thr trp ala asn arg thr gly gln thr TGT GAG GTT ACG GGG GTA AGA GCT GAT GTT CAC TTA CTC GAG CAG CCA GGG AAC CTT TGG ATT ACA TGG GCC AAC CGT ACA GGC CAA ACG 523 asp phe cys leu ser thr gln ser ala thr ser pro phe gln thr cys leu ile gly ile pro ser pro ile ser glu gly asp phe lys GAT TTC TGC CTC TCT ACA CAG TCA GCC ACC TCC CCT TTT CM AATGT TTG ATA GGT ATC CCG TCT CCT ATT TCC GM GGT GAT TTT AAG 532 CHO #4 CHO gly tyr val ser asp thr asn cys ser thr val gly thr asp arg leu val leu ser ala ser ile thr gly gly pro asp asn ser thr GGA TAT GTT TCT GAT ACA AAT TGC TCC ACT GTG GGA ACT GAC CGG TTA GTC TTG TCA GCC AGC ATT ACC GGT GGC CCT GAC AAC AGC ACC 541 CHO #38 thr leu thr tyr arg lys val ser cys leu leu leu lys leu asn val ser met trp asp glu pro pro glu leu gln leu leu gly ser ACCCj MG GTT TCA TGC CTG CTG TTA MG CTG AAC GTC TCC ATG TGG GAT GAG CCA CCT GM CTG CAG CTG CTA GGT TCC _ CHO CHO gln ser leu pro asn val thr asn ile thr gln val ser gly val ala gly gly cys val tyr phe ala pro arg ala thr gly leu phe CAG TCT CTC CCT AAC GTT ACT MC ATT ACT CAG GTC TCT GGC GTG GCC GGG GGA TGT GTA TAT TTC GCC CCA AGG GCC ACT GGC CTG TTT 559 #19 leu gly trp ser lys gln gly leu ser arg phe leu leu arg his pro phe thr ser tnr ser asn ser thr glu pro phe thr val val TTA GGT TGG TCT AAA CAA GGT CTC TCG CGG TTC CTC CTC CGT GAG CCC TTT ACC TCC ACC GAA CCG TTC ACG GTG GTG 568 I1 1 CHOQ thr ala asp arg his asn leu phe met gly ser glu tyr cys gly ala tyr gly tyr arg phe trp glu ile tyr asn cys ser gln thr ACA GCG GAT AGA GCAMT CTTTTTG GGG AGT GAG TAC TGT GGT GCA TAT GGC TAC AGA TTT TGG GAA ATA TAT AAC TGC TCA CAG ACT 577 loc arg asn thr tyr arg cys gly asp val gly gly thr gly leu pro glu thr trp cys arg gly lys gly gly ile trp val asn gin ser AGG AAT ACT TAC CGC TGT GGA GAC GTG GGA GGT ACT GGC CTC CCT GAA ACC TGG TGC AGA GGA AAA GGA GGT ATA TGG GTT AAT CM TCA 586 COHO#B14 lys glu lie asn glu thr glu pro phe ser phe thr ala asn cys thr gly ser asn leu gly asn val ser gly cys cys gly glu pro AAG GAA ATT AAT GAG ACA GAG CCG TTC AGT TTT ACT GCG AAC TGT ACT GGT AGT AAT TTG GGT O AAT GTC AGC GGA TGT TGT GGA GM CCA 595 ile thr ile leu pro leu gly ala trp ile asp ser thr gln gly ser phe tnr lys pro lys ala leu pro pro ala ile phe leu ile ATC ACG ATT CTC CCA CTA GGG GCA TGG ATC GAC AGT ACG CAA GGT AGT TTC ACT AA CCA MA GCG CTA CCA CCC GCA ATT TTC CTC ATT 64 ~~~~~~~~~~~~~~~~~~~#42 #111 cys gly asp arg ala trp gln gly ile pro ser arg pro val gly gly pro cys tyr leu gly lys leu thr met leu ala pro asn his TGT GGG GAT CGC GCA TGG CAA GGA ATT CCC AGT CGT CCG GTA GGG GGC CCC TGC TAT TTA GGC MG CTT ACC ATG TTA GCA CCT AAC CAT 613 CH NH -gp37 thr asp ile leu lys ile leu ala asn ser ser arg thr gly ile arg arg lys argfser val ser his leu asp asp thr cys ser asp ACA GAT ATT CTC AM ATA CTT GCT AAT TCG TCG CGG ACA GGT ATA AGA CGT AM CGA AGC GTC TCA CAC CTG GAT GAT ACA TGC TCA GAT 622 glu val gln leu trp gly pro thr ala arg ile phe ala ser ile lei ala pro gly val ala ala ala gln ala leu arg glu ile glu GAA GTA CAG CTT TGG GGT CCT ACA GCA AGA ATC TTT GCA TCT ATC TTA GCC CCG GGG GTA GCA GCT GCG CAA GCC TTA AGA GM ATT GAG 631 C arg leu ala cys trp ser val lys gln ala asn leu fth thr ser leu leu gly asp leu leu asp asp val thr ser ile arg his ala AGA CTA GCC TGT TGG TCC GTT AAA CAG GCT AAC TTG ACA ACA TCA CTC crc GGG GAC TTA TTG GAT GAT GTC ACG AGT ATT CGA CAC GCG 64 #1CHO val leu gln asn arg ala ala ile asp phe leu leu leu ala his gly his gly cys glu asp val ala gly met cys cys phe asn leu GTC CTG CAG AAC CGA GCG GCT ATT GAC TTC TTG CTT CTA GCT CAC GGC CAT GGC TGT GAG GAC GTT GCC GGA ATG TGT TGT TTC MT CTG 649 ser asp his ser glu ser ile gln lys lys phe gln leu met lys lys his val asn lys ile gly val asp ser asp pro ile gly ser AGT GAT CAC AGT GAA TCT ATA CAG AAG AAG TTC CAG CTA ATG MG AAA CAT GTC AAT MG ATC GGC GTG GAC AGC GAC CCA ATC GGA AGT 658 trp leu arg gly ile phe gly gly ile gly glu trp ala val his leu leu lys gly leu leu leu gly leu val val ile leu leu leu TGG CTG CGA GGG ATA TTC GGG GGA ATA GGG GAA TGG GCC GTT CAT CTG CTA AAA GGA CTG CTT TTG GGG CTT GTA GTT ATT TTA TTG CTA 667 leu val cys leu pro cys leu leu gln phe val ser ser ser ile arg lys met ile asn ser ser ile asn tyr his thr glu tyr arg CTG GTG TGC CTG CCT TGC CTT TTA CM TTT GTG TCT AGT AGT ATT CGA AAG ATG ATT AAT AGT TAGTCAAC A CAT TAC AGG 676 #21 #14 lys met gin gly gly ala val *** MG ATG CAG GGC GGA GCA GTC TAG 685 FIG. 3. Nucleotide sequence and deduced amino acid sequence for PR-C env gene. The nucleotide sequence begins with the termination codon preceding the open reading frame that contains the env gene product coding sequence and stops at the TAG codon that terminates translation of gp37. The numbers on the left below the nucleotide sequence correspond to those of the complete nucleotide sequence of RSV PR-C presented elsewhere (71a). The sequence presented is that of the cloned genome of PR-C in the plasmid patv-8 (39). Selected hydrophobic regions of amino acid sequence are underlined; functionally important regions such as the signal peptide or transmembrane region have bold underlining. Sixteen potential sites for carbohydrate addition (CHO) at Asn-X-Ser or Asn-X-Thr have been highlighted, and RNase T1-resistant oligonucleotides that span this region in the genomic RNA have been underlined (-) and numbered according to Coffin et al. (15) and J. Coffin (personal communication). 925

7 926 HUNTER ET AL. J. VIROL. leader peptide gp85 gp37 ~Yv V VVy A) AA) A) A)A V I A V A 62aa 341 aa FIG. 4. Schematic representation of PR-C env gene product. Glycosylation sites are depicted by, and hydrophobic regions of the leader peptide, gp85, and gp37 are shown by shaded boxes. Sites for signal peptidase cleavage and Pr95e"v cleavage are denoted by arrows. aa. Amino acids. 198 aa potential initiating codon, AUG, is present early in the open reading frame at nucleotide 554, which would allow the translation of polypeptide 63 amino acids long. This is the only initiating codon which precedes the coding region of gp85. The sequence and the translation product for this region are presented in Fig. 3. The major glycoprotein, gp85, has a polypeptide core of 341 amino acids and a molecular weight of 36,956, whereas gp37 consists of 198 amino acids and has a molecular weight of 21,552. The amino acid composition of gp85 predicted from this sequence is in general agreement with that reported previously for other avian retroviruses (46). It is clear that for both molecules, carbohydrate makes an important contribution to the apparent molecular weight determined by SDS-polyacrylamide gel electrophoresis; a point that is borne out by the number of potential glycosylation sites as discussed later. If the AUG at nucleotide position 554 was used to initiate translation, the 539 amino acids corresponding to an unglycosylated Pr95e"v polypeptide would be preceded by a 64-aminoacid amino-terminal leader peptide. This long leader peptide would contain two 13-amino-acidlong regions of uncharged hydrophobic amino acids (underlined in Fig. 3) that are similar in composition and size to the amino-terminal signal peptides of other glycosylated or secreted polypeptides (2, 8). Such sequences are thought to be necessary for the polypeptide to initiate its movement into the lumen of the rough endoplasmic reticulum-a prerequisite for the addition of carbohydrate (7, 71). This particular leader peptide would be unusual in having two such regions separated by a hydrophilic stretch of amino acids. Nuclease S1 mapping of the env mrna indicates, however, that such a polypeptide is not normally synthesized in vivo. The mrna for the RSV env gene product is spliced, and the splicing event appears to occur 3' of the potential genomic initiating AUG (nucleotide 554). The mapping studies have identified the 5' splice donor site at nucleotide 397 (29; D. Schwartz, R. Tizard, and W. Gilbert, unpublished data) and the 3' acceptor site at nucleotide 578 (D. Schwartz et al., unpublished data). This removal of a 4,68-nucleotide intron would generate a coding sequence for a modified leader peptide of 62 amino acids in which the first 8 amino acids of the 64-amino-acid putative leader are replaced by the first 6 amino acids of the p199a9 polypeptide. It also would result in the initiation of env gene translation at the same AUG that is used to initiate translation of the gag and pol genes. A recent survey of translation initiation sites by Kozak (44) has revealed a hierarchy of preferred sequences flanking the AUG, and according to this scheme, the initiation codon from gag is identical to the most favored sequence cxxaugg. It should, therefore, present a stronger ribosome start site than the genomic AUG at nucleotide 554 that it replaces. In addition to providing a new translation start, the RNA splicing event also destroys the 5' hydrophobic amino acid stretch of the leader peptide (nucleotides 563 to 512) by inserting a charged, polar lysine into this region, which reduces it length from 13 to 8 amino acids. The 62-amino-acid leader peptide generated by this splicing event (and removed during translation of the env precursor) thus contains only a single 13-amino-acid-long string of uncharged amino acids that is located close to the amino terminus of gp85. This region, therefore, is presumably the signal that mediates transport of the polypeptide across the endoplasmic reticulum (see below). The product of this spliced mrna is shown schematically in Fig. 4. Identification of long amino-terminal leader by in vitro translation. The leader peptide predicted from these nucleic acid sequence studies is significantly longer than most amino-terminal signal peptides, which in general range from 13 to 2 amino acids in length (2). To confirm the presence of this long amino-terminal extension, we compared the env-related product from in

8 VOL. 46, 1983 Pr RSV env GENE SEQUENCE e translation mix of Pr18, the gag-pol product, Pr76, and a p27-related polypeptide of approximately 6, daltons. A second immunoprecipitation with the same antiserum showed no gag-related products (Fig. 5, lane 2). Sequential immunoprecipitation with an RSV SR-A-specif- * -_p >.- ic anti-pp6src antiserum (Fig. 5, lane 3; nonspecific antibody control) and then with a monospecific antiserum to gp85 revealed an env-specific Pr :; band that migrated with an apparent molecular -5 weight of 64, (p64ev; Fig. 5, lane 4). The size of this in vitro-translated product is consistent with that reported for the SR strain of RSV (62, 66) and probably represents the product from authentic spliced and capped mrna, since other workers have shown that cleavage of the p64env product with V8 protease yields a Met- Glu dipeptide identical to that released from the amino-terminus of Pr76 and p19 (1, 67; S. Anderson, personal communication). Furthermore, in vitro translation of polyadenylated cytoplasmic RNA from RSV PR-C-infected cells and immun- env oprecipitation in the same fashion yielded an FIG. 5. Autoradiograph of innmunoprecipitated gene products from in vitro arid in vivo translation. identical polypeptide, p64env (data not shown). RSV PR-C polyadenylated viric:)n RNA (.5 to 1.,zg) Immunoprecipitation with the same anti-gp85 was translated in vitro in a n uclease-treated rabbit antiserum of untreated and TM-treated cells that reticulocyte lysate system in thie presence of [3H]leu- were pulse labeled with [3H]leucine before lysis cine (25 jj.ci). Polypeptides weire immunoprecipitated yielded polypeptides of 95, and 58,o dalsequential addition of tons (Fig. 5, lanes 6 and 5, respectively) as we from the translation mix by the antisera to p27 (lane 1), p27 ('lane 2), SR-A pp6w have reported earlier (76). The difference in (lane 3), and gp85 (Lane 4). S. aureus immunoadsor- mobility between the TM product and the in bent was added after each ant antigen-antibody complex befo rethem xt anotivse the vitro translation product (6, daltons) is con- precipitated in sistent with the removal of a long 62-amino-acid was added. The size of the polypeptide lane 4 was compared with that c f polypeptides immun- leader peptide. oprecipitated by anti-gp85 antiiserum from PR-C-in- To determine whether all 62 amino acids were fected control (lane 6) or TM-trneated (lane 5) cells that removed during or immediately after translation had been pulse labeled for 1 nnin with [3H]leucine. of the env product, we performed microsequencing of the RSV PRC Pr95enV. The polypeptide for these studies was immunoprecipitated from cells pulse labeled (1-min pulse) with [3H]leucine, vitro translation of viral RN[A in a rabbit reticu- fractionated on an SDS-polyacrylamide gel, and locyte lysate system with tthe product synthe- electroeluted as described above. The radioacsized in cells treated with TM, ' an inhibitor of tivity released at each cycle from the sequencer asparagine-linked carbohydrrate addition. The in was determined, and the results are shown in vitro translation experiment:s were performed in Fig. 6. the absence of membranes, alnd so both products The release of leucine residues from the amishould lack carbohydrate. IHowever, since TM no-terminus of Pr95env is consistent with the does not prevent translocatlion of the polypep- precursor and gp85 having identical amino-ter- reticulum and mini. Since no band larger than Pr95env has been tide across the rough endopllasmic subsequent signal peptidasse action, the two detected, even after pulse-labeling periods as polypeptides should differ by the size of the short as 2 min, it seems likely that the amino- of this experiment terminal leader sequence is completely removed peptide removed. The resultts are shown in Fig. 5. The galg gene product Pr76 during the process of translation. is the major product from ini vitro translation of Hydrophobic regions of the RSV env product. polyadenylated RNA extrac:ted from RSV viri- Using the criteria devised by Segrest and Feld- lysates mann (72), we have searched for hydrophobic ons (65), and therefore, ireticulocyte were first cleared of ti hese products by regions in the predicted amino acid sequence of immunoprecipitation with aritiserum to p27 (Fig. the env gene product. Peptides longer than 9 5, lane 1). A single immu noprecipitation was amino acids that lacked charged amino acid sufficient to completely d4eplete the in vitro residues were identified (as depicted in the up-

9 928 HUNTER ET AL. 4r- 3H T J. VIROL. U ir 1 1 I IL Fraction Number NH2 Asp-Val-His-Lou-Leu-Glu-GIn-Pro-Gly-Asn-Leu-Trp-lIe-Thr-Trp-AIa-Asn-Arg-Thr-Gly FIG. 6. Microsequencing of env precursor polypeptide, Pr9Senv. RSV Pr-C-infected chicken embryo cells were pulse labeled with [3H]leucine for 1 min. The env precursor was precipitated with antiserum to gp85 and separated on and electroeluted from a preparative SDS-polyacrylamide gel as described in the text. The radiolabeled polypeptide was applied to a Beckman 89C sequencer, and the [3H]leucine released at each cycle was determined. Sperm whale myoglobin was included as an unlabeled control for sequencing. The aminoterminal amino acid sequence for gp85 is presented for reference. per portion of Fig. 7), and their length was plotted against their hydrophobicity index. Using this approach, Segrest and Feldmann (72) plotted 774 noncharged polypeptide segments from protein sequences available at that time and found that more than 95% fell within a triangular area, shown schematically in Fig. 7. Most membrane protein signal peptides and transmembrane segments, on the other hand, plotted outside and to the hydrophobic side of this triangle. The majority of the env apolar peptides (1 of 13) also fell within the triangle, but three peptides (numbers 2, 8, and 13) mapped to the hydrophobic side of this area. This is where the hydrophobic signal and transmembrane peptides are mapped for the influenza virus hemagglutinin (HA) polypeptide (25, 55, 64), the vesicular stomatitis virus G protein (69), and the erythrocyte glycoprotein, glycophorin (72, 8). Peptide number 2 (13 amino acids) corresponds to the hydrophobic amino acid stretch located close to gp85 in the amino-terminal leader peptide. Its hydrophobicity supports the conclusion that it plays the role of a signal peptide that mediates the translocation of the nascent polypeptide chain across the rough endoplasmic reticulum. Furthermore, it shares many of the characteristics of other signal peptides in terms of size and amino acid composition (2). It is unusual, however, in that it is located 43 amino acids from the amino-terminus of the polypeptide. This implies that a significant lag exists between initiation of translation and initiation of translocation across the endoplasmic reticulum, and infers an intermediate situation to that observed with the vesicular stomatitis virus G protein and the band III erythrocyte membrane protein (13, 71). Peptide number 13 is located close to the carboxy-terminus of gp37. Its length (27 amino acids) is consistent with it playing the role of transmembrane anchor for the VGP complex, similar to the transmembrane regions of other bitopic membrane proteins. This peptide is flanked by charged residues (Fig. 3), and the Arg-Lys dipeptide at its carboxy-terminus may act as a signal to stop translocation of the protein across the membrane (7). Peptide number 8 (11 amino acids) is located close to the carboxy-terminus of gp85, and it is not clear at this time whether it plays any functional role. gp37 appears to anchor gp85 on the virus via some form of disulphide linkage,

10 VOL. 46, 1983 RSV env GENE SEQUENCE 929 LEADER PEPTIDE j gp85 * gp z i) 3 C (LJ n La. 1 - ID w zr X HYDROPHOBICITY INDEX FIG. 7. Hydrophobic regions of RSV env gene product. (Top) Stretches of amino acid sequence that were greater than nine amino acids long and which lack charged residues (Arg, Lys, Asp, Glu, His) have been located on the amino acid sequence predicted in Fig. 3. (Bottom) The numbered apolar peptides () have been plotted according to the scheme of Segrest and Feldmann (72). Hydrophobicity indexes for each peptide were calculated by dividing the sum of the indexes for each amino acid within that peptide by the length of that peptide. The triangular area corresponds to a region determined empirically by Segrest and Feldmann (72) to be where the apolar regions of most proteins map. The hydrophobicity indexes for the transmembrane peptides (U) of three influenza virus subtypes, vesicular stomatitis virus and glycophorin as well as the signal peptides (X) of these viruses have been plotted in the same manner. since treatment of virions with reducing agents such as dithiothreitol can release most of the gp85 from the virus (61; M. Hardwick and E. Hunter, unpublished data). It seems unlikely, therefore, that peptide number 8 plays a role in anchoring gp85 in the viral membrane. A fourth peptide worthy of mention (peptide number 1) is underlined in Fig. 3. Although this peptide is not exceptionally hydrophobic, its location close to the amino-terminus of gp37 is similar to that of the "fusion" sequence of the Sendai virus F protein and the influenza virus HA2 polypeptide (26, 87). It will be of interest to determine, through the use of site-directed mutagenesis, whether this region plays a similar role in mediating viral entry into the target cell. Glycosylation sites on gp8s and gp37. Glycosylation of the RSV env gene product is sensitive to TM (2, 76), suggesting that all of the carbohydrate chains are attached through asparagine residues in the polypeptide chain. The canonical sequence, Asn-X-Thr/Ser, at which such N- linked additions are made (59), can be found at 16 positions in the env gene sequence. These are marked in Fig. 3 and depicted schematically in Fig. 4. A total of 14 potential glycosylation sites are found in gp85 and 2 in gp37. A 17th site exists close to the carboxy-terminus of gp37, but since this is part of the intracytoplasmic tail of the polypeptide, it would not be expected to have carbohydrate added to it. Although the recognition site for glycosylation may not be sufficient to ensure carbohydrate addition, it would seem likely that gp85, with 14 such sites, would be an exceptionally highly glycosylated polypeptide. Previous studies, however, have suggested that gp85 from RSV of subgroup C contains only a small number (2 to 3) of carbohydrate side chains (23, 45); gp85 from subgroup A viruses, on the other hand, have been reported to possess up to 13 oligosaccharide chains (45). To determine whether all or most of the potential glycosylation sites are utilized, we compared the apparent molecular weights of pulse-labeled env precursor synthesized in the presence and absence of TM. The glycosylated precursor detected in such an experiment contains carbohydrate only in the form

11 93 HUNTER ET AL. of the mannose-rich neutral core that is transferred to the polypeptide as a single unit from a lipid-sugar intermediate (52). This approach thus avoids the problems of charge and carbohydrate chain-length heterogeneity that might interfere with size determinations of mature glycoproteins. Since the molecular weight of each core carbohydrate moeity is approximately 2,1 (52), and since the precursor made in the presence of TM completely lacks sugar (i.e., represents the polypeptide alone), an estimate of the number of carbohydrate chains added during translation can be calculated from the formula: number of carbohydrate chains = (molecular weight of glycosylated product - molecular weight of TM product)/2,1. The results of an experiment utilizing the above approach to quantitate the glycosylation of the env gene product of five viral subgroups are summarized in Table 1. It is clear from these data that a significant amount of heterogeneity exists in both the size of the polypeptide core and in the number of carbohydrate chains added. This is particularly notable within the subgroup A viruses in which the non-glycosylated env gene product of RAV-3 is 8, daltons larger than that of PR-A (Table 1), yet is calculated to have six fewer carbohydrate side chains added (15 versus 21). The pr95e"v of RSV PR-C would have 18 carbohydrate side chains by these calculations, which is reasonably consistent with the 16 chains predicted from the nucleotide sequence. Furthermore, the ratio of glucosamine or fucose counts incorporated into gp85 versus gp37 has consistently been observed to be on the order of 6 to 7:1 (3; M. Hardwick and E. Hunter, data not shown). It has been reported that heavily glycosylated proteins can migrate anomalously on polyacrylamide gels (73), but recent work with the influenza virus hemagglutinin (57) and mouse mammary tumor virus env gene precursor (19) suggests that the pulselabeled precursor does in fact migrate according to its molecular weight. It seems likely, therefore, that most of the potential glycosylation sites are occupied in the mature env gene products. Why the original estimates of carbohydrate addition were so low is not clear. Nor is it apparent why the glycoproteins of the subgroup A viruses were consistently calculated to have many more carbohydrate chains than those of the other subgroups. It is possible, since the data presented here suggest no major differences in the extent of glycosylation of subgroup A versus other viral subgroups, that the phenotype of the cells producing the virus (transformed versus nontransformed) or the final stages of carbohydrate processing may have influenced the previous studies. It should be noted that we attempted to avoid these problems by studying viral J. VIROL. precursor polypeptides synthesized in avian sarcoma virus-transformed chicken cells (see above). Location of host range determinants. Certain mutants of RSV, SR-NY 8 and RSV(-), carry extensive deletions in the coding region for the envelope glycoproteins. Studies in which the T1- oligonucleotide fingerprints of the mutants were compared with those of nondefective wild-type virus indicated that spots allelic to PR-C oligonucleotides 18 (nucleotide 5187) through 14 (nucleotide 6814) were deleted in these viruses (21, 83, 84). Since oligonucleotide 18 is located in the signal sequence of the amino-terminal leader (Fig. 3) and oligonucleotide 14 is in the intracytoplasmic tail of gp37 (Fig. 3), the entire coding TABLE 1. Carbohydrate chain addition to env gene products of different strains of RSV Estimated mol Esiaeno ~wt (x13a Estimated no. Subgroup wt(13a of carbohydrate +TM -TM chainsb A PR-A SR-A R(RAV-1) R(RAV-3) B PR-B R(RAV-2) R(RAV-6) C PR-C R(RAV-49) D SR-D CZV-D R(RAV-5) E SR-E PR-E a Cells infected and uniformly transformed by various strains of RSV were incubated in the presence of TM for 2 h. Untreated cells were maintained as a control. Cultures were pulse labeled for 1 min with [3H]leucine, and the env gene products were immunoprecipitated with antiserum to gp85. To allow an accurate comparison of products from different strains, the immunoprecipitations from treated and untreated cultures were electrophoresed on a single SDS-polyacrylamide slab gel. Molecular weights were calculated from the migration of the proteins in comparison with that of Bio-Rad protein standards electrophoresed in adjacent wells. b The number of carbohydrate chains per env precursor polypeptide was calculated as described in the text.

12 VOL. 46, 1983 region for gp85 and gp37 must have been lost in these viruses. Oligonucleotide mapping of host range recombinants from various viruses has pinpointed the coding region responsible for host range determination. Initial studies by Joho et al. (38) indicated that a region encompassing oligonucleotides 38 and 111 was responsible for the subgroup B host range of RSV PR-B. Recent studies by Coffin and co-workers have shown that oligonucleotides 19, 11, 11, and B14 (Fig. 3) segregate strictly with the subgroup C host range of the viruses in a variety of genetic crosses (J. Coffin, personal communication). This would locate the region of the polypeptide that presumably binds with the host cell-coded receptor within the middle one-third of gp85 (approximately 1 amino acids from both amino- and carboxy-termini). The remainder of gp85 and all of gp37, with the exception of its extreme carboxy-terminus, appear from limited sequence data and oligonucleotide mapping data to be quite highly conserved between subgroups. It is of interest that during sequencing of RSV PR-C by the random priming technique as described elsewhere (71a), two distinct but overlapping HaeIII fragments were obtained in approximately equimolar amounts in the env region of the genome. The sequence of these fragments is shown in Fig. 8. Although the overall sequence of this region is conserved, extensive base changes can be observed that extend from the 5' side of oligonucleotide 11 to oligonucleotide 42-the experimentally defined host range coding region. The variation observed does not appear to have resulted from random point mutations since a majority of codon changes (46 of 61) result in different amino acids upon translation, the opposite of what is observed during antigenic drift with influenza viruses (12). It is likely, then, that the virus stock sequenced in this instance consisted of two populations that differed only in that region of the genome specifying host range. This might be expected if a single host range variant was repeatedly backcrossed to a parental virus, and selection was primarily for host range. Such a situation could occur if a uniformly infected culture were contaminated by a host range variant that could avoid interference and undergo multiple cycles of reinfection and genetic recombination. It has not been possible, however, to identify the subgroup specificity of this variant, since the novel oligonucleotides generated from the deletions and point mutations do not correspond to previously analyzed host range-specific oligonucleotides (J. Coffin, personal communication). A comparison of the predicted amino acid sequence of gp37 from SR-A (J. Sorge, personal RSV env GENE SEQUENCE 931 communication) and PR-C indicates that there is extensive conservation of sequences in these two proteins. Nevertheless, the extreme carboxy-termini of the two polypeptides show significant divergence in both amino acid and nucleotide sequences. The amino acid differences are summarized in Fig. 9, in which the carboxyterminal sequences are numbered from the first charged residue that follows the 26-amino-acidlong region that is presumed to span the viral membrane. In the first 18 amino acids, only four changes are observed between SR-A and PR-C, and for the most part (3 of 4), these result from single base changes in the coding sequences. From amino acid 19 (gly in PR-C), however, the amino acid sequences for PR-C and SR-A are completely different, and this is reiterated at the nucleic acid sequence level, where a complete divergence is observed. Furthermore, the PR-C sequence terminates after residue 22, whereas the SR-A polypeptide continues for an additional seven amino acids. A comparison of these exogenous virus sequences with those predicted for the endogenous virus RAV- (34) and for that coded at the chicken chromosomal locus, ev-1 (kindly provided by A. Skalka, Roche Institute, Nutley, N.J.) reveals remarkable homology with the SR-A sequence at both nucleotide and amino acid levels. These data support the concept that the subgroup E and A proteins are more closely related to one another than to those of subgroup C. The question of interactions between the viral glycoproteins and gag gene polvpeptides during assembly is unresolved. Although the deletion mutants SR-NY8 and RSV(-) assemble virions even though they appear to lack all env gene information (21; 83), it is still feasible that a gagenv interaction is normally involved in assembly but is not necessary for it to occur. Indeed, the conservation observed in the first 18 amino acids of the intracytoplasmic segment of gp37 suggests that some such functional determinants exist there. Conclusions. The coding region for the RSV env gene predicts the synthesis of a polypeptide that has many structural features in common with other viral and cellular bitopic membrane proteins. The precursor and the mature polypeptide products, gp85 and gp37, show remarkable similarities to the influenza HA and HA1-HA2 polypeptides with respect to functional organization and size: gp85 (341 amino acids) versus HAl (328 amino acids) and gp37 (198 amino acids) versus HA2 (221 amino acids). Chemical cross-linking (89) and X-ray crystallographic studies (9) have characterized the influenza hemagglutinin as a trimer structure in which HAl, linked to the bitopic HA2 through a disulphide linkage, forms a loop. In this way, both

13 932 HUNTER ET AL. 4 u U. a, 4-4- ba,c.,- IC.)aL 44I- 4- C- CD CD C- C- CD. CD _. LC C.) C.)> S CD CD C- 44 u3-i- 1-~ (A 44 CC3 L). LC.)3-I- OA 4 B-CD CD cdm co CD ur D 4AC.) C.) 1 CD CD >4 : < 4 ( ( >4(DC tno to a, cd ( CD tvo C o s.,n C- ȯc >- CD S-1-L ca- C.Dw,L > -CD 4CM 3CD 4 c CD<:, <cs,< so- >4-4-3CD > >4.~( CD ( >4 <:> s (3-C. C4- C-.C c 4-3- P. c >4: 444 PcnD e L, -CD S <: < 44 *_ C-( _ a.c C- C- 4 < < L4- (A C- C- - ( >4 4.)4-I 4.) CL 3-CD) CL 4-4.(< CD. a4-4-a,.4l Li to CD I-a.<: (D C- C- C- C- o >4 ). >4 ( ( co.-. ) C.) L 44 c(. 44, 4-4- a, C 4 Sa CD to m Cd L) ( ( a, a C-) ( )4 C- cm O C.) C- C-) C.) cm cm -C 4-4:- o ( ( C- C- cc- C CA :* :* L 4 C CD > C- CD C ( C C-. C- C- _ 4-4- _ 4-e C4- c nl) 4 4 n<: -cc ) ~ CD 4 C.D CD a) )I- < s C( CD >1 4- D CD 4-44 < U - 4- m LA4-4.C cc( - -L) ( c4 CL ( CD 4 C- ( C.D C- >4 CD ( >4 _ 4-4- _ ( (o S 4 ( ( 4 _oc-34 (- CDo-~ C- C-) C-) c- C- -) _- C-) C- C-)3-4 _ LA 4-3C-) C- 4:3-( C- 44 < < 44 C-LA a, ( 34 ( C- v C) C-) < c( C- C u C- C C-) U C-) C- SC u S >4 ( ( >4 to CD CD I 4- a, C- C- C- C-. C-) C-L).6 c> C-) C-): > > E4.) (3. C-.4 CCD ( ( >$ >4 U< <4C,mc 4CD ( 4 _o 4 CD ( to_ 4 LA 4-4 CA CDCD C C, ( 4- ( 4- a a-, CD ld L C) C-) CD ( 4 4 ( ( 4 C C-) C-) 44 < < ~~~~~~ J. VIROL. c U) 4) cc. u 4- ZJO to X. C O CDO _ CD C-. ID so _ - *44-' I *-4 L. u C-) L r. = cis - S Z 4 m C- C- cl =o a()cd )Q - CDt _ W5 -d4 OC_ ) C-) C-) C-) C- 4.L. C-). A cc cc ~~~ LO ~~~~,~4) U- Q 4C- C.) CD C- CD. C- = 4.. = OCD CD.c >4- CL >4 (3-4 *QO~~~~~~~~~~~~~ 6 ( to o C- C. C.. > : 4- etn t C- b ZCD C P- L) 4 CX CA a) C-) C-) a) 6 eq >44- _ CD 4CD D _ o6 OoF CD CD a, 4-4-a a CD CDI.4,A f3-( 64.) C- L 6 CCD > L.) C.L) o.c U U _ WI to CD CD C.) -. > CD) _ CD). 1 I. C.) C-) C- CD) - C-) _4cs s > _4 CD 4 ( C s ccd.5 o >4o c fi L.)C) to to ( ( 4A C -C 4-) C- CD C C.)L-C) C4-3 CC.- 4- _ ( C C- C C C.) ec.) s 4. n 44,-F cc. ZC. >4 c4 C- D CD3- C.) 4 CK >4 > o CD ( ( ( C- im < cm > CD (. C- cc- 4C -) C- 4 ( ( 44 a, C- C-. _C. 4 C- Q- ~~~~~c # >CD C 4 a~4' c,-.-4.), CD CD. LA- ct: LA vqy C-) CD) ~ C-)U ooz' s CD CD a-co_v 4. CA C- >., >lb ~, ~ 4Ia 4-) o ~ ~ o m _CD> Q _ z a,s *r'. 6 CL U C- 4-,C..- D. CC 4CD CD C-,3 cc (A) 4) o. = CwS U.ZA,.C13 UL) 4- a,c.) C.) ~ oo O. 3 (4:' 4. co M L >4 L, E

14 VOL. 46, 1983 PRC SRA RAV- ev R-K-M-I-N-S-S-I-N-Y-H-T-E-Y-R-K-M-Q-G-G-A-V RSV env GENE SEQUENCE 933 -R-K-M-I-N-N-S-I-S-Y-H-T-E-Y-K-K-I-Q-K-A-C-G-Q-P-E-S-R-I-V -R-K-M-I-N-N-S-I-S-Y-H-T-E-Y-K-K-L-Q-K-A-C-R-Q-P-E-N-G-A-V -R-K-M-I-N-N-S-I-S-Y-H-T-E-Y-K-K-L-Q-K-A-C-R-Q-P-E-N-E-T-V FIG. 9. Host range variation at the carboxy-terminus of gp37. The amino acid sequences for the carboxyterminal cytoplasmic region of gp37 from four strains of avian sarcoma-leukemia viruses are presented according to the single letter code for amino acids. Residues are numbered from the first charged amino acid that follows the uncharged hydrophobic transmembrane region of gp37. The amino acid sequences for SR-A and ev-1 were deduced from nucleotide sequences kindly provided by J. Sorge, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., and A. Skalka, Roche Institute, Nutley, N.J., respectively. The RAV- sequence is that reported by Hughes (34). Amino acid residues that differ from the PR-C sequence are shown in bold-faced type; those that differ between the sequences of the endogenous viruses and SR-A are underlined. amino- and carboxy-termini of HAI are close to the viral membrane, and the receptor-binding, antigenic portion of the molecule is extended away from the viral surface. For RSV, there is no evidence that the VGP complex is in a trimer structure; indeed the knobbed-spike morphology observed in electron micrographs (3, 9, 1) and chemical cross-linking studies (63) argues against such an organization. Nevertheless, the fact that the host range variable region of gp85 is in the center of the molecule suggests that it may also be in a loop-like structure that is linked through a disulphide bond(s) to the membranespanning gp37. As might be expected, the general organization of the coding regions in the RSV env gene product is very similar to that recently reported for the murine ecotropic and dualtropic retroviruses (11, 32a, 53, 75). It seems unlikely, however, that the tertiary structure of these retroviral glycoproteins is tightly conserved. In the influenza virus hemagglutinins, there is extensive conservation in the placement of cysteine residues (and presumably 3-dimensional structure) even between antigenically unrelated type A and B influenza viruses (46, 85). A similar conservation of cysteines does not exist between the murine and avian retrovirus env gene products, nor is there any extensive sequence homology at the amino acid level. This may reflect a divergence in structure to bind different receptor molecules. The env products of both are, however, rich in cysteines (21 residues in PR-C), and so it is likely that there is extensive intrachain disulphide bridging involved in maintaining the conformation of both classes of molecules. During biosynthesis, the glycosylated env precursor protein undergoes two proteolytic cleavages to yield the mature products. The first of these, in which the leader peptide (which includes the signal sequence) is removed, is presumed to occur during translation and to be carried out by a host-specified signal peptidase. The second cleavage, which generates gp85 and gp37 from p95env, occurs at the carboxyl side of the amino acid sequence gp85-arg-arg-lys- Arg-NH2-gp37, located at the extreme carboxy terminus of gp85. This highly basic cleavage site is similar to that described for several viral membrane proteins (24, 42, 53, 64, 75) and for several peptide hormone precursors (27, 28, 4, 58, 77). In the case offowl plague virus, an avian influenza virus, where cleavage occurs at a similar sequence of basic amino acids, HAl- COOH-Lys-Lys-Arg-Glu-Lys-Arg-NH2-HA2, carboxy-peptidase action can remove all six of the charged residues from the carboxy-terminus of HAl (42). Initial attempts to determine the carboxy-terminus of gp85 have been unsuccessful, and so it is not clear at the present time whether a similar process may be involved with RSV. Cleavage of the hemagglutinin of influenza viruses is essential for infectivity (43, 49), and therefore, it should be interesting to develop mutants of RSV that are unable to cleave pr95env to determine whether a similar restriction applies. In summary, it has been possible, through a combination of protein and nucleic acid biochemistry approaches, to identify several important functional regions on the env gene product of RSV. These studies have pinpointed the potential signals for initiating translocation of the nascent polypeptide across the endoplasmic reticulum, for addition of carbohydrate side chains, and for proteolytic cleavage of the precursor molecule. It has also been possible to identify regions of the glycoprotein that are important in host range determination and in anchoring the VGP complex in the viral membrane. Blobel (7) has suggested that "topogenic signals" exist on bitopic membrane proteins to target them from their site of synthesis to the plasma membrane. The delineation of the env

15 934 HUNTER ET AL. gene-coding region described here and its subcloning and expression in a eucaryotic expression vector (E., Hunter, manuscript in preparation) should provide an excellent system for exploring these hypotheses. ACKNOWLEDGMENTS We thank Tom Cornelius, Jane Ellen Smith, and Karen Butler for expert technical assistance, Melanie Pollard for careful preparation of this manuscript, and John Wills and John Coffin for discussions and critical reviews of the manuscript. J. Coffin, S. Hughes, J. Sorge, and A. Skalka generously provided data before publication. This work was supported by grant CA from the National Cancer Institute. E.H. is the recipient of research career development award CA 685 from the National Cancer Institute. LITERATURE CITED 1. Anderson, S. M., and J. H. Chen In vitro translation of avian myeloblastosis virus RNA. J. Virol. 4: Austen, B. M Predicted secondary structures of amino-terminal extension sequences of secreted proteins. FEBS Lett. 13: Bernhard, W Electron microscopy of tumor cells and tumor viruses. A review. Cancer Res. 18: Bhown, A. S., J. C. Bennett, J. E. Mole, and E. Hunter Purification and characterization of the gag gene products of avian-type C retroviruses by high-pressure liquid chromatography. Anal. Biochem. 112: Bhown, A. S., T. W. Cornelius, J. E. Mole, J. D. Lynn, W. A. Tidwell, and J. C. Bennett A simple modification on the vacuum system of the Beckman automated sequencer to improve the efficiency of Edman degradation. Anal. Biochem. 12: Bhown, A. S., J. E. Mole, F. Hunter, and J. C. Bennett High-sensitivity sequence determination of proteins quantitatively recovered from sodium dodecyl sulfate gels using an improved electrodialysis procedure. Anal. Biochem. 13: Blobel, G Intracellular protein topogenesis. Proc. Natl. Acad. Sci. U.S.A. 77: Blobel, G., and B. Dobberstein Transfer of proteins across membranes. I. Presence of proteolytically processed and unprocessed nascent immunoglobulin light chains on membrane-bound ribosomes of murine myeloma. J. Cell Biol. 67: Bolognesi, D. P Structural components of RNA tumor virus. Adv. Virus Res. 19: Bolognesi, D. P., H. Bauer, H. Gelderblom, and H. Gudrun Polypeptides of avian RNA tumor virus. IV. Components of the viral envelope. Virology 47: Bosselman, R. A., F. van Straaten, C. Van Beveren, I. M. Verma, and M. Vogt Analysis of the env gene of a molecularly cloned and biologically active Moloney mink cell focus-forming proviral DNA. J. Virol. 44: Both, G. W., and M. J. Sleigh Conservation and variation in the hemagglutinins of Hong Kong subtype influenza viruses during antigenic drift. J. Virol. 39: Braell, W. A., and H. F. Lodish The erythrocyte anion transport protein is cotranslationally inserted into microsomes. Cell 28: Buchhagen, D. L., and H. Hanafusa Intracellular precursors to the major glycoprotein of avian oncoviruses in chicken embryo fibroblasts. J. Virol. 25: Coffin, J. M., M. Champion, and F. Chabot Nucleotide sequence relationships between the genomes of an endogenous and an exogenous avian tumor virus. J. Virol. 28: Crittenden, L. B Two levels of genetic resistance to lymphoid leukosis. Avian Dis. 19: J. VIROL. 17. Cullen, S. E., and B. D. Schwartz An improved method for isolation of H-2 and Ia alloantigens with immunoprecipitation induced byprotein A-bearing staphylococci. J. Immunol. 117: Czernilofsky, A. P., A. D. Levinson, H. E. Varmus, J. M. Bishop, E. Tischer, and H. M. Goodman Nucleotide sequence of an avian sarcoma virus oncogene src and proposed amino acid sequence for gene product. Nature (London) 287: Dickson, C., and M. Atterwill Structure and processing of the mouse mammary tumor virus glycoprotein precursor Pr73e"n. J. Virol. 35: Diggelmann, H Biosynthesis of an unglycosylated envelope glycoprotein of Rous sarcoma virus in the presence of tunicamycin. J. Virol. 2: Duesberg, P. H., S. Kawal, L. H. Wang, P. K. Vogt, H. M. Murphy, and H. Hanafusa RNA of replicationdefective strains of Rous sarcoma virus. Proc. Natl. Acad. Sci. U.S.A. 72: England, J. M., D. P. Bolognesi, B. Dietzschold, and M. S. Halpern Evidence that a precursor glycoprotein is cleaved to yield the major glycoprotein of avian tumor virus. J. Virol. 21: Galehouse, D. M., and P. H. Duesberg Glycoproteins of avian tumor virus recombinants: evidence for intragenic crossing-over. J. Virol. 25: Garoff, H., A. M. Frischauf, K. Simons, H. Lehrach, and H. Delius Nucleotide sequence of cdna coding for Semliki Forest virus membrane glycoproteins. Nature (London) 288: Gething, M. J., J. Bye, J. Skehel, and M. Waterfield Cloning and DNA sequence of double-stranded copies of hemagglutinin genes from H2 and H3 strains elucidates antigenic shift and drift in human influenza virus. Nature (London) 287: Gething, M. J., J. M. White, and M. D. Waterfield Purification of the fusion protein of Sendai virus: analysis of the NH2-terminal sequence generated during precursor activation. Proc. NatI. Acad. Sci. U.S.A. 75: Gregory, R. A., and H. J. Tracy Isolation of two "big gastrins" from Zollinger-Ellison tumour tissue. Lancet ii: Habener, J. F., H. T. Chang, and J. T. Potts, Jr Enzymic processing of proparathyroid hormone by cellfree extracts of parathyroid glands. Biochemistry 16: Hackett, P. B., R. Swanstrom, H. E. Varmus, and J. M. Bishop The leader sequence of the subgenomic mrna's of Rous sarcoma virus is approximately 39 nucleotides. J. Virol. 41: Hardwick, J. M., and E. Hunter Rous sarcoma virus mutant LA3382 is defective in virion glycoprotein assembly. J. Virol. 4: Hayman, M. F Synthesis and processing of avian sarcoma virus glycoproteins. Virology 85: Hayward, W. S Size and genetic content of viral RNAs in avian oncovirus-infected cells. J. Virol. 24: a.Herr, W., V. Corbin, and W. Gilbert Nucleotide sequence of the 3' half of AKV. Nucleic Acids Res. 1: Hu, S. S. F., M. M. C. Lai, and P. K. Vogt Characterization of the env gene in avian oncoviruses by heteroduplex mapping. J. Virol. 27: Hughes, S. H Sequence of the long terminal repeat and adjacent segments of the endogenous avian virus Rous-associated virus. J. Virol. 43: Hunt, L. A., S. E. Wright, J. R. Etchinson, and D. F. Summers Oligosaccharide chains of avian RNA tumor virus glycoproteins contain heterogenous oligomannosyl cores. J. Virol. 29: Hunter, E Biological techniques for avian sarcoma viruses. Methods Enzymol. 58: Hunter, E., M. J. Hayman, R. W. Rongey, and P. K. Vogt An avian sarcoma virus mutant which is tempera-

16 VOL. 46, 1983 RSV env GENE SEQUENCE 935 ture sensitive for virion assembly. Virology 69: Joho, R. H., M. A. Billeter, and C. Weissman Mapping of biological functions on RNA of avian tumor viruses: location of regions required for transformation and determination of host range. Proc. Natl. Acad. Sci. U.S.A. 72: Katz, R. A., C. A. Omer, J. H. Weis, S. A. Mitsialis, A. J. Faras, and R. V. Guntaka Restriction endonuclease and nucleotide sequence analyses of molecularly cloned unintegrated avian tumor virus DNA: structure of large terminal repeats in circle junctions. J. Virol. 42: Kemmler, W., D. F. Steiner, and J. Borg Studies on the conversion of proinsulin to insulin. J. Biol. Chem. 248: Klemenz, R., and H. Diggelmann The generation of two envelope glycoproteins of Rous sarcoma virus from a common precursor polypeptide. Virology 85: Klenk, H. D., W. Garten, W. Keil, H. Niemann, F. X. Bosch, R. T. Schwarz, C. Scholtissek, and R. Rott Processing of the hemagglutinin in genetic variation among influenza viruses, p In Debi P. Nayak (ed.), Genetic variation among influenza viruses. Academic Press, Inc. New York. 43. Klenk, H. D., and R. Rott Cotranslational and postranslational processing of viral glycoproteins. Curr. Top. Microbiol. Immunol. 9: Kozak, M Possible role of flanking nucleotides in recognition of the AUG initiator codon by eukaryotic ribosomes. Nucleic Acids Res. 9: Krantz, M. J., Y. C. Lee, and P. P. Hung Characterization and comparison of the major glycoprotein from three strains of Rous sarcoma virus. Arch. Biochem. Biophys. 174: Krystal, M., R. M. Elliot, E. W. Benz, Jr., J. F. Young, and P. Palese Evolution of influenza A and B viruses: Conservation of structural features in the hemagglutinin genes. Proc. Natl. Acad. Sci. U.S.A. 79: Lai, M. M. C., and P. H. Duesberg Adenylic acidrich sequence in RNAs of Rous sarcoma virus and Rauscher mouse leukaemia virus. Nature (London) 25: Lai, M. M. C., and P. H. Duesberg Differences between the envelope glycoproteins and glycopeptides of avian tumor viruses released from transformed and from nontransformed cells. Virology 5: Lazarowitz, S. G., and P. W. Choppin Enhancement of infectivity of influenza A and B viruses by proteolytic cleavage of the hemagglutinin polypeptide. Virology 68: Leamnson, R. N., and M. S. Halpern Subunit structure of the glycoprotein complex of avian tumor viruses. J. Virol. 18: Lee, J. S., H. E. Varmus, and J. M. Bishop Virusspecific messenger RNAs in permissive cells infected by avian sarcoma virus. J. Biol. Chem. 254: Lennarz, N. J Lipid linked sugars in glycoprotein synthesis. Science 188: Lenz, J., Crowther, R., A. Straceski, and W. Haseltine Nucleotide sequence of the Akv env gene. J. Virol. 42: Mellon, P., and P. H. Duesberg Subgenomic, cellular Rous sarcoma virus RNAs contain oligonucleotides from the 3' half and the 5' terminus of virion RNA. Nature (London) 27: Min Jou, W., M. Verhoeyen, R. Devos, E. Saman, F. Fang, D. Huylebroeck, and W. Fiers Complete structure of the hemagglutinin gene from the human influenza A/ Victoria/3/75 (H3N2) strain as determined from cloned DNA. Cell 19: Moelling, K., and M. Hayami Analysis of precursors to the envelope glycoproteins of avian RNA tumor viruses in chicken and quail cells. J. Virol. 22: Nakamura, K., and R. W. Compans Effects of glucosamine, 2-deoxyglucose, and tunicamycin on glycosylation sulfation, and assembly of influenza viral proteins. Virology 84: Nakanishi, S., A. Inoue, T. Kita, M. Nakamura, A. C. Y. Chang, S. N. Cohen, and S. Numa Nucleotide sequence of cloned cdna for bovine corticotropin-blipotropin precursor. Nature (London) 278: Neuberger, A., A. Gottschalk, R. D. Marshall, and R. Spiro Carbohydrate-peptide linkages in glycoproteins and methods for their elucidation, p In A. Gottschalfk (ed.), The glycoproteins: their composition, structure and function. Elsevier/North Holland Publishing Co., Amsterdam. 6. Pani, P. K Genetics of resistance of fowl to infection by RNA tumor viruses. Proc. R. Soc. Med. 69: Pauli, G., W. Rohde, and E. Harms The structure of the Rous sarcoma virus glycoprotein complex. Arch. Virol. 58: Pawson, T., P. Meilon, P. H. Duesberg, and G. S. Martin env gene of Rous sarcoma virus: identification of the gene product by cell-free translation. J. Virol. 33: Pepinsky, R. B., D. Capiello, C. Wilkowski, and V. M. Vogt Chemical cross-linking of proteins in avian sarcoma and leukemia viruses. Virology 12: Porter, A. G., C. Barber, N. H. Carey, R. A. Hallewell, G. Threlfall, and J. S. Emtage Complete nucleotide sequence of an influenza virus heamagglutinin gene from cloned DNA. Nature (London) 282: Purchio, A. F., E. Erikson, and R. L. Erikson Translation of 35S and of subgenomic regions of avian sarcoma virus RNA. Proc. Natl. Acad. Sci. U.S.A. 74: Purchio, A. F., S. Jovanovich, and R. L. Erikson Sites of synthesis of viral proteins in avian sarcoma virusinfected chicken cells. J. Virol. 35: Rettenmier, C. W., S. M. Anderson, M. W. Riemen, and H. Hanafusa gag-related polypeptides encoded by replication-defective avian oncoviruses. J. Virol. 32: Rifkin, D. B., and R. W. Compans Identification of the spike proteins of Rous sarcoma virus. Virology 46: Rose, J. K., and C. J. Gallione Nucleotide sequences of the mrna's encoding the vesicular stomatitis virus G and M proteins determined from cdna clones containing the complete coding regions. J. Virol. 39: Rothman, J. E., F. N. Katz, and H. F. Lodish Glycosylation of a membrane protein is restricted to the growing polypeptide chain but is not necessary for insertion as a transmembrane protein. Cell 15: Rothman, J. E., and H. F. Lodish Synchronised transmembrane insertion and glycosylation of a nascent membrane protein. Nature (London) 269: a.Schwarz, D., R. Tizard, and W. Gilbert Nucleotide sequence of Rous sarcoma virus. Cell 32: Segrest, J. P. and R. J. Feldmann Membrane proteins: amino acid sequence and membrane penetration. J. Mol. Biol. 87: Segrest, J. P., R. L. Jackson, E. P. Andrews, and V. T. Marches Human erythrocyte membrane glycoprotein: a reevaluation of the molecular weight by SDSpolyacrylamide gel electrophoresis. Biochem. Biophys. Res. Commun. 44: Shealy, D. J., and R. R. Rueckert Proteins of Rousassociated virus 61, an avian retrovirus: common precursor for glycoproteins gp85 and gp35 and use of pactamycin to map translational order of proteins in the gag, pol, and env genes. J. Virol. 26: Shinnick, T. M., R. A. Lerner, and J. G. Sutcliffe Nucleotide sequence of Moloney murine leukaemia virus. Nature (London) 293: Stohrer, R., and E. Hunter Inhibition of Rous sarcoma virus replication by 2-deoxyglucose and tunicamycin: identification of an unglycosylated env gene prod-

17 936 HUNTER ET AL. J. VIROL. uct. J. Virol. 32: Tager, H. S., and D. F. Steiner Isolation of a glucagon-containing peptide: primary structure of a possible fragment of proglucagon. Proc. Natl. Acad. Sci. U.S.A. 7: Takatsuki, A., and G. Tamura Effect of tunicamycin on the synthesis of macro-molecules in cultures of chick embryo fibroblasts infected with Newcastle disease virus. J. Antibiot. 24: Tkacz, J. S., and J.. Lampen Tunicamycin inhibition of polyisoprenyl N-acetyl-glucosaminyl pyrophosphate formation in calf-liver microsomes. Biochem. Biophys. Res. Commun. 65: Tomita, M., and V. T. Marchesi Amino-acid sequence and oligosaccharide attachment sites of human erythrocyte glycophorin. Proc. Natl. Acad. Sci. U.S.A. 72: Tozawa, H., H. Bauer, T. Graf, and H. Gelderblom Strain-specific antigen of the avian leukosis sarcoma virus group. I. Isolation and immunological characterization. Virology 4: Vogt, P. K The genetics of RNA tumor viruses, p In H. Fraenkel-Conrat and R. R. Wagner (ed.), Comprehensive virology, vol. 9. Plenum Publishing Corp., New York. 83. Wang, L. H., P. H. Duesberg, S. Kawai, and H. Hanafusa The location of envelope-specific and sarcomaspecific oligonucleotides on the RNA of Schmidt-Ruppin Rous sarcoma virus. Proc. Natl. Acad. Sci. U.S.A. 73: Wang, L.-H., and D. W. Stacey Participation of subgenomic retroviral mrnas in recombination. J. Virol. 41: Ward, C. W Structure of the influenza virus hemagglutinin. Curr. Top. Microbiol. Immunol. 94/95: Weiss, S. R., H. E. Varmus, and J. M. Bishop The size and genetic composition of virus-specific RNAs in the cytoplasm of cells producing avian sarcoma-leukosis viruses. Cell 12: White, J., K. Matlin, and A. Helenius Cell fusion by Semliki Forest, influenza, and vesicular stomatitis viruses. J. Cell Biol. 89: Wickner, W The assembly of proteins into biological membranes: the membrane trigger hypothesis. Annu. Rev. Biochem. 48: Wiley, D. C., J. J. Skehel, and M. Waterfield Evidence from studies with a cross-linking reagent that the haemagglutinin of influenza virus is a trimer. Virology 79: Wilson, I. A., J. J. Skehel, and D. C. Wiley Structure of the hemagglutining membrane glycoprotein of influenza virus at 3 A resolution. Nature (London) 289:

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below).

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below). Protein Synthesis Instructions The purpose of today s lab is to: Understand how a cell manufactures proteins from amino acids, using information stored in the genetic code. Assemble models of four very

More information

Disease and selection in the human genome 3

Disease and selection in the human genome 3 Disease and selection in the human genome 3 Ka/Ks revisited Please sit in row K or forward RBFD: human populations, adaptation and immunity Neandertal Museum, Mettman Germany Sequence genome Measure expression

More information

Lecture 19A. DNA computing

Lecture 19A. DNA computing Lecture 19A. DNA computing What exactly is DNA (deoxyribonucleic acid)? DNA is the material that contains codes for the many physical characteristics of every living creature. Your cells use different

More information

Electronic Supplementary Information

Electronic Supplementary Information Electronic Supplementary Material (ESI) for Molecular BioSystems. This journal is The Royal Society of Chemistry 2017 Electronic Supplementary Information Dissecting binding of a β-barrel outer membrane

More information

ORFs and genes. Please sit in row K or forward

ORFs and genes. Please sit in row K or forward ORFs and genes Please sit in row K or forward https://www.flickr.com/photos/teseum/3231682806/in/photostream/ Question: why do some strains of Vibrio cause cholera and others don t? Methods Mechanisms

More information

NAME:... MODEL ANSWER... STUDENT NUMBER:... Maximum marks: 50. Internal Examiner: Hugh Murrell, Computer Science, UKZN

NAME:... MODEL ANSWER... STUDENT NUMBER:... Maximum marks: 50. Internal Examiner: Hugh Murrell, Computer Science, UKZN COMP710, Bioinformatics with Julia, Test One, Thursday the 20 th of April, 2017, 09h30-11h30 1 NAME:...... MODEL ANSWER... STUDENT NUMBER:...... Maximum marks: 50 Internal Examiner: Hugh Murrell, Computer

More information

Lecture 11: Gene Prediction

Lecture 11: Gene Prediction Lecture 11: Gene Prediction Study Chapter 6.11-6.14 1 Gene: A sequence of nucleotides coding for protein Gene Prediction Problem: Determine the beginning and end positions of genes in a genome Where are

More information

Lecture 10, 20/2/2002: The process of solution development - The CODEHOP strategy for automatic design of consensus-degenerate primers for PCR

Lecture 10, 20/2/2002: The process of solution development - The CODEHOP strategy for automatic design of consensus-degenerate primers for PCR Lecture 10, 20/2/2002: The process of solution development - The CODEHOP strategy for automatic design of consensus-degenerate primers for PCR 1 The problem We wish to clone a yet unknown gene from a known

More information

Homework. A bit about the nature of the atoms of interest. Project. The role of electronega<vity

Homework. A bit about the nature of the atoms of interest. Project. The role of electronega<vity Homework Why cited articles are especially useful. citeulike science citation index When cutting and pasting less is more. Project Your protein: I will mail these out this weekend If you haven t gotten

More information

Project 07/111 Final Report October 31, Project Title: Cloning and expression of porcine complement C3d for enhanced vaccines

Project 07/111 Final Report October 31, Project Title: Cloning and expression of porcine complement C3d for enhanced vaccines Project 07/111 Final Report October 31, 2007. Project Title: Cloning and expression of porcine complement C3d for enhanced vaccines Project Leader: Dr Douglas C. Hodgins (519-824-4120 Ex 54758, fax 519-824-5930)

More information

G+C content. 1 Introduction. 2 Chromosomes Topology & Counts. 3 Genome size. 4 Replichores and gene orientation. 5 Chirochores.

G+C content. 1 Introduction. 2 Chromosomes Topology & Counts. 3 Genome size. 4 Replichores and gene orientation. 5 Chirochores. 1 Introduction 2 Chromosomes Topology & Counts 3 Genome size 4 Replichores and gene orientation 5 Chirochores 6 7 Codon usage 121 marc.bailly-bechet@univ-lyon1.fr Bacterial genome structures Introduction

More information

Protein Structure Analysis

Protein Structure Analysis BINF 731 Protein Structure Analysis http://binf.gmu.edu/vaisman/binf731/ Iosif Vaisman COMPUTATIONAL BIOLOGY COMPUTATIONAL STRUCTURAL BIOLOGY COMPUTATIONAL MOLECULAR BIOLOGY BIOINFORMATICS STRUCTURAL BIOINFORMATICS

More information

Supplementary. Table 1: Oligonucleotides and Plasmids. complementary to positions from 77 of the SRα '- GCT CTA GAG AAC TTG AAG TAC AGA CTG C

Supplementary. Table 1: Oligonucleotides and Plasmids. complementary to positions from 77 of the SRα '- GCT CTA GAG AAC TTG AAG TAC AGA CTG C Supplementary Table 1: Oligonucleotides and Plasmids 913954 5'- GCT CTA GAG AAC TTG AAG TAC AGA CTG C 913955 5'- CCC AAG CTT ACA GTG TGG CCA TTC TGC TG 223396 5'- CGA CGC GTA CAG TGT GGC CAT TCT GCT G

More information

Figure S1. Characterization of the irx9l-1 mutant. (A) Diagram of the Arabidopsis IRX9L gene drawn based on information from TAIR (the Arabidopsis

Figure S1. Characterization of the irx9l-1 mutant. (A) Diagram of the Arabidopsis IRX9L gene drawn based on information from TAIR (the Arabidopsis 1 2 3 4 5 6 7 8 9 10 11 12 Figure S1. Characterization of the irx9l-1 mutant. (A) Diagram of the Arabidopsis IRX9L gene drawn based on information from TAIR (the Arabidopsis Information Research). Exons

More information

Det matematisk-naturvitenskapelige fakultet

Det matematisk-naturvitenskapelige fakultet UNIVERSITETET I OSLO Det matematisk-naturvitenskapelige fakultet Exam in: MBV4010 Arbeidsmetoder i molekylærbiologi og biokjemi I MBV4010 Methods in molecular biology and biochemistry I Day of exam: Friday

More information

Codon Bias with PRISM. 2IM24/25, Fall 2007

Codon Bias with PRISM. 2IM24/25, Fall 2007 Codon Bias with PRISM 2IM24/25, Fall 2007 from RNA to protein mrna vs. trna aminoacid trna anticodon mrna codon codon-anticodon matching Watson-Crick base pairing A U and C G binding first two nucleotide

More information

Hes6. PPARα. PPARγ HNF4 CD36

Hes6. PPARα. PPARγ HNF4 CD36 SUPPLEMENTARY INFORMATION Supplementary Table Positions and Sequences of ChIP primers -63 AGGTCACTGCCA -79 AGGTCTGCTGTG Hes6-0067 GGGCAaAGTTCA ACOT -395 GGGGCAgAGTTCA PPARα -309 GGCTCAaAGTTCAaGTTCA CPTa

More information

RPA-AB RPA-C Supplemental Figure S1: SDS-PAGE stained with Coomassie Blue after protein purification.

RPA-AB RPA-C Supplemental Figure S1: SDS-PAGE stained with Coomassie Blue after protein purification. RPA-AB RPA-C (a) (b) (c) (d) (e) (f) Supplemental Figure S: SDS-PAGE stained with Coomassie Blue after protein purification. (a) RPA; (b) RPA-AB; (c) RPA-CDE; (d) RPA-CDE core; (e) RPA-DE; and (f) RPA-C

More information

PCR analysis was performed to show the presence and the integrity of the var1csa and var-

PCR analysis was performed to show the presence and the integrity of the var1csa and var- Supplementary information: Methods: Table S1: Primer Name Nucleotide sequence (5-3 ) DBL3-F tcc ccg cgg agt gaa aca tca tgt gac tg DBL3-R gac tag ttt ctt tca ata aat cac tcg c DBL5-F cgc cct agg tgc ttc

More information

Supplemental material

Supplemental material Supplemental material Diversity of O-antigen repeat-unit structures can account for the substantial sequence variation of Wzx translocases Yaoqin Hong and Peter R. Reeves School of Molecular Bioscience,

More information

Quantitative reverse-transcription PCR. Transcript levels of flgs, flgr, flia and flha were

Quantitative reverse-transcription PCR. Transcript levels of flgs, flgr, flia and flha were 1 Supplemental methods 2 3 4 5 6 7 8 9 1 11 12 13 14 15 16 17 18 19 21 22 23 Quantitative reverse-transcription PCR. Transcript levels of flgs, flgr, flia and flha were monitored by quantitative reverse-transcription

More information

Supplemental Data Supplemental Figure 1.

Supplemental Data Supplemental Figure 1. Supplemental Data Supplemental Figure 1. Silique arrangement in the wild-type, jhs, and complemented lines. Wild-type (WT) (A), the jhs1 mutant (B,C), and the jhs1 mutant complemented with JHS1 (Com) (D)

More information

Supplemental Table 1. Mutant ADAMTS3 alleles detected in HEK293T clone 4C2. WT CCTGTCACTTTGGTTGATAGC MVLLSLWLIAAALVEVR

Supplemental Table 1. Mutant ADAMTS3 alleles detected in HEK293T clone 4C2. WT CCTGTCACTTTGGTTGATAGC MVLLSLWLIAAALVEVR Supplemental Dataset Supplemental Table 1. Mutant ADAMTS3 alleles detected in HEK293T clone 4C2. DNA sequence Amino acid sequence WT CCTGTCACTTTGGTTGATAGC MVLLSLWLIAAALVEVR Allele 1 CCTGTC------------------GATAGC

More information

Supporting information for Biochemistry, 1995, 34(34), , DOI: /bi00034a013

Supporting information for Biochemistry, 1995, 34(34), , DOI: /bi00034a013 Supporting information for Biochemistry, 1995, 34(34), 10807 10815, DOI: 10.1021/bi00034a013 LESNIK 10807-1081 Terms & Conditions Electronic Supporting Information files are available without a subscription

More information

Supporting Information

Supporting Information Supporting Information Transfection of DNA Cages into Mammalian Cells Email: a.turberfield@physics.ox.ac.uk Table of Contents Supporting Figure 1 DNA tetrahedra used in transfection experiments 2 Supporting

More information

Arabidopsis actin depolymerizing factor AtADF4 mediates defense signal transduction triggered by the Pseudomonas syringae effector AvrPphB

Arabidopsis actin depolymerizing factor AtADF4 mediates defense signal transduction triggered by the Pseudomonas syringae effector AvrPphB Arabidopsis actin depolymerizing factor mediates defense signal transduction triggered by the Pseudomonas syringae effector AvrPphB Files in this Data Supplement: Supplemental Table S1 Supplemental Table

More information

Supplement 1: Sequences of Capture Probes. Capture probes were /5AmMC6/CTG TAG GTG CGG GTG GAC GTA GTC

Supplement 1: Sequences of Capture Probes. Capture probes were /5AmMC6/CTG TAG GTG CGG GTG GAC GTA GTC Supplementary Appendixes Supplement 1: Sequences of Capture Probes. Capture probes were /5AmMC6/CTG TAG GTG CGG GTG GAC GTA GTC ACG TAG CTC CGG CTG GA-3 for vimentin, /5AmMC6/TCC CTC GCG CGT GGC TTC CGC

More information

for Programmed Chemo-enzymatic Synthesis of Antigenic Oligosaccharides

for Programmed Chemo-enzymatic Synthesis of Antigenic Oligosaccharides Supporting Information Design of α-transglucosidases of Controlled Specificity for Programmed Chemo-enzymatic Synthesis of Antigenic Oligosaccharides Elise Champion ±,,,, Isabelle André ±,,, Claire Moulis

More information

PGRP negatively regulates NOD-mediated cytokine production in rainbow trout liver cells

PGRP negatively regulates NOD-mediated cytokine production in rainbow trout liver cells Supplementary Information for: PGRP negatively regulates NOD-mediated cytokine production in rainbow trout liver cells Ju Hye Jang 1, Hyun Kim 2, Mi Jung Jang 2, Ju Hyun Cho 1,2,* 1 Research Institute

More information

Multiplexing Genome-scale Engineering

Multiplexing Genome-scale Engineering Multiplexing Genome-scale Engineering Harris Wang, Ph.D. Department of Systems Biology Department of Pathology & Cell Biology http://wanglab.c2b2.columbia.edu Rise of Genomics An Expanding Toolbox Esvelt

More information

Dierks Supplementary Fig. S1

Dierks Supplementary Fig. S1 Dierks Supplementary Fig. S1 ITK SYK PH TH K42R wt K42R (kinase deficient) R29C E42K Y323F R29C E42K Y323F (reduced phospholipid binding) (enhanced phospholipid binding) (reduced Cbl binding) E42K Y323F

More information

Supplementary Figure 1A A404 Cells +/- Retinoic Acid

Supplementary Figure 1A A404 Cells +/- Retinoic Acid Supplementary Figure 1A A44 Cells +/- Retinoic Acid 1 1 H3 Lys4 di-methylation SM-actin VEC cfos (-) RA (+) RA 14 1 1 8 6 4 H3 Lys79 di-methylation SM-actin VEC cfos (-) RA (+) RA Supplementary Figure

More information

Expression of Recombinant Proteins

Expression of Recombinant Proteins Expression of Recombinant Proteins Uses of Cloned Genes sequencing reagents (eg, probes) protein production insufficient natural quantities modify/mutagenesis library screening Expression Vector Features

More information

Lezione 10. Bioinformatica. Mauro Ceccanti e Alberto Paoluzzi

Lezione 10. Bioinformatica. Mauro Ceccanti e Alberto Paoluzzi Lezione 10 Bioinformatica Mauro Ceccanti e Alberto Paoluzzi Dip. Informatica e Automazione Università Roma Tre Dip. Medicina Clinica Università La Sapienza Lezione 10: Sintesi proteica Synthesis of proteins

More information

Supplementary Information. Construction of Lasso Peptide Fusion Proteins

Supplementary Information. Construction of Lasso Peptide Fusion Proteins Supplementary Information Construction of Lasso Peptide Fusion Proteins Chuhan Zong 1, Mikhail O. Maksimov 2, A. James Link 2,3 * Departments of 1 Chemistry, 2 Chemical and Biological Engineering, and

More information

Primer Design Workshop. École d'été en géné-que des champignons 2012 Dr. Will Hintz University of Victoria

Primer Design Workshop. École d'été en géné-que des champignons 2012 Dr. Will Hintz University of Victoria Primer Design Workshop École d'été en géné-que des champignons 2012 Dr. Will Hintz University of Victoria Scenario You have discovered the presence of a novel endophy5c organism living inside the cells

More information

Cat. # Product Size DS130 DynaExpress TA PCR Cloning Kit (ptakn-2) 20 reactions Box 1 (-20 ) ptakn-2 Vector, linearized 20 µl (50 ng/µl) 1

Cat. # Product Size DS130 DynaExpress TA PCR Cloning Kit (ptakn-2) 20 reactions Box 1 (-20 ) ptakn-2 Vector, linearized 20 µl (50 ng/µl) 1 Product Name: Kit Component TA PCR Cloning Kit (ptakn-2) Cat. # Product Size DS130 TA PCR Cloning Kit (ptakn-2) 20 reactions Box 1 (-20 ) ptakn-2 Vector, linearized 20 µl (50 ng/µl) 1 2 Ligation Buffer

More information

www.lessonplansinc.com Topic: Gene Mutations WS Summary: Students will learn about frame shift mutations and base substitution mutations. Goals & Objectives: Students will be able to demonstrate how mutations

More information

National PHL TB DST Reference Center PSQ Reporting Language Table of Contents

National PHL TB DST Reference Center PSQ Reporting Language Table of Contents PSQ Reporting Language Table of Contents Document Page Number PSQ for Rifampin 2-6 Comparison table for rpob Codon Numbering 2 rpob mutation list (new numbering system) 3-5 rpob interpretations 6 PSQ for

More information

PROTEIN SYNTHESIS Study Guide

PROTEIN SYNTHESIS Study Guide PART A. Read the following: PROTEIN SYNTHESIS Study Guide Protein synthesis is the process used by the body to make proteins. The first step of protein synthesis is called Transcription. It occurs in the

More information

Supporting Information

Supporting Information Supporting Information Barderas et al. 10.1073/pnas.0801221105 SI Text: Docking of gastrin to Constructed scfv Models Interactive predocking of the 4-WL-5 motif into the central pocket observed in the

More information

strain devoid of the aox1 gene [1]. Thus, the identification of AOX1 in the intracellular

strain devoid of the aox1 gene [1]. Thus, the identification of AOX1 in the intracellular Additional file 2 Identification of AOX1 in P. pastoris GS115 with a Mut s phenotype Results and Discussion The HBsAg producing strain was originally identified as a Mut s (methanol utilization slow) strain

More information

ΔPDD1 x ΔPDD1. ΔPDD1 x wild type. 70 kd Pdd1. Pdd3

ΔPDD1 x ΔPDD1. ΔPDD1 x wild type. 70 kd Pdd1. Pdd3 Supplemental Fig. S1 ΔPDD1 x wild type ΔPDD1 x ΔPDD1 70 kd Pdd1 50 kd 37 kd Pdd3 Supplemental Fig. S1. ΔPDD1 strains express no detectable Pdd1 protein. Western blot analysis of whole-protein extracts

More information

Supporting Online Information

Supporting Online Information Supporting Online Information Isolation of Human Genomic DNA Sequences with Expanded Nucleobase Selectivity Preeti Rathi, Sara Maurer, Grzegorz Kubik and Daniel Summerer* Department of Chemistry and Chemical

More information

Supporting Information

Supporting Information Supporting Information Table S1. Oligonucleotide sequences used in this work Oligo DNA A B C D CpG-A CpG-B CpG-C CpG-D Sequence 5 ACA TTC CTA AGT CTG AAA CAT TAC AGC TTG CTA CAC GAG AAG AGC CGC CAT AGT

More information

Supplementary Figure 1. Localization of MST1 in RPE cells. Proliferating or ciliated HA- MST1 expressing RPE cells (see Fig. 5b for establishment of

Supplementary Figure 1. Localization of MST1 in RPE cells. Proliferating or ciliated HA- MST1 expressing RPE cells (see Fig. 5b for establishment of Supplementary Figure 1. Localization of MST1 in RPE cells. Proliferating or ciliated HA- MST1 expressing RPE cells (see Fig. 5b for establishment of the cell line) were immunostained for HA, acetylated

More information

Supplemental Data. mir156-regulated SPL Transcription. Factors Define an Endogenous Flowering. Pathway in Arabidopsis thaliana

Supplemental Data. mir156-regulated SPL Transcription. Factors Define an Endogenous Flowering. Pathway in Arabidopsis thaliana Cell, Volume 138 Supplemental Data mir156-regulated SPL Transcription Factors Define an Endogenous Flowering Pathway in Arabidopsis thaliana Jia-Wei Wang, Benjamin Czech, and Detlef Weigel Table S1. Interaction

More information

Add 5µl of 3N NaOH to DNA sample (final concentration 0.3N NaOH).

Add 5µl of 3N NaOH to DNA sample (final concentration 0.3N NaOH). Bisulfite Treatment of DNA Dilute DNA sample to 2µg DNA in 50µl ddh 2 O. Add 5µl of 3N NaOH to DNA sample (final concentration 0.3N NaOH). Incubate in a 37ºC water bath for 30 minutes. To 55µl samples

More information

Thr Gly Tyr. Gly Lys Asn

Thr Gly Tyr. Gly Lys Asn Your unique body characteristics (traits), such as hair color or blood type, are determined by the proteins your body produces. Proteins are the building blocks of life - in fact, about 45% of the human

More information

Converting rabbit hybridoma into recombinant antibodies with effective transient production in an optimized human expression system

Converting rabbit hybridoma into recombinant antibodies with effective transient production in an optimized human expression system Converting rabbit hybridoma into recombinant antibodies with effective transient production in an optimized human expression system Dr. Tim Welsink Molecular Biology Transient Gene Expression OUTLINE Short

More information

Anti-Pim-1 (Cat#3247), anti-met (Cat#3127), anti-ron (Cat#2654), Anti-EGFR

Anti-Pim-1 (Cat#3247), anti-met (Cat#3127), anti-ron (Cat#2654), Anti-EGFR Supplementary Methods Antibodies Anti-Pim-1 (Cat#3247), anti-met (Cat#3127), anti-ron (Cat#2654), Anti-EGFR (Cat#2646), anti-igf1r (Cat#3018), anti-insr (Cat#3020), anti-akt (pan, Cat#4691), anti-phospho-akt

More information

Supporting Information. Copyright Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, 2006

Supporting Information. Copyright Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, 2006 Supporting Information Copyright Wiley-VCH Verlag GmbH & Co. KGaA, 69451 Weinheim, 2006 Copyright Wiley-VCH Verlag GmbH & Co. KGaA, 69451 Weinheim, 2006 Supporting Information for Expanding the Genetic

More information

Supplementary Materials for

Supplementary Materials for www.sciencesignaling.org/cgi/content/full/10/494/eaan6284/dc1 Supplementary Materials for Activation of master virulence regulator PhoP in acidic ph requires the Salmonella-specific protein UgtL Jeongjoon

More information

SAY IT WITH DNA: Protein Synthesis Activity by Larry Flammer

SAY IT WITH DNA: Protein Synthesis Activity by Larry Flammer TEACHER S GUIDE SAY IT WITH DNA: Protein Synthesis Activity by Larry Flammer SYNOPSIS This activity uses the metaphor of decoding a secret message for the Protein Synthesis process. Students teach themselves

More information

MacBlunt PCR Cloning Kit Manual

MacBlunt PCR Cloning Kit Manual MacBlunt PCR Cloning Kit Manual Shipping and Storage MacBlunt PCR Cloning Kits are shipped on dry ice. Each kit contains a box with cloning reagents and an attached bag with Eco-Blue Competent Cells (optional).

More information

Supplementary Figures

Supplementary Figures Supplementary Figures Supplementary Fig. 1 Characterization of GSCs. a. Immunostaining of primary GSC spheres from GSC lines. Nestin (neural progenitor marker, red), TLX (green). Merged images of nestin,

More information

iclicker Question #28B - after lecture Shown below is a diagram of a typical eukaryotic gene which encodes a protein: start codon stop codon 2 3

iclicker Question #28B - after lecture Shown below is a diagram of a typical eukaryotic gene which encodes a protein: start codon stop codon 2 3 Bio 111 Handout for Molecular Biology 4 This handout contains: Today s iclicker Questions Information on Exam 3 Solutions Fall 2008 Exam 3 iclicker Question #28A - before lecture Which of the following

More information

Supplemental Information. Human Senataxin Resolves RNA/DNA Hybrids. Formed at Transcriptional Pause Sites. to Promote Xrn2-Dependent Termination

Supplemental Information. Human Senataxin Resolves RNA/DNA Hybrids. Formed at Transcriptional Pause Sites. to Promote Xrn2-Dependent Termination Supplemental Information Molecular Cell, Volume 42 Human Senataxin Resolves RNA/DNA Hybrids Formed at Transcriptional Pause Sites to Promote Xrn2-Dependent Termination Konstantina Skourti-Stathaki, Nicholas

More information

SUPPORTING INFORMATION

SUPPORTING INFORMATION SUPPORTING INFORMATION Investigation of the Biosynthesis of the Lasso Peptide Chaxapeptin Using an E. coli-based Production System Helena Martin-Gómez, Uwe Linne, Fernando Albericio, Judit Tulla-Puche,*

More information

Y-chromosomal haplogroup typing Using SBE reaction

Y-chromosomal haplogroup typing Using SBE reaction Schematic of multiplex PCR followed by SBE reaction Multiplex PCR Exo SAP purification SBE reaction 5 A 3 ddatp ddgtp 3 T 5 A G 3 T 5 3 5 G C 5 3 3 C 5 ddttp ddctp 5 T 3 T C 3 A 5 3 A 5 5 C 3 3 G 5 3 G

More information

FROM DNA TO GENETIC GENEALOGY Stephen P. Morse

FROM DNA TO GENETIC GENEALOGY Stephen P. Morse 1. GENES, CHROMOSOMES, AND DNA Chromosomes FROM DNA TO GENETIC GENEALOGY Stephen P. Morse (steve@stevemorse.org) Every human cell = 46 chromosomes (1 to 22 in pairs, 2 sex chromosomes) Male: sex chromosomes

More information

INTRODUCTION TO THE MOLECULAR GENETICS OF THE COLOR MUTATIONS IN ROCK POCKET MICE

INTRODUCTION TO THE MOLECULAR GENETICS OF THE COLOR MUTATIONS IN ROCK POCKET MICE The Making of the The Fittest: Making of the Fittest Natural Selection Natural and Adaptation Selection and Adaptation Educator Materials TEACHER MATERIALS INTRODUCTION TO THE MOLECULAR GENETICS OF THE

More information

hcd1tg/hj1tg/ ApoE-/- hcd1tg/hj1tg/ ApoE+/+

hcd1tg/hj1tg/ ApoE-/- hcd1tg/hj1tg/ ApoE+/+ ApoE+/+ ApoE-/- ApoE-/- H&E (1x) Supplementary Figure 1. No obvious pathology is observed in the colon of diseased ApoE-/me. Colon samples were fixed in 1% formalin and laid out in Swiss rolls for paraffin

More information

Protein Synthesis. Application Based Questions

Protein Synthesis. Application Based Questions Protein Synthesis Application Based Questions MRNA Triplet Codons Note: Logic behind the single letter abbreviations can be found at: http://www.biology.arizona.edu/biochemistry/problem_sets/aa/dayhoff.html

More information

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks Rationale of Genetic Studies Some goals of genetic studies include: to identify the genetic causes of phenotypic variation develop genetic tests o benefits to individuals and to society are still uncertain

More information

Introduction to Bioinformatics Dr. Robert Moss

Introduction to Bioinformatics Dr. Robert Moss Introduction to Bioinformatics Dr. Robert Moss Bioinformatics is about searching biological databases, comparing sequences, looking at protein structures, and more generally, asking biological questions

More information

Table S1. Bacterial strains (Related to Results and Experimental Procedures)

Table S1. Bacterial strains (Related to Results and Experimental Procedures) Table S1. Bacterial strains (Related to Results and Experimental Procedures) Strain number Relevant genotype Source or reference 1045 AB1157 Graham Walker (Donnelly and Walker, 1989) 2458 3084 (MG1655)

More information

Supplemental Data. Bennett et al. (2010). Plant Cell /tpc

Supplemental Data. Bennett et al. (2010). Plant Cell /tpc BRN1 ---------MSSSNGGVPPGFRFHPTDEELLHYYLKKKISYEKFEMEVIKEVDLNKIEPWDLQDRCKIGSTPQNEWYFFSHKDRKYPTGS 81 BRN2 --------MGSSSNGGVPPGFRFHPTDEELLHYYLKKKISYQKFEMEVIREVDLNKLEPWDLQERCKIGSTPQNEWYFFSHKDRKYPTGS 82 SMB

More information

1. DNA, RNA structure. 2. DNA replication. 3. Transcription, translation

1. DNA, RNA structure. 2. DNA replication. 3. Transcription, translation 1. DNA, RNA structure 2. DNA replication 3. Transcription, translation DNA and RNA are polymers of nucleotides DNA is a nucleic acid, made of long chains of nucleotides Nucleotide Phosphate group Nitrogenous

More information

Creation of A Caspese-3 Sensing System Using A Combination of Split- GFP and Split-Intein

Creation of A Caspese-3 Sensing System Using A Combination of Split- GFP and Split-Intein Supplementary Information Creation of A Caspese-3 Sensing System Using A Combination of Split- GFP and Split-Intein Seiji Sakamoto,* Mika Terauchi, Anna Hugo, Tanner Kim, Yasuyuki Araki and Takehiko Wada*

More information

Purification: Step 1. Lecture 11 Protein and Peptide Chemistry. Cells: Break them open! Crude Extract

Purification: Step 1. Lecture 11 Protein and Peptide Chemistry. Cells: Break them open! Crude Extract Purification: Step 1 Lecture 11 Protein and Peptide Chemistry Cells: Break them open! Crude Extract Total contents of cell Margaret A. Daugherty Fall 2003 Big Problem: Crude extract is not the natural

More information

Purification: Step 1. Protein and Peptide Chemistry. Lecture 11. Big Problem: Crude extract is not the natural environment. Cells: Break them open!

Purification: Step 1. Protein and Peptide Chemistry. Lecture 11. Big Problem: Crude extract is not the natural environment. Cells: Break them open! Lecture 11 Protein and Peptide Chemistry Margaret A. Daugherty Fall 2003 Purification: Step 1 Cells: Break them open! Crude Extract Total contents of cell Big Problem: Crude extract is not the natural

More information

Overexpression Normal expression Overexpression Normal expression. 26 (21.1%) N (%) P-value a N (%)

Overexpression Normal expression Overexpression Normal expression. 26 (21.1%) N (%) P-value a N (%) SUPPLEMENTARY TABLES Table S1. Alteration of ZNF322A protein expression levels in relation to clinicopathological parameters in 123 Asian and 74 Caucasian lung cancer patients. Asian patients Caucasian

More information

Gene synthesis by circular assembly amplification

Gene synthesis by circular assembly amplification Gene synthesis by circular assembly amplification Duhee Bang & George M Church Supplementary figures and text: Supplementary Figure 1. Dpo4 gene (1.05kb) construction by various methods. Supplementary

More information

Supplementary Information

Supplementary Information Supplementary Information A general solution for opening double-stranded DNA for isothermal amplification Gangyi Chen, Juan Dong, Yi Yuan, Na Li, Xin Huang, Xin Cui* and Zhuo Tang* Supplementary Materials

More information

Glutathione (GSH)-Decorated Magnetic Nanoparticles for Binding Glutathione-S-transferase (GST) Fusion Protein and Manipulating Live Cells

Glutathione (GSH)-Decorated Magnetic Nanoparticles for Binding Glutathione-S-transferase (GST) Fusion Protein and Manipulating Live Cells Glutathione (GSH)-Decorated Magnetic Nanoparticles for Binding Glutathione-S-transferase (GST) Fusion Protein and Manipulating Live Cells Yue Pan, Marcus J. C. Long, Xinming Li, Junfeng Shi, Lizbeth Hedstrom,

More information

Table S1. Sequences of mutagenesis primers used to create altered rdpa- and sdpa genes

Table S1. Sequences of mutagenesis primers used to create altered rdpa- and sdpa genes Supplementary Table and Figures for Structural Basis for the Enantiospecificities of R- and S-Specific Phenoxypropionate/α-Ketoglutarate Dioxygenases by Tina A. Müller, Maria I. Zavodszky, Michael Feig,

More information

Molecular Level of Genetics

Molecular Level of Genetics Molecular Level of Genetics Most of the molecules found in humans and other living organisms fall into one of four categories: 1. carbohydrates (sugars and starches) 2. lipids (fats, oils, and waxes) 3.

More information

Complexity of the Ruminococcus flavefaciens FD-1 cellulosome reflects an expansion of family-related protein-protein interactions

Complexity of the Ruminococcus flavefaciens FD-1 cellulosome reflects an expansion of family-related protein-protein interactions Complexity of the Ruminococcus flavefaciens FD-1 cellulosome reflects an expansion of family-related protein-protein interactions Vered Israeli-Ruimy 1,*, Pedro Bule 2,*, Sadanari Jindou 3, Bareket Dassa

More information

Supplemental Table 1. Primers used for PCR.

Supplemental Table 1. Primers used for PCR. Supplemental Table 1. Primers used for PCR. Gene Type Primer Sequence Genotyping and semi-quantitative RT-PCR F 5 -TTG CCC GAT CAC CAT CTG TA-3 rwa1-1 R 5 -TGT AGC GAT CAA GGC CTG ATC TAA-3 LB 5 -TAG CAT

More information

evaluated with UAS CLB eliciting UAS CIT -N Libraries increase in the

evaluated with UAS CLB eliciting UAS CIT -N Libraries increase in the Supplementary Figures Supplementary Figure 1: Promoter scaffold library assemblies. Many ensembless of libraries were evaluated in this work. As a legend, the box outline color in top half of the figure

More information

Genomic Sequence Analysis using Electron-Ion Interaction

Genomic Sequence Analysis using Electron-Ion Interaction University of Aizu, Graduation Thesis. March, 25 s1985 1 Genomic Sequence Analysis using Electron-Ion Interaction Potential Masumi Kobayashi s1985 Supervised by Hiroshi Toyoizumi Abstract This paper proposes

More information

Case 7 A Storage Protein From Seeds of Brassica nigra is a Serine Protease Inhibitor

Case 7 A Storage Protein From Seeds of Brassica nigra is a Serine Protease Inhibitor Case 7 A Storage Protein From Seeds of Brassica nigra is a Serine Protease Inhibitor Focus concept Purification of a novel seed storage protein allows sequence analysis and determination of the protein

More information

Assignment 13. In the Griffith experiment, why did mice die when injected with live R bacteria plus heatkilled

Assignment 13. In the Griffith experiment, why did mice die when injected with live R bacteria plus heatkilled Assignment 13 1. Multiple-choice (1 point) In the Griffith experiment, why did mice die when injected with live R bacteria plus heatkilled S bacteria? Some of the S bacteria were still alive. The R bacteria

More information

Promoter 1. Promoter 2. Enhancer 2

Promoter 1. Promoter 2. Enhancer 2 Essays 1. An animal normally has two copies of the Xan gene. The protein made by this gene helps the animal clear petroleum-based pollutants from its body. Tine Xan What we know about the Xan gene is shown

More information

Legends for supplementary figures 1-3

Legends for supplementary figures 1-3 High throughput resistance profiling of Plasmodium falciparum infections based on custom dual indexing and Illumina next generation sequencing-technology Sidsel Nag 1,2 *, Marlene D. Dalgaard 3, Poul-Erik

More information

Degenerate Code. Translation. trna. The Code is Degenerate trna / Proofreading Ribosomes Translation Mechanism

Degenerate Code. Translation. trna. The Code is Degenerate trna / Proofreading Ribosomes Translation Mechanism Translation The Code is Degenerate trna / Proofreading Ribosomes Translation Mechanism Degenerate Code There are 64 possible codon triplets There are 20 naturally-encoding amino acids Several codons specify

More information

DNA sentences. How are proteins coded for by DNA? Materials. Teacher instructions. Student instructions. Reflection

DNA sentences. How are proteins coded for by DNA? Materials. Teacher instructions. Student instructions. Reflection DNA sentences How are proteins coded for by DNA? Deoxyribonucleic acid (DNA) is the molecule of life. DNA is one of the most recognizable nucleic acids, a double-stranded helix. The process by which DNA

More information

Supporting Information

Supporting Information upporting Information hiota et al..73/pnas.159218 I Materials and Methods Yeast trains. Yeast strains used in this study are described in Table 1. TOM22FLAG, a yeast haploid strain for expression of C-terminally

More information

Electronic Supplementary Information

Electronic Supplementary Information Electronic Supplementary Material (ESI) for ChemComm. This journal is The Royal Society of Chemistry 2014 Electronic Supplementary Information Multiplexed Detection of Lung Cancer Cells at the Single-Molecule

More information

Genes and Proteins. Objectives

Genes and Proteins. Objectives Genes and Proteins Lecture 15 Objectives At the end of this series of lectures, you should be able to: Define terms. Explain the central dogma of molecular biology. Describe the locations, reactants, and

More information

Genomics and Gene Recognition Genes and Blue Genes

Genomics and Gene Recognition Genes and Blue Genes Genomics and Gene Recognition Genes and Blue Genes November 1, 2004 Prokaryotic Gene Structure prokaryotes are simplest free-living organisms studying prokaryotes can give us a sense what is the minimum

More information

II 0.95 DM2 (RPP1) DM3 (At3g61540) b

II 0.95 DM2 (RPP1) DM3 (At3g61540) b Table S2. F 2 Segregation Ratios at 16 C, Related to Figure 2 Cross n c Phenotype Model e 2 Locus A Locus B Normal F 1 -like Enhanced d Uk-1/Uk-3 149 64 36 49 DM2 (RPP1) DM1 (SSI4) a Bla-1/Hh-0 F 3 111

More information

Additional Table A1. Accession numbers of resource records for all rhodopsin sequences downloaded from NCBI. Species common name

Additional Table A1. Accession numbers of resource records for all rhodopsin sequences downloaded from NCBI. Species common name 1 2 3 Additional Table A1. Accession numbers of resource records for all rhodopsin sequences downloaded from NCBI. Species common name Scientific name Accession number Accession number (introns) Codons

More information

An engineered tryptophan zipper-type peptide as a molecular recognition scaffold

An engineered tryptophan zipper-type peptide as a molecular recognition scaffold SUPPLEMENTARY MATERIAL An engineered tryptophan zipper-type peptide as a molecular recognition scaffold Zihao Cheng and Robert E. Campbell* Supplementary Methods Library construction for FRET-based screening

More information

Luo et al. Supplemental Figures and Materials and Methods

Luo et al. Supplemental Figures and Materials and Methods Luo et al. Supplemental Figures and Materials and Methods The supplemental figures demonstrate that nuclear NFAT is situated at PODs, overexpressed PML does not increase NFAT nuclear localization, and

More information

Case 7 A Storage Protein From Seeds of Brassica nigra is a Serine Protease Inhibitor Last modified 29 September 2005

Case 7 A Storage Protein From Seeds of Brassica nigra is a Serine Protease Inhibitor Last modified 29 September 2005 Case 7 A Storage Protein From Seeds of Brassica nigra is a Serine Protease Inhibitor Last modified 9 September 005 Focus concept Purification of a novel seed storage protein allows sequence analysis and

More information

SUPPORTING INFORMATION FILE

SUPPORTING INFORMATION FILE Intrinsic and extrinsic connections of Tet3 dioxygenase with CXXC zinc finger modules Nan Liu, Mengxi Wang, Wen Deng, Christine S. Schmidt, Weihua Qin, Heinrich Leonhardt and Fabio Spada Department of

More information

Supporting Information. Trifluoroacetophenone-Linked Nucleotides and DNA for Studying of DNA-protein Interactions by 19 F NMR Spectroscopy

Supporting Information. Trifluoroacetophenone-Linked Nucleotides and DNA for Studying of DNA-protein Interactions by 19 F NMR Spectroscopy Supporting Information Trifluoroacetophenone-Linked Nucleotides and DNA for Studying of DNA-protein Interactions by 19 F NMR Spectroscopy Agata Olszewska, Radek Pohl and Michal Hocek # * Institute of Organic

More information