Characterization and Derivation of the Gene Coding for Mitochondrial Carbamyl Phosphate Synthetase I of Rat*

THE JOURNAL OF BIOLOGICAL CHEMISTRY
Vol. 260, No. 16, Issue of August 5
By The American Society of Biological Chemists, Inc. Printed in U.S.A.

(Received for publication, February 28, 1985)

Hiroshi Nyunoya, Karen E. Broglie, Esther E. Widgren, and C. J. Lusty

From the Molecular Genetics Laboratory, The Public Health Research Institute of the City of New York, Inc., New York, New York

The nucleotide sequence of rat carbamyl phosphate synthetase mRNA has been determined from the complementary DNA. The mRNA comprises minimally 5,645 nucleotides and codes for a polypeptide of 164,564 Da corresponding to the precursor form of the rat liver enzyme. The primary sequence of mature rat carbamyl phosphate synthetase indicates that the precursor is cleaved at one of two leucines at residues 38 or 39. The derived amino acid sequence of carbamyl phosphate synthetase is homologous to the sequences of carbamyl phosphate synthetase of Escherichia coli and yeast. The sequence homology extends along the entire length of the rat polypeptide and encompasses the entire sequences of both the small and large subunits of the E. coli and yeast enzymes. The protein sequence data provide strong evidence that the carbamyl phosphate synthetase gene of rat, the carAB gene of E. coli, and the CPA1 and CPA2 genes of yeast were derived from common ancestral genes. Part of the rat carbamyl phosphate synthetase gene has been characterized with two nonoverlapping phage clones spanning 28.7 kilobases of rat chromosomal DNA. This region contains 13 exons ranging in size from 68 to 195 base pairs and encodes the 453 carboxyl-terminal amino acids of the rat protein. Southern hybridization analysis of rat genomic DNA indicates the carbamyl phosphate synthetase gene to be present in single copy.

* These studies were supported by Grant GM from the National Institutes of Health.
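The figures quoted in the abstract and in the sequence analysis can be cross-checked with simple arithmetic. The following is a minimal sketch in Python and is not part of the original work. It uses only numbers stated in the text (a 4,500-nucleotide reading frame, a calculated mass of 164,564 Da, and 13 exons encoding the 453 carboxyl-terminal residues). All names are illustrative, and the per-exon estimate assumes, for simplicity, that those 13 exons carry coding sequence only.

```python
# Values quoted in the text; the names are illustrative only.
READING_FRAME_NT = 4500        # continuous reading frame, nucleotides
PRECURSOR_MASS_DA = 164_564    # calculated mass of the precursor, Da
EXON_COUNT = 13                # exons in the characterized genomic region
EXON_ENCODED_RESIDUES = 453    # carboxyl-terminal residues they encode


def residues_from_reading_frame(nucleotides: int) -> int:
    """Number of codons (and hence residues) in an in-frame coding region."""
    if nucleotides % 3 != 0:
        raise ValueError("a reading frame must be a multiple of three nucleotides")
    return nucleotides // 3


residues = residues_from_reading_frame(READING_FRAME_NT)
mean_residue_mass = PRECURSOR_MASS_DA / residues

# Simplifying assumption: the 13 exons contain coding sequence only.
coding_bp_in_exons = EXON_ENCODED_RESIDUES * 3
mean_coding_bp_per_exon = coding_bp_in_exons / EXON_COUNT

print(residues)                           # 1500
print(round(mean_residue_mass, 1))        # 109.7
print(coding_bp_in_exons)                 # 1359
print(round(mean_coding_bp_per_exon, 1))  # 104.5
```

The mean of roughly 105 coding base pairs per exon falls inside the reported exon size range of 68 to 195 base pairs, which is all this check is meant to show.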
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The abbreviations used are: kb, kilobase pairs; bp, base pairs; RF, replicative form; SDS, sodium dodecyl sulfate; PTH, phenylthiohydantoin; dansyl, 5-dimethylaminonaphthalene-1-sulfonyl.

Carbamyl phosphate synthetase catalyzes the synthesis of carbamyl phosphate from NH3, HCO3-, and 2 ATP and requires N-acetylglutamate as an allosteric activator (1). NH3-dependent carbamyl phosphate synthetase is found only in ureotelic animals, primarily in the liver and small intestine (2), where the enzyme plays an important role in removing excess NH3 from the cell. The ability of carbamyl phosphate synthetase to function efficiently at low concentrations of NH3 is recognized to be an important step in the evolution of ureotelic metabolism (3, 4).

In recent studies, we have described the isolation of cDNA clones complementary to rat carbamyl phosphate synthetase mRNA (5). Partial DNA sequence analysis of the cDNA has shown the gene encoding carbamyl phosphate synthetase to be evolutionarily related to the genes encoding the glutamine-dependent carbamyl phosphate synthetases of Escherichia coli (6, 7) and yeast (8, 9). We have proposed that the gene encoding the NH3-dependent carbamyl phosphate synthetase evolved from the genes coding for the glutamine-dependent carbamyl phosphate synthetases by gene fusion and subsequent mutation of the glutamine binding and catalytic domain (5). In this communication, we present the entire nucleotide sequence of carbamyl phosphate synthetase mRNA. The sequence of the mRNA confirms our previous conclusion that the mitochondrial enzyme is a hybrid polypeptide encoded by a gene derived from two separate genes: one gene coding for a subunit that catalyzes glutamine amide transfer (glutamine subunit) and the second coding for a synthetase subunit. Analysis of the rat gene shows it to occur in single copy and to be composed of multiple exons separated by long intervening sequences.

MATERIALS AND METHODS

DNA and Enzymes-Enzymes were purchased from P-L Biochemicals, New England Biolabs, and Amersham Corp. Plasmid DNA was isolated by the method of Birnboim and Doly (10) and purified by chromatography on Sepharose 6B.

Carbamyl Phosphate Synthetase cDNA Clones-The construction of rat (Wistar strain) liver cDNA libraries and the isolation of a set of recombinant plasmids (pKB4, pKB21, pHN107, and pHN234) carrying cDNA inserts complementary to rat liver carbamyl phosphate synthetase mRNA have been described previously (5, 11). The cDNA insert (2.1 kb) in the plasmid pKB4 was confirmed by hybrid-selected in vitro translation to direct the synthesis of a 165-kDa polypeptide corresponding to the precursor form of rat liver carbamyl phosphate synthetase (11). A near full-length cDNA insert (5.3 kb) was carried in plasmid pHN234. Based on the size of carbamyl phosphate synthetase mRNA, the cDNA insert was only slightly shorter than the expected length of the message. To select for clones containing the 5' terminal sequences, we screened our Okayama-Berg rat liver cDNA library (1.8 X 10 transformants) by colony hybridization using as a probe an 800-bp PstI-XhoI fragment from the 5' end of pHN234. This screen yielded four new clones, pHN291, pHN292, pHN293, and pHN295, with identical cDNA inserts of 1.5 kb. Restriction analysis of the four new clones showed that the cDNA inserts overlapped by 600 bp the 5' region of the cDNA insert in pHN234. The four clones, however, contained 500 bp of a sequence not homologous to carbamyl phosphate synthetase mRNA. A complete screen of the cDNA library (6 X 10^7 transformants) with a 175-bp RsaI-KpnI fragment from the 5' proximal sequence of the cDNA insert of pHN234 yielded another 68 confirmed positive clones. Of 28 clones analyzed by restriction mapping, four contained cDNA inserts identical to the insert in pHN234, and 24 contained cDNA inserts identical to the inserts of pHN291, 292, 293, and 295.

DNA Sequence Analysis-DNA sequence analysis was performed by the method of Maxam and Gilbert (12). Restriction endonuclease fragments were isolated from agarose gels, 5' end labeled with [γ-32P]ATP (6000 Ci/mmol, Amersham Corp.) and polynucleotide kinase,

and single strands were separated by electrophoresis on polyacrylamide gels. The nucleotide sequences of both isolated single strands were determined.

Northern Blot Hybridization-The size of carbamyl phosphate synthetase mRNA was estimated by Northern analysis of rat liver poly(A+) RNA after denaturation in 2.2 M formaldehyde and separation on agarose gels containing 1.1 M formaldehyde (13). RNA was transferred to nitrocellulose and hybridized as described by Thomas (14) with a radioactive probe prepared by nick translation. The gels were calibrated with a mixture of E. coli and yeast rRNAs as described (8), φX174 RF DNA (5386 bp), and mp13 RF DNA (~7200 bp).

Amino Acid Composition, NH2- and Carboxyl-terminal Analyses-The amino acid composition, NH2-terminal sequence, and carboxyl-terminal amino acids of mature carbamyl phosphate synthetase were determined on the rat liver enzyme judged by SDS-polyacrylamide gel electrophoresis to be 98% pure. Lyophilized salt-free samples (154 μg) of the protein were hydrolyzed with 6 N HCl (containing a crystal of phenol) in evacuated sealed tubes for 22 h at 110 °C. The amino acids were analyzed on an automatic amino acid analyzer (Beckman, model 119CL) which had been calibrated with 10 nmol of a standard mixture. The average of three determinations was used in the calculations.

The NH2 terminus of the enzyme was identified according to the method of Gros and Labouesse (15). Dansyl-amino acids released after acid hydrolysis of the dansylated protein for 4 and 18 h at 110 °C were identified on polyamide thin layers (16). The NH2-terminal sequence of the protein was determined by automated Edman degradation with a Beckman Sequencer, model 890C, using the 0.1 M Quadrol buffer system. Carbamyl phosphate synthetase was desalted on Sephadex G-25 equilibrated with 0.06 M triethylamine/trifluoroacetic acid, pH 8.9, and a sample (4 mg) was used for sequence analysis.
Thiazolinones released at each cycle were converted to phenylthiohydantoins and identified by high performance liquid chromatography. The carboxyl-terminal sequence of the enzyme was determined by digestion of the protein with phenylmethylsulfonyl fluoride-treated carboxypeptidase A according to the procedures described by Ambler (17). Amino acids released by carboxypeptidase A digestion were identified by amino acid analysis and corrected for autodigestion with two controls: one control contained all components except carbamyl phosphate synthetase; the second contained all components except carboxypeptidase A.

Genomic Clones of Carbamyl Phosphate Synthetase-Genomic clones of rat carbamyl phosphate synthetase were isolated from a recombinant phage library of rat chromosomal DNA generously provided to us by Drs. James Bonner and Thomas D. Sargent (California Institute of Technology). The genomic library consists of EcoRI fragments of rat (Sprague-Dawley strain) nuclear DNA cloned in the λ phage Charon 4A (18). Bacteriophage were grown in E. coli DP50supF in NZYCDT medium, and recombinant phage (650,000 plaques) were screened by plaque hybridization (19) using as a probe a 770-bp PstI fragment from a cDNA clone (pcpsrl) kindly provided by Dr. William E. O'Brien (Baylor College of Medicine). After plaque purification, recombinant bacteriophage DNA from E. coli DP50supF was isolated as described by Blattner et al. (20) and Maniatis et al. (21). EcoRI and HindIII fragments of the rat nuclear DNA inserts were subsequently subcloned in pUC8.

Hybridization Analysis of Genomic DNA-Rat chromosomal DNA was isolated from livers of Wistar strain rats according to the procedure of Blin and Stafford (22). Genomic DNA (20 μg) was digested to completion with appropriate restriction endonucleases, separated on 0.75% agarose gels, and transferred to nitrocellulose as described by Southern (23).
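The restriction digests used here for subcloning and Southern analysis can be illustrated with a short sketch. The Python below is not from the original paper; it assumes a linear DNA molecule, uses the standard recognition sequences and top-strand cut offsets for EcoRI (G^AATTC), HindIII (A^AGCTT), and PstI (CTGCA^G), and runs on a made-up example sequence. The function names are hypothetical.

```python
# Recognition site and top-strand cut offset (bases from the start of the site).
RECOGNITION = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "PstI": ("CTGCAG", 5),     # CTGCA^G
}


def cut_positions(seq: str, enzyme: str) -> list[int]:
    """0-based positions at which the top strand of `seq` is cut by `enzyme`."""
    site, offset = RECOGNITION[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)  # continue searching past this hit
    return cuts


def fragment_lengths(seq: str, enzyme: str) -> list[int]:
    """Fragment lengths (top strand) after complete digestion of linear DNA."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 24-nucleotide sequence containing two EcoRI sites.
example = "AAAAGAATTCTTTTTTGAATTCCC"
print(cut_positions(example, "EcoRI"))     # [5, 17]
print(fragment_lengths(example, "EcoRI"))  # [5, 12, 7]
```

On a Southern blot only the fragments that overlap the probe are detected, so a single-copy gene is expected to give a small, fixed set of bands for each enzyme.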
Hybridization with radioactive probes was carried out in 5 X SSC, 5 X Denhardt's, 100 μg/ml salmon sperm DNA, 50 mM sodium phosphate, pH 8, 10 mM EDTA, 0.1% SDS, 50% formamide, and 10% dextran sulfate overnight at 42 °C. The blots were washed two times for 10 min at room temperature with 2 X SSC, 0.1% SDS, followed by two washes at 65 °C in 0.1 X SSC, 0.1% SDS. Blots were exposed to Kodak XAR-5 film with an intensifying screen.

FIG. 1. Restriction maps of carbamyl phosphate synthetase cDNA clones. The cDNA inserts shown by the dark lines were carried either in pBR322 (pKB4, pKB21) or in the pBR322-pSV40 vector of Okayama and Berg (24) (pHN107, pHN234, pHN291, and pHN292). Three clones (pKB4, pHN291, and pHN292) contain sequences unrelated to carbamyl phosphate synthetase which are indicated by the dashed lines. A_n denotes the number of adenines in the poly(A) tracts as determined by sequence analysis.

FIG. 2. Strategy for determining the cDNA sequence. Appropriate restriction fragments were 5' end labeled, and the single strands were isolated on polyacrylamide gels. Arrows above the restriction sites indicate the direction and the lengths of the sequences obtained. The location of the coding region of the cDNA is indicated by the open bar above the restriction map.

RESULTS

Carbamyl Phosphate Synthetase cDNA Clones-Recombinant plasmids carrying cDNA inserts complementary to the entire sequence of rat liver carbamyl phosphate synthetase mRNA were isolated from two different cDNA libraries (5). The cDNA inserts of representative plasmids isolated from the two libraries are presented in Fig. 1. pHN234 had the longest cDNA insert, starting with a poly(A) tract and including 4215 nucleotides of the coding sequence. This clone, however, lacked the 5' nontranslated leader and part of the NH2-terminal coding region. The 5' region missing from pHN234 was isolated on plasmid pHN291. The latter is one of four recombinant plasmids, all of which had identical cDNA inserts overlapping with the 5' proximal region of pHN234 but extending an additional 400 nucleotides further upstream. This upstream region included the entire NH2-terminal coding region of the mRNA as well as 139 nucleotides of 5' nontranslated leader. pHN291 as well as the other three

identical plasmids contained another sequence unrelated to carbamyl phosphate synthetase. This sequence, consisting of 528 nucleotides, was found to have a reading frame encoding the sequence of low molecular weight kininogen (data not shown). The mechanism by which the hybrid cDNA was formed during reverse transcription is not clear at present.

FIG. 3 (three panels). cDNA sequence of carbamyl phosphate synthetase mRNA. The sequence is that of the sense strand. The coding sequence of the mRNA begins at nucleotide +1. The derived amino acid sequence is shown above the cDNA sequence. An inverted repeat centered over an upstream ATG codon in the 5' leader is underlined. The arrows denote the experimentally determined NH2 termini of mature carbamyl phosphate synthetase. Downstream of the termination codon, the sequence AATAAA signaling polyadenylation is underlined. pA indicates the poly(A) addition site. A_n indicates the poly(A) tract, which varied in length in the different clones.

Nucleotide Sequence of Carbamyl Phosphate Synthetase mRNA-The nucleotide sequence of carbamyl phosphate synthetase mRNA was derived from the cDNA inserts of the five recombinant clones, pKB4, pKB21, pHN107, pHN234, and pHN291, according to the strategy outlined in Fig. 2. The entire nucleotide sequence was confirmed from the complementary strands. All of the labeled sites were crossed with a second set of overlapping fragments. The cDNA sequence corresponding to carbamyl phosphate synthetase mRNA is presented in Fig. 3. The sequence (not including the poly(A) tract) is 5545 nucleotides in length. The mRNA, minimally, has a 5' nontranslated leader of 139 nucleotides, a continuous reading frame of 4500 nucleotides, and a 3' nontranslated region of 905 nucleotides. The hexanucleotide AATAAA located at nucleotide 5387 through 5392 conforms to a signal for polyadenylation (25). Fourteen nucleotides downstream at nucleotide 5406 is the poly(A) addition site. A poly(A) tract was found in the cDNA sequence of the cDNA insert from pKB4; a tract of 41 adenines was sequenced from the cDNA insert of pHN234.

The coding sequence begins with the ATG at nucleotide +1 and ends with a termination codon (TAG) at nucleotide +4,501, followed at nucleotide +4,519 with a second termination signal (TAGTGA). The 4,500-nucleotide-long reading frame initiated with the ATG at nucleotide +1 codes for a polypeptide of 1,500 amino acids with a calculated molecular weight of 164,564. This value is in excellent agreement with the molecular weight (165,000) of the precursor form of rat liver carbamyl phosphate synthetase (26). The ATG at nucleotide +1 is preceded by a purine (A) at -3 and a C at -4, as found in translational start sites of most eukaryotic genes (27, 28). In contrast, an upstream ATG located at nucleotide -44 in the 5' leader sequence is preceded by a pyrimidine (C) at -3 and 12 bp downstream is followed by a termination codon. Both features are common to other eukaryotic genes that contain ATG codons upstream of the

6 Carbamyl Phosphate Synthetase Rat Gene TABLE Leu Leu Leu Leu CCU Pro 22 CAU CCC Pro 20 CAC CCA P r o 20 CAA CCG Pro 6 CAG AlJl AJC AlJA AUG le le le Met ACU ACC ACA ACG T h r 19 AAU Asn 34 T h r 29 AAC Asn 41 T h r 3 5 AAA Lys 28 T h r 5 AAG Lys '71 AGU AGC AGA AGG Ser Ser Arg Arg GN1 Val Val Val Val GCU GCC GCA GCG Ala 41 GAU Asp 41 Ala 40 GAC Asp 32 Ala 29 GAA Glu 44 Ala 8 GAG Glu 5 2 GGU GGC GGA (XG Gly Gly Gly Gly GUC GLA His 14 CGU Arg 6 His 14 CGC Arg 10 Gln 6 CGA Arg 9 Gln 46 CGG Arg '7 ClJlJ ClJC CLJA CJG GLJG ***, termination codon. 13 Senuencr 1.?.P, " 1 Cvclr ('odon usage' in thr carbnmylphosphate synthrtasr mrna The initiation codon is included in the tabulation. lkju P h e 33 UCU Ser 24 UAU T y r 13 UGU Cys 14 ULC P h e 28 UCC Ser 29 UAC T y r 23 UGC Cys i 0 l1lja Leu 9 UCA Ser 1'7 UAA ***' 0 UGA *** JG Leu 24 UCG Ser 6 UAG *** 1 UGG T r p 15 : 7? 4 5 f. 7 P 9 Ala Gln Tht' P-la Vi :'...; Ala i LPU,..,.., s e n n c? Ser n.wi : 9351 Val Lyr - ;,/.,?,$,;,. \la1 Lvs Ala?,lo..-., s,' '501. " ". ;.'.-.'. :.,.:. 7 q,.: Thr - ;,x.' i..:.. : " " FG. 5. NHz-terminal sequenceof mature ratliver carbamyl phosphate synthetase. Automated Edman degradation from the NH, terminus ofthe purifiedprotein ( nmol) wasperformed in a Reckman Sequencer, model 89OC. 'TH derivatives were identified by high performance liquid chromatography, and the peak areas to a werequantitated by automaticintegrationandcomparison mixture of known standards. Theyield (nmol) ofthe1'th derivatives released in each cycle is indicated in the linvs hrlorc, cwchseyucwcc,. PTH-serine and PTH-threonine were qualitatively identified. The yieldof PTH-serine was low (3.9 and 1.6 nmolwererecoveredin cycles 1 and 2, respectively). T h e yield of glutamine was derived from the sum of PTH-glutamine and PTH-glutamic acid. TARX 1, Amino acid compositionof maturr carbamyl phosphatf. svnthrtasr Values are given as molof residue/mol of enzyme. 
83 (goy Lysine 28Histidine Arginine Aspartic acid Asparagine Threonine 102 Serine Glutamic acid Glutamine Proline Glycine Alanine "ys Valine Methionine 95soleucine Leucine Tyrosine Phenylalanine Tryptophan kb acid amino ' 103d ND' ND Total residues FG. 4. Size of carbamyl phosphate synthetase transcript. Rat liver poly(a') RNA (10 p g ) was denatured in2.2 M formaldehyde, separated on 1"; agarose gel under denaturing conditions. and transferred t o nitrocellulose. The RNA was hybridized with a nick-translated cdna probe internal to the coding region. Predicted from DNA sequence Amino acid analysis" Amino acid 1462 "The values (nearest integer) are averagesof 3 determinations of a 22-h hydrolysate. Value obtained from different hydrolysate (22 hl. ' Corrected for 5 % decomposition. Corrected for 10% decomposition. ND, not determined. ' CodonUsage-Codon usage inthe carbamyl phosphate major translational initiation site (27, 28). n the carbamyl synthetase mrnais summarized in Table. Allof the phosphate synthetaseleader, the upstreamatg a t nucleotide possible 61 codons are utilized in the sequence. Some codons are used less frequently, particularly those codons with CG -44 is centered within an imperfect inverted repeat (underlined in Fig. 3). A similar potential stem-loop structure is also and UA pairs. GUA (Val), for example, is used only in four of found around an upstream ATG codon in the 5' leader se- 123 valine codons, and UCG (Ser) is used in six of 105 serine quence of the yeast gene CPAl, which codes for the small codons. The preferredcodons aregug(val), AAG(Lys), subunit of carbamyl phosphate synthetase(8). CAG(G1n) and CUG(Leu). 
The nucleotide sequence of the carbamyl phosphate synthetase mRNA was consistent among all of the cDNA inserts of the recombinant clones, with the exception of two differences in the cDNA insert of pkb21. Both differences, C→T at position 3603 and G→A at position 3669, occur in the third position of valine codons and most probably are due to a polymorphism.

Northern Analysis-The size of carbamyl phosphate synthetase mRNA was estimated by Northern blot hybridization of rat liver poly(A+) RNA with a nick-translated probe from the coding region of the cDNA. As shown in Fig. 4, radioautographs revealed a major radioactive band whose length was estimated to be 5.7 ± 0.3 kb. The size of the transcript is consistent with sequence analysis of the corresponding cDNA. Assuming a poly(A) tract of 100 adenines and a 5' leader of 139 nucleotides, the minimal full-length transcript is calculated to be 5645 nucleotides.

FIG. 6. Homology of the derived amino acid sequences of rat carbamyl phosphate synthetase and the small and large subunits of E. coli carbamyl phosphate synthetase (CPS). The sequences of the rat carbamyl phosphate synthetase precursor (top line) and of E. coli carbamyl phosphate synthetase (bottom line) were manually aligned for maximal amino acid identities, indicated by the vertical lines. Gaps in the sequence represent deletions or insertions. The arrows denote the NH2 termini of the mature rat enzyme. The COOH terminus of the small subunit and the NH2 terminus of the large subunit of the E. coli enzyme are also shown in the figure. The reactive cysteine (Cys 269) in the glutamine-active site of the E. coli small subunit is marked with a large dot.

FIG. 7. Dot matrix analyses of the homology of the amino acid sequences of carbamyl phosphate synthetases of rat, yeast, and E. coli. A, matrix of the rat carbamyl phosphate synthetase sequence versus the sequences of the glutamine and synthetase subunits of E. coli carbamyl phosphate synthetase. B, matrix of the rat carbamyl phosphate synthetase sequence versus the sequences of the small and large subunits of yeast carbamyl phosphate synthetase. C, matrix plot of the internal homology in the synthetase component of rat carbamyl phosphate synthetase I. The main diagonal represents the homology between the sequences of the two proteins. The shorter diagonals above and below the main line represent the reciprocal internal homologies in the synthetase components of the three enzymes. E. coli CPS, E. coli carbamyl phosphate synthetase (6, 7); yeast CPS, yeast carbamyl phosphate synthetase (8, 9); rat CPS, rat carbamyl phosphate synthetase (this work).

FIG. 8. Alignment of previously proposed nucleotide-binding domain in NH2- and carboxyl-terminal halves of carbamyl phosphate synthetases (CPS) of E. coli, yeast, and rat. The sequences shown in the upper three lines of the figure are located in the NH2-half of the various synthetase subunits; the sequences in the lower three lines of the figure are located in the COOH-half of the synthetases. The alignment of the E. coli and yeast sequences is the same as proposed in Ref. 9. Identical and functionally conserved residues in each domain are connected by the vertical lines. Identical and conserved residues between the two domains are marked by the dots at the bottom of the figure.

FIG. 9. Map of part of the carbamyl phosphate synthetase gene of rat. A region of the carbamyl phosphate synthetase gene encoding nucleotides 3142 through 4528 of the mRNA is represented by the heavy dark line. The 13 exons are shown as the vertical bars, which are drawn to scale. Exons 9 and 10 are located within the spans indicated by the parentheses. Exon 13 extends through the EcoRI site at the 3' end of the cloned nuclear DNA. The arrow at the top of the figure denotes the direction of transcription of the gene. The open bars below the restriction maps represent the rat nuclear DNA inserts isolated in the recombinant phage clones λ10,cps and λ1,cps.

Identification of Mature Carbamyl Phosphate Synthetase-Rat liver carbamyl phosphate synthetase has previously been shown to be synthesized as a precursor 5 kDa larger than the mature enzyme (26). The NH2-terminal start of the mature protein was determined in the present study by analysis of the amino-terminal residue and also by a partial sequence of the NH2-terminal region of the mature protein. Dansylation and hydrolysis of the protein yielded dansyl-leucine and dansyl-serine, although serine appeared to be a minor component. A leucine NH2 terminus has also been reported by Clarke (29) for rat liver carbamyl phosphate synthetase prepared by a different procedure. Two unambiguous NH2-terminal sequences displaced by a single amino acid were obtained by automated Edman degradation (Fig. 5). The first sequence started at Leu 39 and the second at Ser 40. Both sequences matched the sequence encoded by the mRNA for the 10 steps analyzed (compare Figs. 3 and 5). The average yield calculated from the stable amino acid derivatives released in cycles 3-5 indicated an approximately 1:1 ratio of the two NH2-terminal sequences. While these results suggest two adjacent processing sites involving cleavage on the carbonyl side of Leu 38 and/or Leu 39, the possibility of proteolytic removal of the NH2-terminal leucine during the isolation of the enzyme cannot be excluded.

Digestion of purified carbamyl phosphate synthetase with carboxypeptidase A released 1.6 mol of alanine/mol of protein and mol/mol of (tyrosine, lysine, glycine, and serine). Since lysine is known to be a poor substrate for carboxypeptidase A, these data indicate the COOH-terminal sequence to start with two alanines followed by lysine, followed by glycine, tyrosine, and serine, the order of the latter three being indeterminate. These data are consistent with the sequence of carbamyl phosphate synthetase encoded in the cloned mRNA (Fig. 3).

The correct identification of the coding sequence is also supported by the experimentally determined amino acid composition and the molecular weight of mature carbamyl phosphate synthetase. The amino acid composition of the purified rat enzyme is in excellent agreement with the composition of the mature protein predicted from the mRNA sequence (Table II). Assuming that the mature protein starts at Leu 39 of the cDNA sequence, its overall length of 1462 amino acid residues corresponds to a molecular weight of 160,304. This agrees well with the molecular weight of 158,700 previously determined by physical methods (30).

Homology of Carbamyl Phosphate Synthetase and Glutamine-dependent Carbamyl Phosphate Synthetases of Yeast and E. coli-The derived amino acid sequence of carbamyl phosphate synthetase is homologous to the amino acid sequences of both yeast and E. coli carbamyl phosphate synthetases. The amino acid sequences of the rat and E. coli proteins are shown in Fig. 6. The sequence homology between

the rat and bacterial enzymes extends along the entire length of the rat polypeptide and encompasses the entire amino acid sequence of both the small and the large subunits of the E. coli enzyme. The alignment of the two sequences shown in Fig. 6 required an average of 1.8 deletions or insertions/100 amino acid residues. Of 1389 possible matches, 582 (41.9%) of the amino acid residues are identical. A similar alignment of the rat and yeast amino acid sequences exhibits 45% identical amino acids and required fewer deletions or insertions.

FIG. 10. Location of exons in cDNA sequence and nucleotide sequence of intron-exon boundaries. The exons are numbered 1-13 in the direction of transcription of the gene. The nucleotide sequences of the exons and the intron-exon boundaries were determined by sequence analysis of restriction fragments of the rat genomic DNA inserts isolated in the phage clones λ10,cps and λ1,cps. The sequences of exons 9 and 10 were not determined. The sequence of exon 13 extends through the EcoRI site at the 3' end of the cloned DNA.

Exon  Intron (3' end)     Exon start ... exon end           Encoded residues                 Intron (5' end)
1     TGGCTCGTTTCTTCCAG   GCATGTAATGGC ... GTTAACACTTTG     AlaCysAsnGly ... ValAsnThrLeu    GTAAGGAGAGCAACACG
2     TTCCCCCTCTTAATTAG   AACGAGGCGCTG ... TATGTTTTGAG      AsnGluAlaLeu ... TyrValLeuSe     GTAATATATTGTTTTCC
3     CCCTTCACCCTTTCCAG   TGGGTCTGCCATG ... CGAGTCTCTCAG    rGlySerAlaMet ... ArgValSerGln   GTAGTGTCCCATTTTCT
4     GTTTCTCTTGTTGGCAG   GAACACCCAGTG ... AAAGAAGGACGG     GluHisProVal ... LysGluGlyArg    GTATGTGTTAGTGCTTT
5     AACTCTTCTTTTGACAG   GTCATCTCCCAT ... GCCATTGAAAAG     ValIleSerHis ... AlaIleGluLys    GTCATCATTTAGAAACG
6     ATTGTCCTTTTCTATAG   GTGAAGGATGCC ... AATGATGTCTTG     ValLysAspAla ... AsnAspValLeu    GTAAGAAATATTAATG
7     TGAACTTATCTCCTTAG   GTGATTGAGTGC ... GTTGCCATTAAG     ValIleGluCys ... ValAlaIleLys    GTAATATTTTGCAATGT
8     CCTGTTGCGTCTGACAG   GCTCCCATGTTT ... TCTACTGGAGAG     AlaProMetPhe ... SerThrGlyGlu    GTAAATAGTTAATGATC
11    TTCTATTTTAAATGCAG   CTTTTTGCCACA ... TCCATCAGAAA      LeuPheAlaThr ... SerIleArgLy     GTAAGAACCGAATAGCC
12    TTATTTTTTTCTTTTAG   GTTGATAAGAGAC ... ACCAATTTCCAG    sLeuIleArgAsp ... ThrAsnPheGln   GTGTGTTTCCTCTTTTA
13    TTGACATTTTCTTTCAG   GTGACCAAACTT ...                  ValThrLysLeu ...

FIG. 11. Proposed evolutionary derivation of mammalian carbamyl phosphate synthetase. The dark bar represents the length of the sequences derived from the gene of the ancestral glutamine subunit (8). The stippled bar indicates sequences derived from an unidentified gene (8). The open bar represents sequences derived from the gene of the ancestral kinase (6). The leader sequence is marked by a separate symbol.

The evolutionary relationship of the rat, yeast, and E. coli enzymes is even more clearly demonstrated by dot matrix (31) analyses, where the sequence homology between the proteins is scored by using a mutation data matrix (32). As shown in Fig. 7, when the amino acid sequence of carbamyl phosphate synthetase is compared to those of the E. coli and yeast proteins, three lines of identity are evident. The main diagonal shows the homology of the interspecies sequences. It extends over the entire length of the rat hybrid polypeptide and the small and large subunits of the E. coli and yeast enzymes. The shorter diagonal lines above and below the main diagonal show the conservation of the internal duplication of the synthetase subunit in the three species. The duplicated nature of the synthetase has previously been described for the E. coli and yeast enzymes (6, 9). The present data show the synthetase component of the rat enzyme to have the same duplication. The two halves of the rat synthetase component exhibit 23% identical residues. Comparable values for the yeast and E. coli synthetases are 28.5 and 35%, respectively.
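The identity figures quoted above follow from a simple count over aligned positions. The sketch below is illustrative only: the function names and the two short aligned fragments are hypothetical, and skipping every position where either sequence has a gap is an assumed convention, since the text does not spell out how the 1389 possible matches were counted. The last line reproduces the reported rat versus E. coli value from the stated counts.

```python
from typing import Tuple


def count_identities(seq_a: str, seq_b: str, gap: str = "-") -> Tuple[int, int]:
    """Return (identical positions, positions compared) for two aligned sequences.

    Positions where either sequence carries a gap are skipped; this is an
    assumption of the sketch, not a convention stated in the source.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    identical = sum(1 for a, b in compared if a == b)
    return identical, len(compared)


def percent_identity(identical: int, compared: int) -> float:
    """Identical residues as a percentage of the positions compared."""
    return 100.0 * identical / compared


# Hypothetical aligned fragments, one insertion in the second sequence.
toy_identical, toy_compared = count_identities("MKV-LA", "MRVQLA")
print(toy_identical, toy_compared)

# Counts reported in the text for the rat versus E. coli alignment.
rat_vs_ecoli = round(percent_identity(582, 1389), 1)
print(rat_vs_ecoli)
```

A dot matrix analysis scores many such local comparisons with a mutation data matrix rather than strict identity, so it can reveal weaker similarities, such as the internal duplication, that a single global count would understate.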
Since the greatest sequence conservation among the three species occurs in the NH2-terminal halves of the synthetase subunits, the decreasing homology is due primarily to divergence in the carboxyl-terminal halves of the three enzymes.

Functional Domains-We have previously proposed three functional domains in E. coli and yeast carbamyl phosphate synthetases. Two of the domains were suggested to be involved in the binding of ATP (9), and the third domain was identified as the site of glutamine hydrolysis (8). The main feature of the glutamine hydrolytic site was the presence of a reactive cysteine residue, which had previously been shown to be essential for amidotransferase activity (33, 34). The absence of this cysteine residue in the sequence of rat carbamyl phosphate synthetase led us to conclude that the glutamine site is modified such that it could no longer catalyze the hydrolysis of glutamine (5).

The postulated ATP-binding sites in the E. coli and yeast enzymes (9) are essentially conserved in the amino acid sequence of rat carbamyl phosphate synthetase. The ATP-binding sites are located in two regions of the NH2-terminal half of the E. coli synthetase subunit. These sites have their counterparts in the carboxyl-terminal half. One of the sites (residues ) exhibits sequence similarities to the ATP-binding site of phosphoglycerate kinase and to the predicted dinucleotide fold of glutamate dehydrogenase (9). In rat carbamyl phosphate synthetase, the analogous domain is located between residues (Fig. 8). This domain is highly conserved between the E. coli and rat sequences. Of 50 amino acid residues, 31 are identical and 10 represent functionally conserved substitutions. The corresponding domain (residues ) in the carboxyl-terminal half of the rat synthetase component is also conserved (19 identities and 5 functionally conserved substitutions) (Fig. 8). The second ATP-binding site (residues in the NH2-

half and residues in the COOH-half of the E. coli large subunit) is based on sequence similarities with the glycine-rich loop of the ATP-binding site of adenylate kinase and the β-subunit of F1-ATPase (9). The analogous domains in the rat sequence (residues and ) are less conserved. Of 55 amino acid residues, 17 are identical (data not shown).

A fourth functional domain of rat carbamyl phosphate synthetase is absent in both the E. coli and the yeast enzymes. This domain consists of 38 residues, starting from the NH2-terminal methionine. This sequence contains 8 basic amino acid residues, 1 acidic residue, and a Pro-Gly sequence 4 residues before the start of the mature enzyme, these features being common to signal sequences that direct proteins for import into mitochondria (35).

Characterization of Rat Carbamyl Phosphate Synthetase Gene-A λ phage library of rat chromosomal DNA (18) was screened with a 770-bp cDNA probe containing the carboxyl-terminal coding sequence of carbamyl phosphate synthetase. The recombinant phage isolated from the screen were characterized by restriction analysis and grouped into two sets of clones carrying unique nonoverlapping EcoRI fragments of rat nuclear DNA. Two representative clones, λ10,cps and λ1,cps, with nuclear DNA inserts of 13.5 and 15.2 kb, respectively, were used to derive the structure of the gene region coding for the carboxyl-terminal 453 amino acids of carbamyl phosphate synthetase. The exons in both clones were localized by Southern hybridization and by sequence analysis. The two clones contained the sequence of the cDNA starting from nucleotide through . The coding region comprising 1359 nucleotides was ascertained to be split into 13 separate exons whose positions are shown in Fig. 9. With the exception of exons 9 and 10, all the remaining 11 exons were sequenced and the boundaries identified (Fig. 10).
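The boundary sequences reported in Fig. 10 lend themselves to a quick consistency check against the GT-AG rule (36), under which an intron begins with GT and ends with AG. The sketch below is a minimal illustration, assuming the intron flanks as transcribed from Fig. 10 (any transcription error would carry over); the dictionary and function names are hypothetical. Exon 13 has no downstream flank because its sequence runs through the end of the cloned DNA.

```python
from typing import Dict, List, Optional, Tuple

# exon number -> (3' end of upstream intron, 5' end of downstream intron),
# as transcribed from Fig. 10; exons 9 and 10 were not sequenced.
BOUNDARIES: Dict[int, Tuple[str, Optional[str]]] = {
    1: ("TGGCTCGTTTCTTCCAG", "GTAAGGAGAGCAACACG"),
    2: ("TTCCCCCTCTTAATTAG", "GTAATATATTGTTTTCC"),
    3: ("CCCTTCACCCTTTCCAG", "GTAGTGTCCCATTTTCT"),
    4: ("GTTTCTCTTGTTGGCAG", "GTATGTGTTAGTGCTTT"),
    5: ("AACTCTTCTTTTGACAG", "GTCATCATTTAGAAACG"),
    6: ("ATTGTCCTTTTCTATAG", "GTAAGAAATATTAATG"),
    7: ("TGAACTTATCTCCTTAG", "GTAATATTTTGCAATGT"),
    8: ("CCTGTTGCGTCTGACAG", "GTAAATAGTTAATGATC"),
    11: ("TTCTATTTTAAATGCAG", "GTAAGAACCGAATAGCC"),
    12: ("TTATTTTTTTCTTTTAG", "GTGTGTTTCCTCTTTTA"),
    13: ("TTGACATTTTCTTTCAG", None),
}


def follows_gt_ag(upstream_intron_end: str, downstream_intron_start: Optional[str]) -> bool:
    """True if the upstream intron ends in AG and the downstream one starts with GT."""
    acceptor_ok = upstream_intron_end.upper().endswith("AG")
    donor_ok = (
        downstream_intron_start is None
        or downstream_intron_start.upper().startswith("GT")
    )
    return acceptor_ok and donor_ok


violations: List[int] = [
    exon for exon, (acceptor, donor) in BOUNDARIES.items()
    if not follows_gt_ag(acceptor, donor)
]
print(violations)
```

An empty list means every sequenced boundary in the transcription is compatible with the rule; matching the longer consensus compiled by Mount (37) would need a position-by-position comparison that this sketch does not attempt.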
The localization of exons 9 and 10 was determined by restriction mapping and Southern hybridization within the limits indicated by the brackets in the figure. Even though the two clones did not have overlapping sequences, it was possible to show that they represented contiguous segments of the gene. Southern hybridization analysis of rat genomic DNA digested with HindIII indicated a 2.6-kb fragment that hybridized to a cDNA probe containing the sequence encompassed by exons 3 to 7 (see Fig. 9). Based on the location of the HindIII sites in λ10,cps and λ1,cps, the 2.6-kb HindIII fragment should include the EcoRI site common to the two clones. Of course, we cannot exclude the possibility that the genomic DNA may have another EcoRI site (<100 bp) separating the two cloned fragments. The sequences of the 11 exons, which range from 68 to 195 bp in length, showed no discrepancies with the sequence of the cDNA reported in Fig. 3. All the intron-exon boundaries conformed to the GT-AG rule (36) and to the consensus sequence compiled by Mount (37). Assuming the same 20:1 ratio for intron/exon lengths for the rest of the gene, we estimate that the rat carbamyl phosphate synthetase gene should be included in approximately kb of rat chromosomal DNA. Extensive Southern hybridization analysis of rat genomic DNA indicates that the carbamyl phosphate synthetase gene is present in single copy (data not shown).

DISCUSSION

Carbamyl phosphate synthetases of E. coli and of yeast are each composed of two different subunits encoded by two different genes (38). The smaller subunit catalyzes the transfer of the amide-N from glutamine to a catalytic center for carbamyl phosphate synthesis located on the larger synthetase subunit (39, 40). Previous studies of the E. coli (6, 7) and of the yeast (8, 9) genes have shown that the glutamine subunits are homologous and related to other amidotransferases (8).
The genes for the larger synthetase subunits were found to have undergone a gene duplication resulting in a polypeptide with two homologous halves (6, 9).

The present studies were undertaken to establish the relation of the mammalian NH3-utilizing enzyme carbamyl phosphate synthetase to the glutamine-specific carbamyl phosphate synthetases of bacteria and yeast. The entire coding sequence of the rat liver carbamyl phosphate synthetase mRNA has been determined from overlapping cDNA clones. The message includes a short 5' leader of at least 139 nucleotides, 4500 nucleotides of coding sequence, and a long 3' nontranslated extension of 905 nucleotides. Analysis of two genomic clones selected from a λ library indicates that the gene contains multiple exons. The two clones studied represent a total of 28.7 kb of genomic DNA, of which only 1388 bp are present in the mRNA.

The predicted primary translation product of the carbamyl phosphate synthetase mRNA is a 164,564-Da protein. This precursor is cleaved at Leu 38 and/or Leu 39 to yield a mature carbamyl phosphate synthetase of 160,304 Da. These molecular weights are in very good agreement with previously published data on the sizes of the precursor and mature forms of the rat enzyme estimated on SDS-polyacrylamide gels (26, 29, 30) and by sedimentation equilibrium in guanidine hydrochloride (30).

The derived amino acid sequence of rat carbamyl phosphate synthetase has revealed several important facts that bear on its evolutionary origin. First, the complete sequence of the message has confirmed our previous conclusion that the mammalian enzyme is a fusion polypeptide of a glutamine amide transfer component and of a synthetase component (5). The amino acid sequences of the glutamine component located at the NH2 end of the enzyme and of the fused synthetase component are both homologous to the separate subunits of the E. coli and yeast enzymes.
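The sizes quoted for the message and its protein products can be cross-checked with a few lines of arithmetic. This is a rough sketch using only numbers stated in the paper; the variable names are hypothetical, and it assumes that the 4500 coding nucleotides exclude the termination codon and that 38 residues are removed when the mature protein starts at Leu 39.

```python
# Values quoted in the paper.
CODING_NT = 4500             # nucleotides of coding sequence
LEADER_RESIDUES = 38         # residues removed if the mature protein starts at Leu 39
PRECURSOR_DA = 164_564       # predicted primary translation product
MATURE_DA = 160_304          # predicted mature enzyme
TRANSCRIPT_NT = 5645         # minimal full-length transcript
NORTHERN_KB = 5.7            # Northern blot estimate
NORTHERN_ERROR_KB = 0.3      # stated uncertainty of that estimate

# Assumption: the coding figure counts sense codons only.
precursor_residues = CODING_NT // 3
mature_residues = precursor_residues - LEADER_RESIDUES

# Mass attributable to the cleaved leader and mean residue mass of the mature enzyme.
leader_mass_da = PRECURSOR_DA - MATURE_DA
mean_residue_mass = MATURE_DA / mature_residues

# Does the sequence-derived transcript length fall inside the Northern estimate?
within_northern = abs(TRANSCRIPT_NT / 1000 - NORTHERN_KB) <= NORTHERN_ERROR_KB

print(precursor_residues, mature_residues, leader_mass_da)
print(round(mean_residue_mass, 1), within_northern)
```

Under these assumptions the residue count reproduces the 1462 residues quoted for the mature protein, and the computed leader mass of about 4.3 kDa is of the same order as the roughly 5-kDa difference between precursor and mature enzyme reported earlier (26).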
The homology is unambiguous and extends across the carboxyl-terminal end of the small subunit and the NH2 terminus of the larger synthetase subunit. This suggests that the mammalian gene was formed by a simple gene fusion event, either by a mechanism similar to that proposed for the fusion of the his genes (41) or by nonhomologous (illegitimate) recombination (42, 43). Comparison of the protein sequences suggests the fusion probably occurred some time after the fungi diverged from the animal line, but before the separation of the chordate line, which includes cartilaginous and bony fishes, amphibians, and mammals. In all these organisms, arginine-specific carbamyl phosphate synthetase consists of a single 160-kDa polypeptide, suggesting fusion to be an early evolutionary event.

A probable scheme depicting the evolution of the mammalian enzyme is presented in Fig. 11. This scheme incorporates our earlier suggestion that the small glutamine subunit of glutamine-dependent carbamyl phosphate synthetase was derived from a fusion of an ancestral gene coding for the glutamine subunit with another unidentified gene (8). The synthetase subunit of the prokaryotic and fungal enzymes arose by a tandem duplication of an ancestral kinase (6). The present study provides strong evidence that the single gene of mammalian carbamyl phosphate synthetases was formed by a later fusion of the genes for the glutamine and synthetase subunits. Since all mammalian carbamyl phosphate synthetases are localized in the mitochondrion, their evolution requires the further acquisition of a signal peptide directing the protein for import into the organelle.

Even though the two halves of the synthetase component have diverged to a greater extent in mammalian carbamyl phosphate synthetase (23%) than in yeast (28.5%) and E. coli (35%), certain regions are highly conserved. Among these is a domain which we previously proposed to be a possible nucleotide-binding site of the synthetase. This domain has been conserved in both the NH2- (residues ) and carboxyl-terminal (residues ) halves of rat liver carbamyl phosphate synthetase. The conservation of these sequences in the context of a general loss of sequence identity between the two halves of the mammalian synthetase provides additional evidence that both domains play an important catalytic function.

Another functional domain previously identified to be involved in the transfer of glutamine amide-N has also been conserved in the glutamine component of mammalian carbamyl phosphate synthetase. As noted previously, however, a cysteine residue shown in other amidotransferases to be essential for glutamine hydrolysis has been substituted by a serine (5). This substitution accounts for the inability of the mammalian synthetase to derive NH3 from glutamine. Paluh et al. (44) have recently found that the site-specific substitution of a glycine for the homologous cysteine (Cys 84) in the catalytic site of anthranilate synthase Component II abolishes glutamine but not NH3 utilization by anthranilate synthase.

An important property of mammalian carbamyl phosphate synthetase is its almost absolute requirement of acetylglutamate for enzymatic activity. Since acetylglutamate is not used in the reaction, it acts as an allosteric activator of the synthetase (45). Acetylglutamate has been shown to bind with high affinity to mammalian carbamyl phosphate synthetase (45), although the binding site has not been identified. Acetylglutamate could bind either to the modified glutamine domain or to some new site in the fused protein. Although the former possibility is attractive, there is evidence from studies of carbamyl phosphate synthetase III of teleost fish that acetylglutamate binds to a site separate from the glutamine-binding site (46). Carbamyl phosphate synthetase III utilizes glutamine like the prokaryotic enzyme but requires acetylglutamate for activity (47, 48). Casey and Anderson (46) have recently shown that acetylglutamate and glutamine bind to two separate but interacting sites. We, therefore, favor the idea that in the mammalian enzyme, acetylglutamate interacts with a site distinct from the glutamine domain.

Acknowledgments-We express our appreciation to Dr. Gert Kreibich and E. D. Dharmgrongartama for the protein sequence analysis, and to Dr. Lois T. Hunt of the National Biomedical Foundation for computer and dot matrix analyses of the protein sequences.

REFERENCES
1. Jones, M. E. (1965) Annu. Rev. Biochem. 34
2. Jones, M. E., Anderson, A. D., Anderson, C., and Hodes, S. (1961) Arch. Biochem. Biophys. 95
3. Jones, M. E., and Lipmann, F. (1960) Proc. Natl. Acad. Sci. U. S. A. 46
4. Campbell, J. W. (1965) Nature 208
5. Nyunoya, H., Broglie, K. E., and Lusty, C. J. (1985) Proc. Natl. Acad. Sci. U. S. A. 82
6. Nyunoya, H., and Lusty, C. J. (1983) Proc. Natl. Acad. Sci. U. S. A. 80
7. Piette, J., Nyunoya, H., Lusty, C. J., Cunin, R., Weyens, G., Crabeel, M., Charlier, D., Glansdorff, N., and Piérard, A. (1984) Proc. Natl. Acad. Sci. U. S. A. 81
8. Nyunoya, H., and Lusty, C. J. (1984) J. Biol. Chem. 259
9. Lusty, C. J., Widgren, E. E., Broglie, K. E., and Nyunoya, H. (1983) J. Biol. Chem. 258
10. Birnboim, H. C., and Doly, J. (1979) Nucleic Acids Res. 7, 1523
11. Ryall, J., Rachubinski, R. A., Nguyen, M., Rozen, R., Broglie, K. E., and Shore, G. C. (1984) J. Biol. Chem. 259
12. Maxam, A. M., and Gilbert, W. (1980) Methods Enzymol. 65
13. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
14. Thomas, P. S. (1980) Proc. Natl. Acad. Sci. U. S. A. 77
15. Gros, C., and Labouesse, B. (1969) Eur. J. Biochem. 7
16. Woods, K. R., and Wang, K.-T. (1967) Biochim. Biophys. Acta 133
17. Ambler, R. P. (1967) Methods Enzymol. 11
18. Sargent, T. D., Wu, J.-R., Sala-Trepat, J. M., Wallace, R. B., Reyes, A. A., and Bonner, J. (1979) Proc. Natl. Acad. Sci. U. S. A. 76
19. Benton, W. D., and Davis, R. W. (1977) Science 196
20. Blattner, F. R., Williams, B. G., Blechl, A. E., Denniston-Thompson, K., Faber, H. E., Furlong, L.-A., Grunwald, D. J., Kiefer, D. O., Moore, D. D., Schumm, J. W., Sheldon, E. L., and Smithies, O. (1977) Science 196
21. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, p. 373, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
22. Blin, N., and Stafford, D. W. (1976) Nucleic Acids Res. 3
23. Southern, E. (1979) Methods Enzymol. 68
24. Okayama, H., and Berg, P. (1982) Mol. Cell. Biol. 2
25. Proudfoot, N. J., and Brownlee, G. G. (1974) Nature 252
26. Shore, G. C., Carignan, P., and Raymond, Y. (1979) J. Biol. Chem. 254
27. Kozak, M. (1984) Nucleic Acids Res.
28. Johansen, H., Schumperli, D., and Rosenberg, M. (1984) Proc. Natl. Acad. Sci. U. S. A. 81
29. Clarke, S. (1976) J. Biol. Chem. 251
30. Lusty, C. J. (1978) Eur. J. Biochem. 85
31. George, D. G., Yeh, L.-S. L., and Barker, W. C. (1983) Biochem. Biophys. Res. Commun. 115
32. Dayhoff, M. O., Barker, W. C., and Hunt, L. T. (1983) Methods Enzymol. 91
33. Tso, J. Y., Hermodson, M. A., and Zalkin, H. (1980) J. Biol. Chem. 255
34. Dawid, I. B., French, T. C., and Buchanan, J. M. (1963) J. Biol. Chem. 238
35. Kaput, J., Goltz, S., and Blobel, G. (1982) J. Biol. Chem. 257
36. Breathnach, R., Benoist, C., O'Hare, K., Gannon, F., and Chambon, P. (1978) Proc. Natl. Acad. Sci. U. S. A. 75
37. Mount, S. M. (1982) Nucleic Acids Res. 10
38. Piérard, A., Grenson, M., Glansdorff, N., and Wiame, J. M. (1973) in The Enzymes of Glutamine Metabolism (Prusiner, S., and Stadtman, E. R., eds), Academic Press, New York
39. Trotta, P. P., Pinkus, L. M., Haschemeyer, R. H., and Meister, A. (1974) J. Biol. Chem. 249
40. Piérard, A., and Schroter, B. (1978) J. Bacteriol. 134
41. Yourno, J., Kohno, T., and Roth, J. R. (1970) Nature 228
42. Anderson, R. P., and Roth, J. R. (1977) Annu. Rev. Microbiol. 31
43. Tlsty, T. D., Albertini, A. M., and Miller, J. H. (1984) Cell 37
44. Paluh, J. L., Zalkin, H., Betsch, D., and Weith, H. L. (1985) J. Biol. Chem. 260
45. Alonso, E., and Rubio, V. (1983) Eur. J. Biochem. 135
46. Casey, C. A., and Anderson, P. M. (1983) J. Biol. Chem. 258
47. Tramell, P. R., and Campbell, J. W. (1971) Comp. Biochem. Physiol. 40B
48. Anderson, P. M. (1980) Science 208


Det matematisk-naturvitenskapelige fakultet

Det matematisk-naturvitenskapelige fakultet UNIVERSITETET I OSLO Det matematisk-naturvitenskapelige fakultet Exam in: MBV4010 Arbeidsmetoder i molekylærbiologi og biokjemi I MBV4010 Methods in molecular biology and biochemistry I Day of exam: Friday

More information

Lecture 11: Gene Prediction

Lecture 11: Gene Prediction Lecture 11: Gene Prediction Study Chapter 6.11-6.14 1 Gene: A sequence of nucleotides coding for protein Gene Prediction Problem: Determine the beginning and end positions of genes in a genome Where are

More information

Supplementary. Table 1: Oligonucleotides and Plasmids. complementary to positions from 77 of the SRα '- GCT CTA GAG AAC TTG AAG TAC AGA CTG C

Supplementary. Table 1: Oligonucleotides and Plasmids. complementary to positions from 77 of the SRα '- GCT CTA GAG AAC TTG AAG TAC AGA CTG C Supplementary Table 1: Oligonucleotides and Plasmids 913954 5'- GCT CTA GAG AAC TTG AAG TAC AGA CTG C 913955 5'- CCC AAG CTT ACA GTG TGG CCA TTC TGC TG 223396 5'- CGA CGC GTA CAG TGT GGC CAT TCT GCT G

More information

NAME:... MODEL ANSWER... STUDENT NUMBER:... Maximum marks: 50. Internal Examiner: Hugh Murrell, Computer Science, UKZN

NAME:... MODEL ANSWER... STUDENT NUMBER:... Maximum marks: 50. Internal Examiner: Hugh Murrell, Computer Science, UKZN COMP710, Bioinformatics with Julia, Test One, Thursday the 20 th of April, 2017, 09h30-11h30 1 NAME:...... MODEL ANSWER... STUDENT NUMBER:...... Maximum marks: 50 Internal Examiner: Hugh Murrell, Computer

More information

Molecular Level of Genetics

Molecular Level of Genetics Molecular Level of Genetics Most of the molecules found in humans and other living organisms fall into one of four categories: 1. carbohydrates (sugars and starches) 2. lipids (fats, oils, and waxes) 3.

More information

Electronic Supplementary Information

Electronic Supplementary Information Electronic Supplementary Material (ESI) for Molecular BioSystems. This journal is The Royal Society of Chemistry 2017 Electronic Supplementary Information Dissecting binding of a β-barrel outer membrane

More information

A Zero-Knowledge Based Introduction to Biology

A Zero-Knowledge Based Introduction to Biology A Zero-Knowledge Based Introduction to Biology Konstantinos (Gus) Katsiapis 25 Sep 2009 Thanks to Cory McLean and George Asimenos Cells: Building Blocks of Life cell, membrane, cytoplasm, nucleus, mitochondrion

More information
