Chapter 3 Expression of Genes


Part I Relationship between Cells and Genetic Information

A protein gene is a piece of DNA that determines the amino acid sequence of a protein, and the synthesis of a protein based on genetic information is called gene expression. Specifically, genetic information refers to the nucleotide (base) sequence of DNA strands. mRNA is synthesized using DNA as a template, and in this way the genetic information of the DNA is transcribed into the sequence of the mRNA. The base sequence of mRNA is read as a series of genetic codes, which are used to synthesize proteins on cytoplasmic granules called ribosomes. Each code on mRNA corresponds to one amino acid, and the amino acids are linked together in the order of the codes to form a protein. Protein synthesis is called translation, since the mechanism can be compared to a translation from information in one language (i.e., a base sequence) to that in another language (i.e., an amino acid sequence).

I. Transcription and Translation of Genes

Central Dogma

The genetic information of a protein specifically refers to the information that determines its primary structure (i.e., the amino acid sequence) and, at the substance level, to the nucleotide sequence (the base sequence) of DNA. The genetic information of DNA is copied to mRNA (messenger RNA; see part II of this chapter) molecules synthesized using DNA as a template, and is then converted to the amino acid sequence of a protein. The concept of genetic information flowing in one direction, from DNA to mRNA to proteins, is called the central dogma of molecular biology (Fig. 3-1). This concept is a basic principle common to all organisms, both prokaryotes and eukaryotes, from bacteria to humans.

mRNA synthesis means the transcription of the genetic information in DNA (the base sequence) to the base sequence of RNA, while protein synthesis refers to the translation of information in one language (the sequence of mRNA) into that of another (the amino acid sequence) (Fig. 3-2).

Figure 3-1 Central dogma

CSLS / THE UNIVERSITY OF TOKYO

Genetic Codes

Specifically, the genetic information of DNA is its base sequence. The genetic code, on the other hand, is defined on the base sequence of mRNA transcribed using DNA as a template: a particular three-base sequence known as a codon corresponds to one amino acid. There are 4³ = 64 codons, encoding 20 amino acids (Fig. 3-2). As an example, the mRNA codon 5′-AUG-3′ corresponds to the amino acid methionine (Fig. 3-3). A protein consisting of 400 amino acids linked together is derived from 1,200 DNA bases and 1,200 mRNA bases. Here, the 1,200-base section of the entire DNA is the gene for this protein.

AUG encodes methionine in addition to serving as the initiation codon for protein synthesis. Following the determination of the first amino acid, the next three-base sequence determines the next amino acid, and so on. Figure 3-2 includes three termination codons. When protein synthesis reaches a termination codon (which does not encode any amino acid), synthesis stops. The region between the initiation codon and the termination codon is called the coding region.

Figure 3-2 Genetic code table

Sense Strand of DNA

Of the two strands of DNA, the one complementary to the template strand for RNA synthesis is called the sense strand (Fig. 3-3). The base sequence of mRNA can be obtained by replacing the Ts in the sense strand with Us. Codons on the sense strand are therefore almost the same as those on mRNA; e.g., ATG on the sense strand corresponds to AUG on mRNA. Which strand of the double-stranded genomic DNA serves as the sense strand depends on the gene.

Figure 3-3 Genes and genetic information

Gene Expression

Gene function refers to the process of mRNA synthesis based on genetic information and the resulting production of a protein; this is also described as gene expression. When a gene is not functioning, its expression is said to be suppressed.

II. Transcription of Genes

Types of RNA

RNA is roughly classified into three types (Table 3-1), namely mRNA, rRNA and tRNA, all of which are involved in protein synthesis using genetic information (Fig. 3-4). A number of small RNA molecules, such as snRNA, are also known (see Fig. 3-10).

Table 3-1 Types of RNA
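The codon rules described under Genetic Codes and Sense Strand of DNA can be checked with a short Python sketch. It builds the standard 64-codon table, converts a sense-strand sequence to mRNA by replacing T with U, and reads codons until a termination codon. The function names and the example sequence are illustrative assumptions, not part of the chapter, and the one-letter amino acid symbols are used only for compactness.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order.
# "*" marks a termination codon (no amino acid).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def sense_to_mrna(sense_dna):
    """mRNA has the sense-strand sequence with every T replaced by U."""
    return sense_dna.upper().replace("T", "U")


def translate(coding_region):
    """Read codons from the first base until a termination codon is met."""
    protein = []
    for i in range(0, len(coding_region) - 2, 3):
        amino_acid = CODON_TABLE[coding_region[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sense-strand sequence: ATG TTT GGC TAA
mrna = sense_to_mrna("ATGTTTGGCTAA")
print(mrna)                # AUGUUUGGCUAA
print(len(CODON_TABLE))    # 64 codons in total
print(translate(mrna))     # MFG (methionine, phenylalanine, glycine)
print(400 * 3)             # bases needed to encode 400 amino acids: 1200
```

The table confirms the counts given in the text: 64 codons, of which three are termination codons and the remaining 61 encode 20 amino acids.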

Figure 3-4 Roles of the three RNA types

mRNA

mRNA (messenger RNA) carries a transcript of the genetic information for the primary structure of a protein to the protein synthesis system. There are as many types of mRNA as there are genes, and since the size of proteins varies, mRNA sizes also vary greatly. mRNA makes up less than 1% of the total amount of RNA in a cell.

rRNA

rRNA (ribosomal RNA) in prokaryotes consists of three types, 5S, 16S and 23S*1 (Table 3-1), while that in eukaryotes includes 5S, 5.8S, 18S and 28S. Approximately 95% of the RNA found in a cell is rRNA, which forms complexes called ribosomes with many proteins. Ribosomes function as sites for protein synthesis.

*1 S: The Svedberg unit, which describes the rate of sedimentation in an ultracentrifuge. Although higher molecular weights result in higher S values, there is no linear relationship between the molecular weight and the S value (e.g., a doubling of the molecular weight does not mean a doubled S value).

tRNA

tRNA (transfer RNA) is a small type of RNA with a size of around 4S, consisting of fewer than 100 nucleotides. There are 40 to 50 known types, which together represent approximately 5% of RNA overall. During protein synthesis, they bind to amino acids and carry them to the site of protein synthesis. A particular tRNA binds to a particular amino acid; for example, the tRNA that binds phenylalanine is denoted tRNA^Phe, and the tRNA that binds methionine is denoted tRNA^Met.

Cells contain all of these RNA types, and RNA that is not translated into the primary structure of a protein (i.e., RNA other than mRNA) is collectively referred to as non-coding RNA. Although the way synthesis is regulated differs among RNA types, the basic method of synthesis is common to all of them.

Characteristics of Transcription

In DNA synthesis, the entire sequence of the parental DNA strand is accurately copied from one end to the other, and the entire DNA is passed on from the parent cell to the daughter cells. RNA transcription, on the other hand, occurs for gene regions only rather than for the whole DNA (Fig. 3-5). The DNA region shown in Figure 3-5A has five genes (a to e), meaning that five mRNAs are synthesized. Genes c and d in the figure are read from the other DNA strand, in the reverse direction. In fact, transcription covers not only the sections containing information for the amino acid sequence (i.e., the coding regions) but also extra portions on both sides of them (Fig. 3-5B). A promoter is a DNA region to which RNA polymerase attaches (discussed later).

Figure 3-5 Transcription of RNA (panels A and B)

Basics of Transcription

In transcription, using one of the two DNA strands as a template, nucleotides are connected one by one so that they form base pairs with the template bases. Since U is used in RNA in place of the T in DNA, base pairs are formed between the As of DNA and the Us of RNA. The RNA synthesis reaction is simply described as:

\[ [\mathrm{NMP}]_n + \mathrm{NTP} \longrightarrow [\mathrm{NMP}]_{n+1} + \mathrm{PP_i} \]

The direction of RNA synthesis is, like that of DNA synthesis, from 5′ to 3′; in other words, the RNA strand and the template DNA strand run in opposite directions. In DNA synthesis, no reaction occurs when n = 1, meaning that a primer is required; in RNA synthesis, however, the reaction does occur at n = 1, so no primer is needed.

RNA Polymerase

In E. coli, RNA is synthesized by one type of RNA polymerase. Eukaryotes have at least three RNA polymerase types (I, II and III), each playing a different role. Type I is mainly involved in the synthesis of rRNA, Type II in the synthesis of mRNA, and Type III in the synthesis of tRNA. Compared with the prokaryotic enzyme, their structure is much more complex, consisting of more than 10 subunits*2.

*2 Subunits: When multiple proteins form a complex and collectively exhibit functions, each constituent protein is called a subunit. The term subunit does not necessarily refer to one protein; as an example, a subunit of a ribosome is a complex of RNA and several dozen proteins.

Upstream and Downstream of Genes, and Base Sequence Numbers

Relative to a gene's position, the direction toward and beyond the point at which RNA synthesis is initiated is called upstream of the gene, and the direction in which RNA is synthesized is called downstream of the gene (Fig. 3-5). The first base at which RNA synthesis is initiated is No. 1, followed by Nos. 2, 3, 4, etc. in the downstream direction. Conversely, in the direction upstream of a gene (the promoter side), the bases are numbered -1, -2, -3, etc. It should be noted that there is no base numbered zero. Additionally, base No. 1 is where RNA synthesis starts, but it is not the first base of the first codon encoding an amino acid of the protein; the first codon is located further downstream (at a base with a larger number).
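The two conventions above (complementary, antiparallel copying of the template, and base numbering that skips zero) can be sketched in a few lines of Python. The function names and the example template are assumptions made for illustration only.

```python
# Base pairing used in transcription: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (written 5'->3') made from a template strand.

    The template is given as it is read by RNA polymerase, i.e. 3'->5',
    so pairing base by base yields the RNA in its 5'->3' order.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def number_to_offset(position):
    """Map gene numbering (..., -2, -1, +1, +2, ...) to an offset from base +1."""
    if position == 0:
        raise ValueError("there is no base numbered zero")
    return position - 1 if position > 0 else position


def distance(a, b):
    """Number of single-base steps from position a to position b."""
    return number_to_offset(b) - number_to_offset(a)


# Hypothetical template strand, written 3'->5'.
print(transcribe("TACAAACCG"))   # AUGUUUGGC
print(distance(-1, 1))           # 1: bases -1 and +1 are adjacent
print(distance(-3, 2))           # 4: steps -3 -> -2 -> -1 -> +1 -> +2
```

Because zero is skipped, subtracting two position numbers directly would overcount by one whenever the positions lie on opposite sides of the start site, which is why the sketch converts to offsets first.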

Binding of Polymerase to a Promoter

Promoter regions in eukaryotes (Fig. 3-6) include characteristic base sequences, such as TATA boxes and CCAAT boxes*4, that are recognized by general (basal) transcription factors*3 (proteins that promote transcription). Prokaryotes have several types of protein called σ-factors that promote the binding of RNA polymerase to a particular promoter. The processes generally referred to as recognition and binding mean that a protein and a DNA molecule come close together and, if their surface structures fit, connect with each other. Eukaryotes have a more complex mechanism, with a higher number of gene types and many kinds of promoter sequence; however, the basic mechanism is similar in eukaryotes and prokaryotes in that both have frequently used basic promoter sequences to which transcription factors bind, thereby recruiting RNA polymerase.

*3 Basal transcription factors: Proteins needed when RNA polymerase binds to a promoter (trans-acting elements). These factors bind to a particular sequence on the promoter, which recruits RNA polymerase to DNA, thereby initiating RNA synthesis.

*4 TATA and CCAAT boxes: DNA sequences in eukaryotes that are necessary when basal transcription factors bind to DNA. TATA boxes have the sequence TATAAA, while CCAAT boxes have the sequence GGCCAATCT, and transcription factors exist that recognize one or the other of the two boxes. Many other sequences also exist.

Roles of Promoters and the Initiation of Transcription

An important role of promoters is to determine the binding location and direction of RNA polymerase. Since RNA is synthesized in the 5′ to 3′ direction, the template DNA strand is read by RNA polymerase in the 3′ to 5′ direction. The complex of basal transcription factors and RNA polymerase bound to DNA separates the two DNA strands, initiating RNA synthesis.

Elongation of Transcription

The 5′-triphosphate of the first nucleotide in the synthesized RNA strand stays attached, so the 5′ end of the RNA is either pppA or pppG. The basal transcription factors involved in the binding of RNA polymerase do not move with the enzyme; only the polymerase moves along the DNA. The synthesized RNA strand is immediately released from the DNA, and the two unwound DNA strands re-form their original double strand on completion of RNA synthesis.

Figure 3-6 Structure of a promoter
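As a rough illustration of how the promoter sequences in footnote *4 could be located, the Python sketch below searches the sense strand upstream of a transcription start for the TATA and CCAAT motifs and reports their positions in the chapter's numbering (the base just before the start site is -1). The upstream sequence is invented for the example, and exact string matching is a simplifying assumption: real promoter elements vary in sequence and position.

```python
MOTIFS = {"TATA box": "TATAAA", "CCAAT box": "CCAAT"}


def find_boxes(upstream_sense, motifs=MOTIFS):
    """Locate each motif in the sequence upstream of the start site.

    upstream_sense: sense-strand sequence whose last base is base -1.
    Returns {name: position of the motif's first base}; motifs that are
    absent are omitted.
    """
    seq = upstream_sense.upper()
    found = {}
    for name, motif in motifs.items():
        index = seq.find(motif)          # first exact match, or -1
        if index != -1:
            found[name] = index - len(seq)   # index len-1 maps to -1
    return found


# Hypothetical 80-base upstream region ending at base -1.
upstream = "GGCCAATCT" + "ACGT" * 10 + "TATAAA" + "G" * 25
print(find_boxes(upstream))   # {'TATA box': -31, 'CCAAT box': -78}
```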

Column: The Possible Existence of More Non-coding RNA in Eukaryotes

Until very recently, it had been commonly thought that only genes (rather than other regions) were transcribed from genomic DNA. This idea is correct for prokaryotes, since their genomic DNA mostly consists of genes. Although one human cell contains approximately 1,000 times as much DNA as E. coli, humans have only five times as many genes as E. coli; genes represent only a small portion of genomic DNA in humans. However, it was recently reported as a major revelation that most of the genome of eukaryotes is transcribed. According to a paper published in Science in September 2005, a comprehensive analysis of transcription products in mice, in which the transcription origin of 4.5 million RNA molecules was investigated, showed that they were transcribed from 70% of the entire DNA. It was surprising that so many RNA molecules were transcribed from DNA regions previously not thought to be genes; if this is the case in mice, it is expected to hold true for humans as well. This RNA is believed to be non-coding RNA that functions in the regulation of expression.

Termination of Transcription

A DNA sequence that signals the termination of transcription in prokaryotes is called a terminator. A number of RNA dissociation mechanisms are known; in one of them, the synthesized RNA forms a double-stranded region (a hairpin structure) within itself and thereby detaches from the template DNA. The termination mechanism in eukaryotes is not clearly understood.

Genes for rRNA and tRNA

The number of functioning (transcribed) genes in E. coli is over 2,000, a figure believed to be much higher in humans. Based on the information of the mRNA transcribed from these genes, proteins (the gene products) are all synthesized on ribosomes. There must therefore be a large number of protein synthesis systems to deal with the translation of the numerous mRNA molecules generated by all genes. This requires a large number of rRNA and tRNA molecules within cells. These molecules are therefore actively transcribed, and there are many genes for them. It can be said that the genes have been amplified; this is a mechanism well suited to its purpose.
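The hairpin mentioned under Termination of Transcription forms when one stretch of RNA is followed, after a short loop, by its own reverse complement. The Python sketch below searches for such a pattern. The fixed stem length, the loop limits, and the example sequence are assumptions chosen for illustration; real terminators are not identified by exact matching alone.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Sequence that pairs with rna when the strand folds back on itself."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_hairpin(rna, stem=5, min_loop=3, max_loop=8):
    """Return (start, loop_length) of the first stem-loop found, else None.

    A stem-loop here means `stem` bases, then a loop, then the exact
    reverse complement of those `stem` bases.
    """
    for start in range(len(rna) - 2 * stem - min_loop + 1):
        left_arm = rna[start:start + stem]
        target = reverse_complement(left_arm)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            if rna[right_start:right_start + stem] == target:
                return start, loop
    return None


# Hypothetical transcript end: AU | GGACU | UUCG | AGUCC | UUUU
rna = "AU" + "GGACU" + "UUCG" + "AGUCC" + "UUUU"
print(find_hairpin(rna))          # (2, 4): stem starts at index 2, loop of 4
print(find_hairpin("AAAAAAAAAAAAAAAA"))   # None: A cannot pair with A
```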

III. Post-transcriptional Modification

RNA Cleavage

Several types of rRNA are transcribed together as a single strand of precursor RNA (Fig. 3-7A). Each rRNA molecule is cut out following transcription in a process known as trimming. This arrangement is well suited to its purpose, since the different types of rRNA are needed together. Some tRNA molecules are likewise synthesized as part of a longer RNA strand: like rRNA, several tRNA molecules are synthesized together as one precursor RNA strand and then cleaved (Fig. 3-7B). However, one strand does not contain all types of tRNA. One of the enzymes involved in trimming is RNase P (which contains RNA), and in eubacteria the enzymatic action is performed by the RNA rather than by the protein. More detailed trimming processes (not shown in Fig. 3-7) also occur.

Figure 3-7 Cleavage of RNA precursors into rRNA (A) and tRNA (B)

Column: RNA Replication and Reverse Transcription

From bacteria to humans, genes are carried in DNA, but some viruses, including phages, have single-stranded or double-stranded RNA as a gene carrier. Such viruses have a gene that encodes RNA replicase, and synthesize RNA using RNA as a template after infection. Some viruses that cause cancer in humans and other animals have single-stranded RNA as a gene carrier. Such viruses are called retroviruses; using reverse transcriptase, they synthesize DNA using RNA as a template after infection. The viral DNA generated is integrated into the DNA of the host cell, replicated with the host's cellular DNA and passed on to the progeny cells. At the same time, proteins are continuously produced from the integrated oncogene, turning the cell cancerous.
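Reverse transcription, as described in the column, copies an RNA template into a complementary DNA strand. A minimal Python sketch, with an invented function name and example sequence, might look like this:

```python
# Base pairing used in reverse transcription: RNA template base -> DNA base.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_5_to_3):
    """Return the complementary DNA strand, written 5'->3'.

    The new strand is antiparallel to the RNA template, so the paired
    bases are reversed to present the DNA in its own 5'->3' order.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5_to_3.upper()))


print(reverse_transcribe("AUGC"))   # GCAT
```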

Base Modification

After the formation of an RNA strand, rRNA and tRNA undergo base modification. mRNA also undergoes base modification, but to a lesser extent. The main modification made to rRNA is methylation, in which the methyl group of S-adenosylmethionine is transferred. tRNA receives many types of base modification, and compounds known as minor bases*5 (such as pseudouridine, 4-thiouridine, ribothymidine, dihydrouridine and 1-methylguanosine) are generated as a result. Minor bases are necessary for tRNA to function. Another important type of modification is the enzymatic addition of a three-base sequence, CCA, to the 3′ end of tRNA in eukaryotes. tRNA in prokaryotes has CCA at the 3′ end from the beginning; the 3′ end of tRNA in both eukaryotes and prokaryotes therefore has CCA.

*5 Minor bases: In addition to the five main base types, high-molecular-weight DNA and RNA also contain other bases. These are known as minor bases, and are thought to play important functions despite their small quantities.

mRNA Processing in Eukaryotes

mRNA in eukaryotes is first transcribed from DNA as pre-mRNA (Fig. 3-8), which becomes complete mRNA after going through the following three main changes (processing).

Figure 3-8 Modification leading up to the completion of mRNA in eukaryotic cells

Capping (Cap Formation)

A special structure with a phosphate linkage between two 5′ positions is added to the 5′ end of mRNA. No other nucleotide linkage between 5′ and 5′ is known. This is called the cap structure (Fig. 3-9). It is essential when mRNA is used for protein synthesis, since mRNA binds to ribosomes via special proteins attached to the cap. Prokaryotic mRNA, which has no cap structure, would not function in the protein synthesis apparatus of eukaryotes.

Figure 3-9 Cap structure

Addition of Poly-A

A poly-A signal sequence (e.g., AAUAAA) is located near the 3′ end of pre-mRNA, and following enzymatic cleavage at a site approximately 20 bases downstream from the sequence, many As (adenosines) are added to the end. The number of nucleotides added can be from several dozen to thousands. This synthesis does not require a template. Since even mRNA molecules of the same type have different poly-A lengths, the size of the complete mRNA varies. It is suggested that the poly-A strand is necessary for the initiation of protein synthesis and the inhibition of mRNA degradation. In the laboratory, mRNA with poly-A can be concentrated and purified by binding it in a complementary fashion to oligo(dT) attached to the surface of a fine resin.

Splicing

The most remarkable part of processing in eukaryotes is splicing. Genes in eukaryotes consist of exons, which contain amino acid sequence information (codes), and introns, which do not; pre-mRNA containing both exons and introns is synthesized first. In splicing, the introns are removed from the pre-mRNA, and the remaining exons are connected to form mRNA (Fig. 3-8). To connect two distant exons after the removal of an intron, a spliceosome (a complex containing non-coding snRNA, or small nuclear RNA) binds near the two breakpoints and pulls them together (Fig. 3-10). During splicing, some introns may be retained, or two introns and the exon between them may be removed together, thereby creating several types of complete mRNA. This mechanism is called alternative splicing. As a result, several types of protein with different amino acid sequences, each exhibiting different functions, can be synthesized from such mRNA. By exploiting this mechanism, one gene can produce several protein types, thus functioning as if it were several genes. This is why the effective number of human genes is estimated to be around 100,000, although the actual number is about 26,000.

Figure 3-10 Mechanism of splicing

IV. Translation of Genes

Synthesis of Aminoacyl-tRNA

tRNA

Most tRNA types have intramolecular double strands, and schematically consist of three loops and one stem (Fig. 3-11A). A fourth, variable loop may or may not exist. The anticodon loop has an anticodon sequence that pairs with a codon (code) on mRNA. The 3′ end has the CCA-3′ sequence common to all tRNA, and the amino acid is bound to this sequence. The actual three-dimensional structure is compact, as shown in Figure 3-11B.

Figure 3-11 Structure of tRNA: (A) cloverleaf model, (B) tertiary structure

Aminoacyl-tRNA Synthetase

To translate a base sequence accurately into an amino acid sequence, the correct pairing of each amino acid with the tRNA that has the corresponding anticodon is critically important, in the same way that a dictionary is important in translation between languages. Aminoacyl-tRNA synthetase binds an amino acid to the correct tRNA to form aminoacyl-tRNA. This enzyme acts by recognizing both the amino acid and the tRNA structure.

Formylmethionyl-tRNA

When a protein is synthesized, the first amino acid code is for methionine. In prokaryotes, however, the first amino acid is formylmethionine, i.e., formylated*6 methionine. The two types of tRNA that bind methionine are tRNA^Met and tRNA^fMet; following the binding of methionine to tRNA^fMet (i.e., the formation of Met-tRNA^fMet), formylation occurs to create fMet-tRNA^fMet. Met-tRNA^Met is not formylated, and is used for the methionines inside proteins. In eukaryotes, formylation does not occur, and Met-tRNA^Met may be used for the first or subsequent methionines.

*6 Formylation: The binding of the formyl group (-CHO). In the formylation of methionine, one of the hydrogen atoms of the amino group of methionine is replaced with the formyl group.
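The "dictionary" role of aminoacyl-tRNA described above can be illustrated with a small Python sketch: each tRNA is identified by its anticodon, which pairs antiparallel with a codon, and carries the amino acid attached by its synthetase. The three-entry charging table is a hypothetical excerpt that assumes strict one-to-one base pairing, and the names are illustrative only.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Anticodon (5'->3') that pairs antiparallel with a codon (5'->3')."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))


# Hypothetical excerpt of the pairing set up by aminoacyl-tRNA synthetases:
# anticodon of the tRNA -> amino acid attached to its CCA-3' end.
CHARGED_TRNA = {
    anticodon_for("AUG"): "Met",
    anticodon_for("UUU"): "Phe",
    anticodon_for("UUC"): "Phe",
}


def decode(codon):
    """Amino acid delivered by the tRNA whose anticodon pairs with the codon."""
    return CHARGED_TRNA[anticodon_for(codon)]


print(anticodon_for("AUG"))   # CAU
print(decode("AUG"))          # Met
print(decode("UUC"))          # Phe
```

The sketch makes the dependency explicit: the ribosome only checks codon-anticodon pairing, so a tRNA charged with the wrong amino acid would insert that wrong amino acid, which is why the accuracy of the synthetase matters.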

Column: Structure of E. coli Ribosomes

Ribosomes are schematically drawn in the shape of a flattened snowman consisting of large and small subunits, but their actual shape is complex (Column Fig. 3-1). In eukaryotes, the ribosomal RNA is larger and the number of proteins is higher.

Column Figure 3-1 Ribosomes of E. coli

Ribosomes

What is a ribosome?

Ribosomes are the places where protein synthesis occurs. In both prokaryotes and eukaryotes, a ribosome is a pairing of one large subunit and one small subunit (Column Fig. 3-1). Each subunit is a complex consisting of rRNA and many types of protein. Since the RNA is larger and the number of protein types is higher in eukaryotes, prokaryotes have 70S ribosomes and eukaryotes have 80S ribosomes. Ribosomes contain many types of protein, but quantitatively they are rich in RNA (two thirds RNA and one third protein), with proteins covering only parts of the surface. In particular, the space between the two subunits, where protein synthesis occurs, consists almost entirely of RNA. Ribosomes bind to mRNA and interact with aminoacyl-tRNA, activating both of them, and perform enzymatic reactions such as cleaving ester bonds between tRNA and peptide chains and forming peptide bonds between peptides and amino acids. These important functions of ribosomes are carried out by rRNA. Ribosomes are therefore considered to be ribozymes, i.e., RNA with enzymatic activity.

Column: Initiation of Translation

The initiation of translation is in fact a complex reaction (Column Fig. 3-2). First, initiation factors (IFs) dissociate the large and small subunits, and the small subunit, with mRNA and IFs attached, binds Met-tRNA (fMet-tRNA in prokaryotes). The large subunit then binds to it, forming a complex consisting of a ribosome, mRNA and Met-tRNA. This is known as an initiation complex.

Column Figure 3-2 Formation of initiation complexes in eukaryotic organisms. An "e" as the first letter in the names of initiation factors stands for eukaryotes.

Column: Elongation of Peptide Chains

The elongation reaction of peptide chains is also complex (Column Fig. 3-3). At the onset of this reaction, Met-tRNA is situated at the P site, and an aminoacyl-tRNA bound to an elongation factor (EF) binds to the A site. The ester bond between the first amino acid (methionine) and its tRNA is then cut, and the methionine forms a peptide bond with the amino acid at the A site. Through the action of other elongation factors, the vacated tRNA is then transferred to the E site, and the peptidyl-tRNA*7 is transferred to the P site. At the same time, the ribosome moves three bases along the mRNA. When the vacated tRNA is released from the E site, the ribosome returns to its original state. In this way, the elongation reaction is continuously repeated. During this process, two GTP molecules are hydrolyzed for each amino acid added. The first amino acid of a synthesized protein is always methionine, and the chain is synthesized from the end with the free amino group (N-terminal) toward the carboxyl group (C-terminal).

*7 Peptidyl-tRNA: tRNA that is bound to a peptide, formed on ribosomes in the process of producing proteins.

Column Figure 3-3 Elongation reaction of peptide chains
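The elongation cycle can be followed step by step with a simplified Python sketch that tracks what occupies the E, P and A sites, how far the ribosome has moved, and how many GTP molecules have been hydrolyzed. Assigning one GTP to A-site delivery and one to translocation is an assumption of this sketch; the text states only the total of two per amino acid. All names are illustrative.

```python
def elongate(amino_acids):
    """Simulate elongation for a chain whose first residue, Met, is at the P site.

    Returns (peptide, gtp_hydrolyzed, bases_moved).
    """
    if not amino_acids or amino_acids[0] != "Met":
        raise ValueError("the chain is assumed to start with Met at the P site")

    sites = {"E": None, "P": ["Met"], "A": None}
    gtp_hydrolyzed = 0
    bases_moved = 0

    for residue in amino_acids[1:]:
        # 1. An aminoacyl-tRNA escorted by an elongation factor enters the A site.
        sites["A"] = [residue]
        gtp_hydrolyzed += 1   # assumed: one GTP for delivery

        # 2. The chain is cut from the P-site tRNA and joined to the A-site
        #    amino acid by a peptide bond (the new residue is added at the C-terminal).
        sites["A"] = sites["P"] + sites["A"]
        sites["P"] = "vacated tRNA"

        # 3. Translocation: vacated tRNA -> E site, peptidyl-tRNA -> P site,
        #    and the ribosome advances one codon (three bases).
        sites["E"], sites["P"], sites["A"] = sites["P"], sites["A"], None
        gtp_hydrolyzed += 1   # assumed: one GTP for translocation
        bases_moved += 3

        # 4. The vacated tRNA leaves the E site; the ribosome is ready again.
        sites["E"] = None

    return "-".join(sites["P"]), gtp_hydrolyzed, bases_moved


print(elongate(["Met", "Phe", "Gly"]))   # ('Met-Phe-Gly', 4, 6)
```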

Structure of mRNA

In both prokaryotes and eukaryotes, the functional structure of mRNA schematically consists of a 5′ non-coding region, a coding region and a 3′ non-coding region arranged side by side (Fig. 3-12). AUG, the translation initiation codon, is located at the start of the coding region. The 5′ non-coding region in prokaryotes often contains a sequence complementary to 16S rRNA, through which ribosomes bind. In eukaryotes, no such sequence exists; instead, proteins that bind to the cap structure at the 5′ end form an appropriate connection between mRNA and ribosomes. The 3′ non-coding region of mRNA in eukaryotes has a sequence related to the degradation rate of the mRNA. The coding region between the two non-coding regions encodes the amino acid sequence of a protein.

Figure 3-12 Structure of mRNA

Column: Termination of Translation

The termination reaction occurs when the next codon of mRNA is a termination codon (Column Fig. 3-4). A releasing factor (RF) involved in this reaction enters the A site, which is positioned at the termination codon of the mRNA. The peptidyl-tRNA is at the P site, and the bond between the peptide and the tRNA is hydrolyzed by the enzymatic action of rRNA. This releases the protein, and subsequently the tRNA and mRNA, thereby terminating translation.

Column Figure 3-4 Termination reaction of translation
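The three-part layout of mRNA can be made concrete with a Python sketch that splits a sequence into its 5′ non-coding region, coding region and 3′ non-coding region. It assumes that translation starts at the first AUG and that the termination codon is counted as the last codon of the coding region (a convention chosen here); the example sequence is invented.

```python
TERMINATION_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Return (five_prime_noncoding, coding_region, three_prime_noncoding).

    The coding region runs from the first AUG through the first in-frame
    termination codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no initiation codon (AUG) found")
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in TERMINATION_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame termination codon found")


# Hypothetical mRNA: 5' region + (AUG UUU GGC UAA) + 3' region
five, coding, three = split_mrna("GGAGGU" + "AUGUUUGGCUAA" + "CCAAUAAA")
print(five)     # GGAGGU
print(coding)   # AUGUUUGGCUAA
print(three)    # CCAAUAAA
```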

Protein Synthesis

During protein synthesis, a reaction occurs repeatedly in which three bases of mRNA (a codon) pair with three bases of an aminoacyl-tRNA (an anticodon). In this reaction, amino acids are lined up by tRNA in accordance with the order of the mRNA codes (Fig. 3-13), each amino acid is detached from its tRNA, and the amino acids are linked. Through this process, amino acids are connected following the order of the mRNA codes. The series of reactions that occur on ribosomes is known as translation.

One strand of mRNA has multiple ribosomes attached that synthesize proteins concurrently, and longer strands of mRNA have more ribosomes attached to them. Clusters of ribosomes bound to mRNA are called polysomes (or polyribosomes), and cells that actively synthesize proteins have many polysomes.

The rate at which amino acids are linked is thought to be around 20 per second in prokaryotes. Assuming that the average molecular weight of an amino acid residue is 114, one minute is needed to synthesize a protein with a molecular weight of approximately 135,000. This means that most proteins, whose molecular weights are around 50,000, are generated within 30 seconds. The rate is slower in eukaryotes, at around two amino acids per second. See the Columns for more details on the rather complex processes of initiation, elongation and termination of translation.

Figure 3-13 Schematic diagram of protein synthesis
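The synthesis times quoted above follow from simple arithmetic, sketched here in Python with the rates and the average residue weight of 114 taken from the text. The function name and parameter names are illustrative.

```python
def synthesis_time_seconds(molecular_weight, residues_per_second=20,
                           mean_residue_weight=114):
    """Estimated time to synthesize one protein chain.

    number of residues = molecular weight / mean residue weight
    time               = number of residues / linking rate
    """
    residues = molecular_weight / mean_residue_weight
    return residues / residues_per_second


# Prokaryotes, about 20 amino acids per second.
print(round(synthesis_time_seconds(135_000)))   # 59 s, i.e. about one minute
print(round(synthesis_time_seconds(50_000)))    # 22 s, i.e. within 30 seconds

# Eukaryotes, about 2 amino acids per second.
print(round(synthesis_time_seconds(50_000, residues_per_second=2)))   # 219 s
```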

Coordination between Transcription and Translation

In prokaryotes, protein synthesis is initiated while mRNA is still being synthesized. On a gene, the reactions shown in Figure 3-14 occur: before one mRNA molecule is completed, the synthesis of further mRNA molecules is initiated one after another, and protein synthesis using the mRNA begins. In some cases, mRNA degradation is initiated from the 5′ end while mRNA synthesis is still occurring at the 3′ end. The half-life of mRNA in E. coli is therefore generally very short, lasting only several minutes.

Figure 3-14 Coupling of transcription and translation in prokaryotes

In eukaryotes, on the other hand, the pre-mRNA synthesized undergoes processing, and the complete mRNA is transferred from the nucleus into the cytoplasm. The mRNA is not necessarily used for protein synthesis immediately in the cytoplasm. Thus, eukaryotes differ greatly from prokaryotes in that transcription and translation are spatially and temporally separated (see Fig. 4-3 in Chapter 4). In eukaryotes, the half-life of mRNA varies from only several minutes to very long periods.

Column: The 21st Amino Acid

The amino acid known as cysteine has a sulfur atom. Another amino acid called selenocysteine, which has selenium instead of sulfur, is found in various enzymes and proteins (albeit in minute amounts), and plays a number of important roles. Selenocysteine is not produced by modifying cysteine with selenium after its incorporation into proteins. Instead, selenocysteinyl-tRNA^Sec is formed by a special converting enzyme, and is used to synthesize proteins on ribosomes. tRNA^Sec recognizes UGA on mRNA, which is normally a termination codon. A special sequence is located immediately downstream of the UGA, and selenocysteinyl-tRNA^Sec is used, through the action of special translation factors, to generate proteins that contain selenocysteine.

Summary

Protein genes are DNA regions that determine the amino acid sequences of proteins. rRNA and tRNA are categorized as non-coding RNA, which carries no protein information and functions as RNA. This RNA is also transcribed, from rRNA and tRNA genes on DNA.

The information unit of protein genes is a three-base sequence on a DNA strand, which corresponds to one amino acid.

Gene function (or gene expression) refers to the process by which RNA is synthesized based on genetic information and a protein is then synthesized using the RNA information.

In RNA synthesis, the base sequence of a gene is read using one of the two DNA strands as a template, thereby synthesizing an RNA strand with a sequence complementary to the DNA strand. RNA synthesis is known as transcription because DNA sequence information is copied to the RNA sequence.

An enzyme that synthesizes RNA is called an RNA polymerase. A DNA region to which RNA polymerase binds is called a promoter. The roles of promoters are to recruit RNA polymerase and to determine the initiation point of transcription and the DNA strand to be used as the template.

The sequence of a gene with protein information is transcribed to an mRNA sequence. mRNA types are as numerous as gene types, and both correspond to the number of protein types.

In prokaryotes, transcription and translation are coupled. mRNA in eukaryotes is first transcribed in the form of precursors called pre-mRNA, which undergo modifications in the nucleus (such as capping, poly-A addition and splicing) to become complete mRNA. This is then transferred to the cytoplasm, where it is used for protein synthesis.

A three-base set corresponding to one amino acid on an mRNA strand transcribed from DNA is called a codon. The first AUG on mRNA encodes methionine, and is also the initiation codon for protein synthesis.

Protein synthesis occurs on granules called ribosomes. A three-base set of mRNA (i.e., a codon) and the three-base anticodon of an aminoacyl-tRNA (a tRNA bound to an amino acid) form pairs on a ribosome, through which amino acids are lined up by tRNA in accordance with the order of the mRNA codes. A reaction is continuously repeated in which an amino acid is detached from its tRNA and joined to the growing chain. As a result, amino acids are linked following the order of the mRNA codes, thus forming proteins.

There are three codons (known as termination codons) that do not correspond to any amino acid. Protein synthesis stops at one of the termination codons on mRNA.

Problems

[1] Briefly explain the characteristics shared and not shared by the processes of replication and transcription.

[2] Using the terms below, briefly outline how the genetic information of genomic DNA is eventually converted to proteins: codon, messenger RNA (mRNA), amino acid, aminoacyl-tRNA (AA-tRNA), nucleus, cytoplasm, ribosome.

[3] Briefly outline the process that occurs during mRNA synthesis in eukaryotes between the transcription and completion of mRNA.

[4] In humans, one chromosomal genome set is inherited from each of the mother and father.
1) Consider a case in which a DNA sequence inherited from a parent has a mutation, resulting in illness. If a disease manifests itself as a phenotype only when mutation occurs in both copies of a gene derived from both the mother and father, is it a dominant or recessive hereditary disease?
2) For the scenario in 1), it is assumed that the mutation occurs at a site that encodes an amino acid. What kind of mutation is generally considered to occur in this case?
3) If a disease manifests itself as a phenotype when a mutation occurs in one of the two copies of a gene (derived from either the mother or the father), is it a dominant or recessive hereditary disease?
4) For the scenario in 3), it is assumed that the mutation occurs at a site that encodes an amino acid. What kind of mutation is generally considered to occur in this case?

(Answers on p.251)