ENZYMES AND METABOLIC PATHWAYS

This document is licensed under the Attribution-NonCommercial-ShareAlike 2.5 Italy license, available at http://creativecommons.org/licenses/by-nc-sa/2.5/it/

1. Enzymes build everything, but who builds the enzymes?
Enzymes allow nutrients to be digested; they convert food into energy and new raw materials; they build body structures; they govern all cellular processes. Thus, enzymes may be considered the quintessence of life. But who builds the enzymes? In other words, how is the primary structure of proteins determined? Quite simply, the specific amino acid sequence of a polypeptide is determined by the nucleotide sequence of a DNA macromolecule. The particular nucleotide sequence that encodes a particular protein is the gene for that protein. Thus, a gene is an ordered sequence of nucleotides, located at a particular position on a particular chromosome, that encodes a specific functional product. Most genes encode enzymes.

2. Decoding genes requires RNA
The DNA in genomes does not direct protein synthesis itself, but instead uses RNA as an intermediary molecule. When the cell needs a particular protein, the nucleotide sequence of the appropriate portion of the immensely long DNA molecule in a chromosome is first copied into RNA, in a process called transcription, which generates a messenger RNA (mRNA). It is these RNA copies of segments of the DNA that are used directly as templates to direct the synthesis of the protein (a process called translation). The sequence of nucleotides in the mRNA is read three nucleotides at a time. Each group of three, called a triplet or codon, stands for a specific amino acid.
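The two operations described above, copying DNA into mRNA and reading the mRNA three nucleotides at a time, can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are hypothetical, and the input is assumed to be the coding (sense) strand, so that the mRNA has the same sequence with U in place of T.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (assumption: T is simply replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def split_codons(mrna: str) -> list[str]:
    """Split an mRNA into consecutive triplets, dropping any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


dna = "ATGGCCCATTGA"  # hypothetical example sequence
mrna = transcribe(dna)
print(mrna)                # AUGGCCCAUUGA
print(split_codons(mrna))  # ['AUG', 'GCC', 'CAU', 'UGA']
```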

3. DNA → RNA → protein
The flow of genetic information in cells is therefore from DNA to RNA to protein. All cells, from bacteria to humans, express their genetic information in this way, a principle so fundamental that it is termed the central dogma of molecular biology.
Figure: The pathway from DNA to protein. The flow of genetic information from DNA to RNA (transcription) and from RNA to protein (translation) occurs in all living cells.

4. The flow of genetic information
The expression of genetic information in all cells is very largely a one-way system: DNA specifies the synthesis of RNA, and RNA specifies the synthesis of polypeptides, which subsequently form proteins. The first step, the synthesis of RNA by a DNA-dependent RNA polymerase, occurs in the nucleus of eukaryotic cells and, to a limited extent, in mitochondria and chloroplasts, the only other organelles that have a genetic capacity in addition to the nucleus. The second step, polypeptide synthesis, occurs on ribosomes, large RNA-protein complexes that are found in the cytoplasm and also in mitochondria and chloroplasts. The RNA molecules that specify polypeptides are known as messenger RNA (mRNA). The expression of genetic information follows a colinearity principle: the linear sequence of nucleotides in DNA is decoded to give a linear sequence of nucleotides in RNA, which can be decoded in turn in groups of three nucleotides (codons) to give a linear sequence of amino acids in the polypeptide product.

5. Information flux in a living cell
[Diagram. Labels: Structural proteins, Carbohydrates, Lipids, Metabolic pathways, Sugars, Fatty acids, Amino acids, Enzymes, Nucleotides, DNA]

6. The genetic code
The table of correspondence between triplets and amino acids is called the genetic code. Since there are four different nucleotides in DNA (or in RNA), there are 4 × 4 × 4 = 64 different possible codons. This means that there are more codons than amino acids. The same genetic code is used by virtually all organisms on the planet. There are some exceptions in which a few of the codons have different meanings. Thus, the information to arrange amino acids in a specific sequence with a particular function is coded in a sequence of nucleotides in DNA. The machinery that generates a protein from a DNA sequence is called the protein synthesis apparatus.
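As a rough illustration of the counting argument and of the code as a lookup table, the sketch below enumerates all 4 × 4 × 4 triplets and pairs them with the amino acids of the standard genetic code. The compact amino acid string (one-letter codes, `*` for stop, bases ordered U, C, A, G) is a common way of writing the table; the names `GENETIC_CODE` and `translate` and the example mRNA are illustrative choices, not part of the original material.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes ('*' = stop) for the standard genetic code, in the
# order obtained by iterating BASES over codon positions 1, 2 and 3.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS = ["".join(p) for p in product(BASES, repeat=3)]
GENETIC_CODE = dict(zip(CODONS, AMINO_ACIDS))


def translate(mrna):
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = GENETIC_CODE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODONS))                    # 64
print(len(set(AMINO_ACIDS) - {"*"}))  # 20
print(translate("AUGGCCCAUUGA"))      # MAH
```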

7. Codons and amino acids

Amino acid      3-letter code   1-letter code
Alanine         Ala             A
Arginine        Arg             R
Asparagine      Asn             N
Aspartic acid   Asp             D
Cysteine        Cys             C
Glutamic acid   Glu             E
Glutamine       Gln             Q
Glycine         Gly             G
Histidine       His             H
Isoleucine      Ile             I
Leucine         Leu             L
Lysine          Lys             K
Methionine      Met             M
Phenylalanine   Phe             F
Proline         Pro             P
Serine          Ser             S
Threonine       Thr             T
Tryptophan      Trp             W
Tyrosine        Tyr             Y
Valine          Val             V

8. Synonymous codons
As the number of standard amino acids is 20 and the number of codons is 64, some codons specify the same amino acid, meaning that the genetic code is redundant. This feature of the code is also called degeneracy. Only tryptophan and methionine have just a single codon each; all others are coded by two, three, four or six codons. Codons that specify the same amino acid are called synonyms. For example, CAU and CAC are synonyms for histidine. Synonymous codons are not distributed randomly throughout the genetic code. The synonyms for a given amino acid usually occupy adjacent positions in the genetic-code table, meaning that the first two bases are often sufficient to specify a given amino acid, as exemplified by GUU, GUC, GUA, and GUG (all coding for valine). Thus, most synonyms differ only in the last base of the triplet. When the last base makes a difference, it is its molecular class that matters, i.e. whether it is a purine (A or G, which are often synonyms) or a pyrimidine (C or U, which are always synonyms).
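These statements about degeneracy can be checked directly against the standard code table. The following sketch is one possible way to do so; it assumes the standard code written as a compact string (bases ordered U, C, A, G, `*` for stop), and all variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = dict(zip(("".join(p) for p in product(BASES, repeat=3)), AMINO_ACIDS))

# Group codons by the amino acid they specify (stop codons are left out).
synonyms = defaultdict(list)
for codon, amino_acid in CODE.items():
    if amino_acid != "*":
        synonyms[amino_acid].append(codon)

single_codon = sorted(aa for aa, codons in synonyms.items() if len(codons) == 1)
family_sizes = sorted({len(codons) for codons in synonyms.values()})
print(single_codon)   # ['M', 'W']  (methionine and tryptophan)
print(synonyms["H"])  # ['CAU', 'CAC']
print(family_sizes)   # [1, 2, 3, 4, 6]

# Third-base rule: for every pair of first two bases, compare the two
# pyrimidine endings (U, C) and the two purine endings (A, G).
prefixes = ["".join(p) for p in product(BASES, repeat=2)]
pyrimidines_always_synonymous = all(CODE[p + "U"] == CODE[p + "C"] for p in prefixes)
purine_exceptions = [p for p in prefixes if CODE[p + "A"] != CODE[p + "G"]]
print(pyrimidines_always_synonymous)  # True
print(purine_exceptions)              # ['UG', 'AU']
```

The two purine exceptions are UGA/UGG (stop versus tryptophan) and AUA/AUG (isoleucine versus methionine), which is why tryptophan and methionine are the only amino acids with a single codon.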

9. Number of codons and aa frequency
The number of codons for a particular amino acid correlates with its frequency of occurrence in proteins.
[Figure: scatter plot of the 20 amino acids. X axis: number of codons per each aa (0 to 7). Y axis: percent of each aa in proteins (ticks from 0.02 to 0.12).]
Correlation between the relative amino acid count in the NCBI protein database (more than 1,500,000 sequences in 2003, Y axis) and the number of codons that specify a given amino acid in the universal genetic code (X axis). The logarithmic regression line is drawn (y = 0.032 ln(x) + 0.019; r² = 0.618).
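To get a feel for the fitted line, the short sketch below evaluates it for each codon-family size that occurs in the standard code. It assumes the regression reads y = 0.032 ln(x) + 0.019, with x the number of codons and y the relative amino acid frequency; the function name is illustrative, and the values are predictions of the fit, not data from the plot.

```python
import math


def predicted_fraction(n_codons):
    """Relative frequency predicted by the logarithmic fit (assumed coefficients)."""
    return 0.032 * math.log(n_codons) + 0.019


for n in (1, 2, 3, 4, 6):
    print(n, round(predicted_fraction(n), 3))
# 1 0.019
# 2 0.041
# 3 0.054
# 4 0.063
# 6 0.076
```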

10. Overlapping and nonoverlapping codes
The example illustrates the difference between an overlapping and a nonoverlapping code for a code with three letters (a triplet code). An overlapping code uses codons that employ some of the same nucleotides as those of other codons for the translation of a single protein. In a nonoverlapping code, a protein is translated by reading codons that do not share any of the same nucleotides.
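The difference can be made concrete by reading the same sequence in both ways. In this sketch the overlapping code is assumed to be fully overlapping (each codon starts one nucleotide after the previous one), which is only one of the possible overlapping schemes; the function names and the sequence are illustrative.

```python
def nonoverlapping_codons(seq):
    """Read triplets that share no nucleotides (window moves three bases at a time)."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def overlapping_codons(seq):
    """Read triplets that share two nucleotides with their neighbour (window moves one base)."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


seq = "AUGGCCCAU"  # hypothetical 9-nucleotide sequence
print(nonoverlapping_codons(seq))  # ['AUG', 'GCC', 'CAU']
print(overlapping_codons(seq))     # ['AUG', 'UGG', 'GGC', 'GCC', 'CCC', 'CCA', 'CAU']
```

Nine nucleotides yield three codons when read without overlap and seven when read with full overlap, which is the sense in which an overlapping code packs more information into the same space.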

11. The code has no gaps or overlaps
At the time the genetic code was being deciphered, two possibilities had to be considered. First, the code could be either overlapping or nonoverlapping. An overlapping code would have the advantage that more information could be contained in a smaller space. In an overlapping code, however, a mutation that changed one base would change three consecutive amino acids in the protein sequence. Genetic evidence, available even before the code had been deciphered, indicated that a single point mutation, that is, a change in a single nucleotide, affected only one amino acid, and thus suggested a nonoverlapping code. Second, it was possible that the code had gaps, that is, some sort of punctuation mark or "spacer" nucleotide or nucleotides between coding groups. In this situation, if an additional base were inserted into a codon, only that codon would be affected. In a code without punctuation or gaps, insertion of a single nucleotide would affect all codons from that point on, which would in turn change the amino acid sequence of the protein from that point on. Again, genetic evidence ruled out a punctuated code: base insertions do, in fact, affect the entire protein from the insertion point on, rather than just a single amino acid.
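The reasoning about mutations can be replayed on a toy sequence by counting which codons change. The sketch below is a simplified model, assuming a fully overlapping code that advances one base per codon and comparing codons rather than amino acids; the sequence, the mutated position and the helper names are all hypothetical.

```python
def codons(seq, step):
    """Read triplets from seq; step=3 is nonoverlapping, step=1 is fully overlapping."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]


def changed_positions(before, after):
    """Indices of codons that differ between two readings, compared pairwise."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


original = "AUGGCCCAUAAGGUU"
substituted = original[:4] + "A" + original[5:]  # point mutation: C -> A at index 4
inserted = original[:4] + "A" + original[4:]     # insertion of one A before index 4

# Point mutation, nonoverlapping code: a single codon changes.
print(changed_positions(codons(original, 3), codons(substituted, 3)))  # [1]
# Point mutation, overlapping code: three consecutive codons change.
print(changed_positions(codons(original, 1), codons(substituted, 1)))  # [2, 3, 4]
# Insertion, unpunctuated nonoverlapping code: every codon downstream changes.
print(changed_positions(codons(original, 3), codons(inserted, 3)))     # [1, 2, 3, 4]
```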

12. Overlapping reading frames
Although the use of an overlapping code was ruled out by the analysis of single proteins, nothing precluded the use of alternative reading frames to encode amino acids in two different proteins.
Figure: Example of how the genetic code, a non-overlapping and commaless triplet code, can be read in two different frames. If translation of the mRNA sequence shown begins at two different upstream start sites (not shown), then two overlapping reading frames are possible; in this case, the codons are shifted one base to the right in the lower frame. As a result, different amino acids are encoded by the same nucleotide sequence.
Although it is a rare occurrence, instances of such overlaps have been discovered in viral and cellular genes of prokaryotes and eukaryotes.
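A small sketch of the same idea: one mRNA fragment read in two frames, the second shifted one base to the right. The fragment is an illustrative sequence (not necessarily the one in the original figure), and the lookup table contains only the standard-code assignments for the codons that occur in it.

```python
# Standard-code assignments for the codons that occur in the example below only.
NAMES = {
    "GCU": "Ala", "UGU": "Cys", "UUA": "Leu", "CGA": "Arg", "AUU": "Ile",
    "CUU": "Leu", "GUU": "Val", "UAC": "Tyr", "GAA": "Glu",
}


def read_frame(mrna, offset):
    """Return the complete codons of mrna read from the given 0-based offset."""
    return [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]


mrna = "GCUUGUUUACGAAUUA"  # hypothetical internal fragment of an mRNA
for offset in (0, 1):
    frame = read_frame(mrna, offset)
    print(offset, frame, "-".join(NAMES[c] for c in frame))
# 0 ['GCU', 'UGU', 'UUA', 'CGA', 'AUU'] Ala-Cys-Leu-Arg-Ile
# 1 ['CUU', 'GUU', 'UAC', 'GAA', 'UUA'] Leu-Val-Tyr-Glu-Leu
```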

13. Human mitochondrial ATPase subunits 6 and 8
The genes for human mitochondrial ATPase subunits 6 and 8 partially overlap and are translated in different reading frames from the same DNA sequence. ATPase subunit 8 starts at nucleotide position 8366 and ends at 8569 (204 nucleotides = 68 aa); ATPase subunit 6 starts at 8527 and ends at 9204 (678 nucleotides = 226 aa).
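The lengths, the size of the overlap and the frame shift follow from the four positions alone. The sketch below works them out, assuming that the quoted positions are inclusive and exclude the stop codons; the variable and function names are illustrative.

```python
def span_length(start, end):
    """Number of nucleotides between two inclusive positions."""
    return end - start + 1


atp8_start, atp8_end = 8366, 8569
atp6_start, atp6_end = 8527, 9204

atp8_nt = span_length(atp8_start, atp8_end)
atp6_nt = span_length(atp6_start, atp6_end)
print(atp8_nt, atp8_nt // 3)  # 204 68
print(atp6_nt, atp6_nt // 3)  # 678 226

# Shared region: from the start of subunit 6 to the end of subunit 8.
overlap_nt = span_length(atp6_start, atp8_end)
print(overlap_nt)  # 43

# A nonzero remainder means the two genes are read in different frames.
frame_shift = (atp6_start - atp8_start) % 3
print(frame_shift)  # 2
```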

14. Start and stop codons
The starting point of a coding sequence is indicated by a chain initiation codon (start codon), whose position defines the frame in which a protein sequence is translated. The most common start codon (in the mRNA) is AUG, which codes for methionine, so most amino acid chains start with methionine. Three codons (UAG, UGA, and UAA) signal the termination of the protein and are called stop codons. In large-scale sequence data, the presence of start and stop codons is used as a means to identify sequences that may correspond to genes. Nucleotide sequences are fed into a computer, which then scans all possible reading frames in search of a gene-sized stretch of DNA beginning with an ATG initiation codon and ending with a stop codon. These stretches are called open reading frames (ORFs). Any open reading frames of at least 100 codons are candidates for genes.
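A minimal sketch of such a scan is shown below. It is a simplification rather than a real gene finder: it looks only for ATG followed in frame by the first stop codon, it scans the three frames of each strand, and it reports coordinates on the scanned strand. The function names, the tuple layout and the demo sequence are illustrative; the default threshold of 100 codons is taken from the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna, min_codons=100):
    """Return (strand, frame, start, end) for each ORF of at least min_codons codons.

    start and end are 0-based offsets on the scanned strand; end is exclusive and
    includes the stop codon. The codon count runs from ATG up to, but not
    including, the stop codon.
    """
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG in this frame opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ATG after this stop
    return orfs


demo = "CCATGGCCCATTGACC"  # hypothetical toy sequence
print(find_orfs(demo, min_codons=3))  # [('+', 2, 2, 14)]
print(find_orfs(demo))                # []  (nothing reaches 100 codons)
```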

15. The near universality of the genetic code
It was originally thought that the genetic code was the same in all organisms; in reality, the genetic code is not strictly universal. The standard code holds for the vast majority of genes in the vast majority of organisms, but deviations are widespread. In particular, mitochondrial genomes often use non-standard codes.
Figure: The human nuclear and mitochondrial genetic codes. Four codons (blue boxes) are interpreted differently, with the mitochondrial interpretation given in blue.
Thus, the mitochondrial code has four stop codons instead of three (UAA, UAG, AGA, AGG), two Trp codons instead of one (UGA, UGG), four Arg codons instead of six (CGA, CGC, CGG, CGU), two Met codons instead of one (AUA, AUG) and two Ile codons instead of three (AUC, AUU).
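The codon counts quoted above can be reproduced by applying the four reassignments to the standard code table. The sketch below does this, assuming the standard code written as a compact string (bases ordered U, C, A, G, `*` for stop); the names `STANDARD`, `MITO` and `codon_counts` are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = dict(zip(("".join(p) for p in product(BASES, repeat=3)), STANDARD_AAS))

# The four codons that are read differently in human mitochondria.
MITO = {**STANDARD, "UGA": "W", "AGA": "*", "AGG": "*", "AUA": "M"}


def codon_counts(code):
    """Count how many codons map to each amino acid (or to stop, '*')."""
    return Counter(code.values())


standard_counts = codon_counts(STANDARD)
mito_counts = codon_counts(MITO)
for symbol in "*WRMI":
    print(symbol, standard_counts[symbol], mito_counts[symbol])
# * 3 4
# W 1 2
# R 6 4
# M 1 2
# I 3 2
```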

16. Other non-standard codes
Non-standard codes are also known in the nuclear genomes of lower eukaryotes. Often a modification is restricted to just a small group of organisms, and frequently it involves reassignment of the termination codons. Modifications are less common among prokaryotes, but examples are known in Mycoplasma and Micrococcus species.

Deviations from the standard genetic code in nuclear and prokaryotic genomes

Organism                         Codons     Standard code   Actual code
Several protozoa                 UAA, UAG   Stop            Gln
Euplotes sp. (ciliates)          UGA        Stop            Cys
Acetabularia sp. (green algae)   UAA, UAG   Stop            Gln
Candida sp. (fungi)              CUG        Leu             Ser
Micrococcus sp. (bacteria)       AGA        Arg             Stop
                                 AUA        Ile             Stop
Mycoplasma sp. (bacteria)        UGA        Stop            Trp
                                 CGG        Arg             Stop

17. Theories on the origin of the genetic code
Despite the variations that exist, the genetic codes used by all known forms of life on Earth are very similar. Therefore, one can ask the question: is the genetic code completely random, just one of many possible sets of codon-amino acid correspondences that happened to establish itself and became "frozen in" early in evolution? There are three main themes running through the many theories that seek to explain the evolution of the genetic code:
   1. It has been shown that some amino acids have a selective chemical affinity for the base triplets that code for them. This suggests that the current, complex translation mechanism may be a later development, and that originally, protein sequences were directly templated on base sequences.
   2. The standard genetic code that we see today may have grown from a simpler, earlier code through a process of "biosynthetic expansion". Here the idea is that primordial life 'discovered' new useful amino acids and incorporated some of these into the machinery of genetic coding. Some circumstantial evidence has been found to suggest that fewer different amino acids were used in the past than today.
   3. The standard genetic code may have been mainly selected through a process that minimizes the effects of mutations.

18. Properties of the genetic code
In summary, the genetic code is redundant, unambiguous, nonoverlapping, without punctuation, and nearly universal.
   - The genetic code is redundant. There are 64 possible codons but only 20 amino acids. Not all amino acids have the same number of codons. For example, tryptophan has one codon while arginine has six.
   - The code is read in nonoverlapping groups of three nucleotides. Each group is called a codon. There are no spaces or commas separating neighboring codons.
   - There is a start codon, which corresponds to the amino acid methionine. When translation begins, the first amino acid is always methionine. After translation this amino acid is often removed as part of editing the protein. Once translation has started, methionine can also occur within the protein.
   - There are three noncoding stop (or nonsense) codons. These tell the machinery of translation that the end of the protein has been reached.
   - The code is almost universal. Certain bacteria, mitochondria and protists have minor variations in their codes.