NEW OUTCOMES IN MUTATION RATES ANALYSIS

Similar documents
Basic concepts of molecular biology

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012

Basic concepts of molecular biology

8/21/2014. From Gene to Protein

ENZYMES AND METABOLIC PATHWAYS

Life History Traits and Genome Structure: Aerobiosis and G+C Content in Bacteria

Ch 10 Molecular Biology of the Gene

Chapter 12: Molecular Biology of the Gene

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks

Lecture for Wednesday. Dr. Prince BIOL 1408

Chapter 8. Microbial Genetics. Lectures prepared by Christine L. Case. Copyright 2010 Pearson Education, Inc.

The neutral theory of molecular evolution

Problem Set Unit The base ratios in the DNA and RNA for an onion (Allium cepa) are given below.

BIOLOGY - CLUTCH CH.17 - GENE EXPRESSION.

Algorithms in Bioinformatics ONE Transcription Translation

Molecular Biology. Biology Review ONE. Protein Factory. Genotype to Phenotype. From DNA to Protein. DNA à RNA à Protein. June 2016

Class XII Chapter 6 Molecular Basis of Inheritance Biology

1. DNA, RNA structure. 2. DNA replication. 3. Transcription, translation

(Very) Basic Molecular Biology

Molecular Basis of Inheritance

Genetics. Chapter 9 - Microbial Genetics. Chromosome. Genes. Topics - Genetics - Flow of Genetics - Regulation - Mutation - Recombination

The Nature of Genes. The Nature of Genes. The Nature of Genes. The Nature of Genes. The Nature of Genes. The Genetic Code. Genes and How They Work

Ch Molecular Biology of the Gene

7.014 Quiz II Handout

7.014 Solution Set 4

CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS

Biological Sciences 50 Practice Exam 2

d. reading a DNA strand and making a complementary messenger RNA

4/3/2013. DNA Synthesis Replication of Bacterial DNA Replication of Bacterial DNA

7.014 Problem Set 4 Answers to this problem set are to be turned in. Problem sets will not be accepted late. Solutions will be posted on the web.

BIO 101 : The genetic code and the central dogma

Human Genomics. Higher Human Biology

Genes and How They Work. Chapter 15

G+C content. 1 Introduction. 2 Chromosomes Topology & Counts. 3 Genome size. 4 Replichores and gene orientation. 5 Chirochores.

BIOL 1030 Introduction to Biology: Organismal Biology. Fall 2009 Sections B & D. Steve Thompson:

Chapter 17. From Gene to Protein. Slide 1. Slide 2. Slide 3. Gene Expression. Which of the following is the best example of gene expression? Why?

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Station 1: DNA Structure Use the figure above to answer each of the following questions. 1.This is the subunit that DNA is composed of. 2.

DNA Evolution of knowledge about gene. Contains information about RNAs and proteins. Polynucleotide chains; Double stranded molecule;

BIO303, Genetics Study Guide II for Spring 2007 Semester

The Nature of Genes. The Nature of Genes. Genes and How They Work. Chapter 15/16

Nucleic acids. How DNA works. DNA RNA Protein. DNA (deoxyribonucleic acid) RNA (ribonucleic acid) Central Dogma of Molecular Biology

Chapter 12 Reading Questions

You are genetically unique

STUDY GUIDE SECTION 10-1 Discovery of DNA

1/4/18 NUCLEIC ACIDS. Nucleic Acids. Nucleic Acids. ECS129 Instructor: Patrice Koehl

NUCLEIC ACIDS. ECS129 Instructor: Patrice Koehl

DNA and Biotechnology Form of DNA Form of DNA Form of DNA Form of DNA Replication of DNA Replication of DNA

Lesson 8. DNA: The Molecule of Heredity. Gene Expression and Regulation. Introduction to Life Processes - SCI 102 1

Chapter 13 - Concept Mapping

The study of the structure, function, and interaction of cellular proteins is called. A) bioinformatics B) haplotypics C) genomics D) proteomics

Resources. How to Use This Presentation. Chapter 10. Objectives. Table of Contents. Griffith s Discovery of Transformation. Griffith s Experiments

Applied Practice. Inheritance, Genetic Mutations, and DNA Technology STAAR Biology EOC

Lac Operon contains three structural genes and is controlled by the lac repressor: (1) LacY protein transports lactose into the cell.

DNA: Structure and Function

Genome manipulation by homologous recombination in Drosophila Xiaolin Bi and Yikang S. Rong Date received (in revised form): 9th May 2003

Big Idea 3C Basic Review

Name Date Class. The Central Dogma of Biology

DNA.notebook March 08, DNA Overview

DNA. translation. base pairing rules for DNA Replication. thymine. cytosine. amino acids. The building blocks of proteins are?

Ch 12.DNA and RNA.Biology.Landis

Griffith and Transformation (pages ) 1. What hypothesis did Griffith form from the results of his experiments?

Computational Genomics. Ron Shamir & Roded Sharan Fall

From DNA to Protein: Genotype to Phenotype

Problem Set 8. Answer Key

Bundle 5 Test Review

Unit 6 (Part 1) DNA, RNA, & Protein Synthesis

Alpha-helices, beta-sheets and U-turns within a protein are stabilized by (hint: two words).

Gene function at the level of traits Gene function at the molecular level

I. Gene Expression Figure 1: Central Dogma of Molecular Biology

MBioS 503: Section 1 Chromosome, Gene, Translation, & Transcription. Gene Organization. Genome. Objectives: Gene Organization

Chapter 13. From DNA to Protein

Mutation. ! Mutation occurs when a DNA gene is damaged or changed in such a way as to alter the genetic message carried by that gene

DNA REPLICATION. DNA structure. Semiconservative replication. DNA structure. Origin of replication. Replication bubbles and forks.

MUTANT: A mutant is a strain that has suffered a mutation and exhibits a different phenotype from the parental strain.

DNA & Protein Synthesis UNIT D & E

From Genes to Protein

From DNA to Protein: Genotype to Phenotype

From Genes to Protein

Molecular Genetics Quiz #1 SBI4U K T/I A C TOTAL

Chapter 9: DNA: The Molecule of Heredity

What is molecular evolution? BIOL2007 Molecular Evolution. Modes of molecular evolution. Modes of molecular evolution

Regulation of enzyme synthesis

Chapter 8: DNA and RNA

Self-test Quiz for Chapter 12 (From DNA to Protein: Genotype to Phenotype)

3. What was Griffith trying to figure out?

Chapter Twelve Protein Synthesis: Translation of the Genetic Message

Tala Saleh. Tamer Barakat ... Anas Abu. Humaidan

Chapter 9. Topics - Genetics - Flow of Genetics - Regulation - Mutation - Recombination

MCB 102 University of California, Berkeley August 11 13, Problem Set 8

Enzyme that uses RNA as a template to synthesize a complementary DNA

Higher Human Biology Unit 1: Human Cells Pupils Learning Outcomes

5. Which of the following enzymes catalyze the attachment of an amino acid to trna in the formation of aminoacyl trna?

DNA: STRUCTURE AND REPLICATION

Bacterial Genetics. Stijn van der Veen

What does DNA stand for?

Microbial Genetics. Chapter 8

Fig Ch 17: From Gene to Protein

Chapter 14 Active Reading Guide From Gene to Protein

Storage and Expression of Genetic Information

Transcription:

NEW OUTCOMES IN MUTATION RATES ANALYSIS Valentina Agoni valentina.agoni@unipv.it The GC-content is very variable in different genome regions and species but although many hypothesis we still do not know the reason why. Here we show that a relationship exists with the mutation rate, in particular we noticed a new recurrence in the amino acids coding table. Moreover we analyze recombination frequency taking into account Single Nucleotide Polymorphisms hourglass distribution. 1. Error correction and the GC-content The reason of the high variability in the GC-content between species remains unclear. As mentioned by [1], many hypothesis have been postulated during the years by many groups, such as UV resistance [2][3], thermal adaptation [4][5], directional mutation pressure [6], metabolism[7], the length of coding sequences [8][9], nitrogen fixation [10], aerobiosis [11], environment pressure [12], genome size [13] and DNA polymerase III [14]. What we know for sure is that chromosomal bands reveal different compositions in G+C and that the gene density in the GC-richest fraction of the human genome is known to be higher than in the poorer GC regions [15]. Amino acids coding table evolved to minimize mutations therefore this codons organization suggests C T and A G mutations to be more frequent. In his storic paper Codon anticodon pairing: The wobble hypothesis [16], Crick suggests that, due to the degeneracy of the genetic code, wobble base pairs are fundamental for the proper translation of the genetic code. So that while the first two positions of the triplet are critical, there may be some wobble in the pairing of the third base. In other words, the third base is considered to be less discriminatory than the other two bases. However, if we look at the amino acids coding table (Figure 1) and we focus on amino acids encoded by two different triplets (Phe, Tyr, His, Gln, Asn, Lys, Asp, Glu, Cys, Arg), we can notice that the triplets can be divided into 2 categories: codons which third nucleotide can be U or C (pink circles), and codons which third nucleotide is A or G (purple circles). This cannot be due to casuality. Here we hypothesis error correction to explain the dilemma. The determinant factors could be the number of bonds (each couple includes a nucleotide that forms 3 bonds and a nucleotide that forms 2) or the nucleotide formula (purines or pirimidines). Moreover, we hypothesize polymerase mutations of a 3-bonds-forming-nucleotide to a 2-bondsforming-nucleotide from now we will call this type of 3 2 to be not only the more conserved but the more probable too. The neutral theory of molecular evolution, introduced by Kimura between 60s and 70s [17], states that the vast majority of conserved mutations are neutral. Sueoka quantitative theory of directional mutation pressure [18] indicates that the major cause for a change in DNA G + C content of an organism is the large excess of GC AT mutations over AT GC mutations.

Figure 1. The amino acids coding table. We hypothesize 3 2 mutations to be more probable also because coding regions (which are less tolerant for casual not synonymous mutations) have an higher GC-content. We can hypothesis this to be not limited to the 3 rd nucleotide. It will be probably good to consider this phenomenon for new drugs resistance. Bacteria subjected to a high evolutionary pressure, for example related to antibiotics resistance, should have a lower GC-content. Imagine to study a new drug active against bacteria, knowing if the mutation responsible for resistance belongs to the more probable class it can makes a difference. Another important application can be the cancer predisposition probability. 2. Role of chromosomal recombination in new species generation The frequency of Single Nucleotide Polymorphisms (SNPs) along chromosomes follows the typical hourglass distribution [19] (Figure 2). This typical distribution suggests being centromeres a physical constrain to crossing over the more variable genes in the different species to be at the end of chromosome arms, then the SNPs number to be proportional to the length of the arm. As a consequence this can be the main difference between species: the variability of genes more than the protein characteristics. Moreover we know that euchromatic regions undergo crossing over with an high probability [20]. It is known that CENP-A, a centromere protein, is able to identify centromeres by itself (with or without epigenetic modifications of centromeric DNA). We hypothesize that the process of creating a new specie starts from a non-homologous recombination that leads to centromere repositioning. In this case the discriminant thing between species could be not only the sequence but also the number of SNPs in a particular gene.

Figure 2. SNPs typical hourglass distribution along chromosomes [19]. The minimum length of the annealing region required for crossing over is 25bp while the optimum is about 400bp. In fact, the recombination rates increase three to four orders of magnitude as homology rises from 25 to 411bp [21]. Here we use the combinatorial calculus dispositions with repetitions formula: D n, k n k where D represents the possible sequences of k objects extracted from a set of n objects, to calculate the number of different possible combinations of the 4 nucleotides 400bp long. In our case k=4= TCGA and n=400 therefore we obtain 400 4 =2.5*10 10 combinations. This number is smaller respect to human genome dimension (3*10 9 bp). This means that every combination appears once (on average) in the human genome or, in other words there are 10 7 different sequences of 400bp. We know that in principle even a single mutation in the minimum annealing region of 25bp is sufficient to block recombination. On the other hand there is a high probability to find another region to match with (10 7 ). Although SNPs distribution is not linear along the chromosome (consider euchromatic regions and hourglass distribution ) we can make an appoximation taking into account a mutation rate of 10-6 (as the polymerase error rate is) [22].

This permits to calculate the probability that a mutation occurs in a 25bp sequence: 25/10 6 =2.5*10-5. The reciprocal number represents the dimension (number of items along a certain time interval) of an isolated population before it becomes a new species. We can say species are the discrete elements of the continuous evolutive process. In conclusion we can say that both SNPs and chromosomal recombination play a role in evolution. Evolution is the result of two opposite forces: error correction and mutations. SNPs and chromosomal recombination act in two very different ways. [1] H. Wu, Z. Zhang, S. Hu and J. Yu, On the molecular mechanism of GC content variation among eubacterial genomes, Biology Direct, Vol.7, No. 1, 2012, pp. 2. [2] G. F. Gause, Y. V. Dudnik, A. V. Laiko, E. M. Netyksa, Induction of mutants with altered DNA composition: effect of ultraviolet on Bacterium paracoli, 5099. Science, Vol.157, 1967, pp. 1196-1197. [3] C. E. Singer, B. N. Ames, Sunlight ultraviolet and bacterial DNA base ratios, Science, Vol.170, 1970, pp. 822-825. [4] Y. Kagawa, H. Nojima, N. Nukiwa, M. Ishizuka, T. Nakajima, T. Yasuhara, T. Tanaka, T. Oshima, High guanine plus cytosine content in the third letter of codons of an extreme thermophile. DNA sequence of the isopropylmalate dehydrogenase of Thermus thermophilus, J Biol Chem, Vol.259, 1984, pp. 2956-2960. [5] H. Musto, H. Naya, A. Zavala, H. Romero, F. Alvarez-Valin, G. Bernardi, Correlations between genomic GC levels and optimal growth temperatures in prokaryotes, FEBS Lett, Vol. 573, 2004, pp.:73-77. [6] N. Sueoka, Directional mutation pressure and neutral molecular evolution, Proc Natl Acad Sci USA, Vol.85, 1988, pp. 2653-2657. [7] A. P. Martin, Metabolic rate and directional nucleotide substitution in animal mitochondrial DNA, Mol Biol Evol, Vol.12, 1995, pp. 1124-1131. [8] J. L. Oliver, A. Marin, A relationship between GC content and coding sequence length, J Mol Evol, Vol.43,1996, pp.216-223. [9] X. Xia, Z. Xie, W. H. Li, Effects of GC content and mutational pressure on the lengths of exons and coding sequences, J Mol Evol,Vol.56, 2003, pp.362-370.

[10] C. E. McEwan, D. Gatherer, N. R. McEwan, Nitrogen-fixing aerobic bacteria have higher genomic GC content than non-fixing species within the same genus, Hereditas,Vol.128, 1998, pp.173-178. [11] H. Naya, H. Romero, A. Zavala, B. Alvarez, H. Musto, Aerobiosis increases the genomic guanine plus cytosine content (GC%) in prokaryotes, J Mol Evol, Vol.55, 2002, pp.260-264. [12] K. U. Foerstner, C. von Mering, S. D. Hooper, P. Bork, Environments shape the nucleotide composition of genomes, EMBO Rep,Vol.6, 2005, pp.1208-1213. [13] H. Musto, H. Naya, A. Zavala, H. Romero, F. Alvarez-Valin, G. Bernardi, Genomic GC level, optimal growth temperature, and genome size in prokaryotes, Biochem Biophys Res Commun,Vol.347, 2006, pp.1-3. [14] X. Zhao, Z. Zhang, J. Yan, J. Yu, GC content variability of eubacteria is governed by the pol III alpha subunit, Biochem Biophys Res Commun, Vol.356, 2007, pp.20-25. [15] G. Bernardi, The isocore organization of the human genome, Annu Rev Genet, Vol.23, 1989, pp.637-661. [16] F.H.C. Crick, Codon anticodon pairing: The wobble hypothesis, Journal of Molecular Biology, Vol.19, No 2, 1966, pp.548 555. [17] M. Kimura, "Evolutionary rate at the molecular level", Nature Vol.217, 1968, pp.624-626. [18] N. Sueoka On the genetic basis of variation and heterogeneity of DNA base composition, Proc Natl Acad Sci USA Vol.48, 1962, pp. 582 592. [19] L. D. Sun, F. L. Xiao, Y. Li, et al., Genome-wide association study identifies two new susceptibility loci for atopic dermatitis in the Chinese Han population, Nature Genetics, Vol.43, 2011, pp.690 694. [20] A. B. Barton, M. R. Pekosz, R. S. Kurvathi, and D. B. Kaback, Meiotic Recombination at the Ends of Chromosomes in Saccharomyces cerevisiae, Genetics Society of America, 2008. [21] S. T. Lovett, R. L. Hurley, V. A.Sutera, R. H. Aubuchon, and M. A. Lebedeva, "Crossing over between regions of limited homology in Escherichia coli. RecA-dependent and RecA-independent pathways." Genetics, Vol. 160, No. 3, 2002, pp. 851-9. [22]http://www.costphytoplasma.eu/PDF%20files/Bioinformatics%20school/Foissac%202%20Taq %20errors-r.pdf [23] R. H. Bort,s and J. E. Haber, Length and Distribution of Meiotic Gene Conversion Tracts and Crossovers in Saccharomyces cerevisiae, Genetics Society of America, 1989.