On the origin of the genetic code: pattern and processes

1 On the origin of the genetic code: pattern and processes Eörs Szathmáry Collegium Budapest Eötvös University

2 The major transitions (1995) * These transitions are regarded as difficult

3 The structure of the genetic code Amino acids in the same column of the genetic code are more related to each other physicochemically

4 Central nucleotide and amino acid properties

5 Woese (1967, 1971) Amino acid polar requirement correlates well with the columnar arrangement. Direct interaction with the anticodon loop was postulated. Attention called to modified nucleosides in the anticodon loop!

6 Constraints on codon reshuffling for statistical investigations

7 Significance of some patterns The genetic code is one in a million for polarity (Freeland and Hurst)
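
The kind of reshuffling test behind this claim can be sketched as follows. This is a minimal illustration, not the published procedure: it permutes the 20 amino acids among the existing synonymous codon blocks (stop codons stay fixed) and scores each code by the unweighted mean squared change in polar requirement over all single-base substitutions. The polar requirement values are quoted from memory of the commonly tabulated Woese scale and should be checked against the original; the function names, the sample size and the absence of any weighting for position or transition bias are all assumptions made for this sketch.

```python
import random
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (third base varies fastest); '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Polar requirement per amino acid (assumed values, verify before reuse).
POLAR = {
    "A": 7.0, "R": 9.1, "N": 10.0, "D": 13.0, "C": 4.8,
    "Q": 8.6, "E": 12.5, "G": 7.9, "H": 8.4, "I": 4.9,
    "L": 4.9, "K": 10.1, "M": 5.3, "F": 5.0, "P": 6.6,
    "S": 7.5, "T": 6.6, "W": 5.2, "Y": 5.4, "V": 5.6,
}


def error_cost(code, scale=POLAR):
    """Mean squared property change over single-base changes between sense codons."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue
                total += (scale[aa] - scale[other]) ** 2
                count += 1
    return total / count


def shuffled_code(code, rng):
    """Permute amino acids among the existing codon blocks; stops stay in place."""
    amino_acids = sorted(set(code.values()) - {"*"})
    permuted = amino_acids[:]
    rng.shuffle(permuted)
    mapping = dict(zip(amino_acids, permuted))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


def fraction_better(n_samples=1000, seed=0):
    """Fraction of shuffled codes that score at least as well as the standard code."""
    rng = random.Random(seed)
    natural = error_cost(STANDARD)
    better = sum(
        error_cost(shuffled_code(STANDARD, rng)) <= natural
        for _ in range(n_samples)
    )
    return better / n_samples


if __name__ == "__main__":
    print("cost of standard code:", round(error_cost(STANDARD), 3))
    print("fraction of random codes at least as good:", fraction_better())
```

A small fraction indicates that few reshuffled codes buffer polarity changes as well as the standard code does; the constraint of keeping codon blocks intact is one of several possible choices for the null model.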

8 Amino acid biosynthesis in E. coli

9 Biosynthetic relationships

10 Stereochemistry "I am particularly struck by the difficulty of getting [the genetic code] started unless there is some basis in the specificity of interaction between nucleic acids and amino acids or polypeptide to build upon." (Woese 1967) "Nonetheless, it is clear that at some early stage in the evolution of life the direct association of amino acids with polynucleotides, which was later to evolve into the genetic code, must have begun." (Orgel 1968)

11 Biosynthesis and amino acid chemistry BOTH have shaped the code. The code within the codons (Taylor & Coates, 1989): first letter correlates with biosynthesis, second letter with chemistry. Szathmáry, E. & Zintzaras, E. (1992) A statistical test of hypotheses on the organization and origin of the genetic code. J. Mol. Evol. 35,

12 The RNA world may have preceded the RNA-protein world. Easy optimisation (with limits). Many artificial ribozymes (BUT no replicase). Coenzymes. Ribozyme doing peptidyl transfer during protein synthesis in ribosomes. Aminoacyl-tRNA synthetases are NOT the most ancient proteins. 20 residues are better than 4 in catalysis.

13 A complex metabolic coenzyme

14 One early suggestion (1990) Selection for replicated ribozymes: using affinity chromatography columns; using transition state analogues

15 An even earlier suggestion (1989) Szathmáry, E. (1989) The emergence, maintenance, and transitions of the earliest evolutionary units. Oxf. Surv. Evol. Biol. 6, Generate aptamers against different amino acids. See whether there is specific binding at all. Search for codonic or anticodonic sequence accumulation in the binding sites. Draw conclusions.
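
The search step in this proposal can be sketched as a simple count. The snippet below is only a toy illustration: the sequence, the binding-site positions and the function names are hypothetical, and a real analysis would need many independent aptamers and a proper statistical test. It counts how often a cognate codon, or its anticodon (the reverse complement), overlaps the binding-site positions compared with the rest of the molecule.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(triplet):
    """Anticodon-style reverse complement of an RNA triplet."""
    return triplet.translate(COMPLEMENT)[::-1]


def count_triplet(seq, triplet, site_positions):
    """Return (inside, outside): occurrences overlapping the site vs elsewhere."""
    inside = outside = 0
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != triplet:
            continue
        if any(p in site_positions for p in range(i, i + 3)):
            inside += 1
        else:
            outside += 1
    return inside, outside


if __name__ == "__main__":
    # Hypothetical aptamer and binding site, for illustration only.
    aptamer = "UUAGGUUUUUUAGGUU"
    site = {3, 4, 5}
    codon = "AGG"  # one of the arginine codons
    print("codon:", count_triplet(aptamer, codon, site))
    print("anticodon:", count_triplet(aptamer, reverse_complement(codon), site))
```

Accumulation would show up as an excess of inside counts relative to what the site's share of the sequence predicts.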

16 Amino acid binding RNA aptamers (Yarus, 2009)

17 Conclusions from the Yarus experiments Using recent sequences for 337 independent binding sites directed to 8 amino acids and containing 18,551 nucleotides in all, we show a highly robust connection between amino acids and cognate coding triplets within their RNA binding sites. The apparent probability (P) that cognate triplets around these sites are unrelated to binding sites is 5.3 x for codons overall, and 2.19 x for cognate anticodons. Therefore, some triplets are unequivocally localized near their present amino acids.

18 Forces may have changed in strength

19 Important ribozyme activities for the emergence of translation

20 But what could have been the initial advantage? Evolution has no foresight. Should confer some immediate advantage. Concept of exaptation (preadaptation). Coded protein enzymes as culmination of a protracted phase of evolution. Origin of the genetic code and protein synthesis are not necessarily the same thing. Evolution is opportunistic.

21 Replicability and enzyme action are in conflict. An independent catalytic alphabet is a cool idea, provided you can get to it.

22 Coding coenzyme handle (CCH) hypothesis for the origin of the genetic code (1990, 1993, ) This mechanism works only if binding between the kissing hairpins follows the unambiguous, but degenerate principle of the current genetic code

23 Piecemeal vocabulary extension. Amino acids are added and utilised one by one. No vicious error feedback as long as amino acids are not involved (at the beginning) in the functioning of synthetase ribozymes. Coding precedes translation.

24 Why indirect binding through basepairing? N: number of amino acids. M: number of metabolic enzymes. If metabolic ribozymes specifically and directly bind amino acid cofactors, then 2 * M functionalities are needed (costly). In contrast, only M specific synthetases are needed. If M >> N, then choose synthetases. Bind cofactors by their handles through base pairing (cheap).

25 Cofactor use by aptamers

26 CCH generates a prediction (missed previously): footprints of the evolution for catalytic potential should be found in codon clustering. Kun et al. (2008) In: M. Barbieri (ed.) Codes of Life. Springer, Berlin.

27 Catalytic propensity and properties

28 Amino acid catalytic propensities Joint work with Kun, Pongor and Jordán (2008)

29 Significance of some patterns

30 Highest catalytic and β-turn propensities

31 Substitution connectivity based on the BLOSUM matrix

32 A minimalist enzyme: chorismate mutase built from only 9 amino acids

33 Towards a historical reconstruction. Two startling discoveries by Ohno and the two Rodins: tRNAs with complementary codons are also complementary for the whole anticodon loop and at the second base at the 5' end. Genes for the two synthetase classes are also complementary (ancient double-strand coding).

34 Joint attempt at synthesis

35 The ancient tetrad? Green arrows indicate dual complementarity

36 The acceptor stem is presumably the most ancient part. First amino acids were charged to a hairpin including the acceptor stem (as also suggested by many previous authors). But why?

37 An interesting paper

38 Koonin's suggestion for selective retention in a leaky protocell. Amino acids are small polar molecules that easily diffuse. Accumulation of amino acids (along with other important molecules) within a compartment, obviously, would be beneficial. Small, amino-acid-binding RNAs (T) evolve under the pressure of selection for amino acid accumulation.

39 Pathway innovation and pathway retention

40 A ribozyme acylating tRNA with Phe (Saito et al. 2001)

41 A thought-provoking paper

42 Two types of linking Glu to tRNA

43 An ancient genetic code at the anticodon? In eubacteria, a paralog of glutamyl-tRNA synthetase, which lacks the tRNA-binding domain, was found to aminoacylate tRNAAsp not on the 3'-hydroxyl group of the acceptor stem but on a cyclopentene diol of the modified nucleoside queuosine present at the wobble position of the anticodon loop. This modified nucleoside might be a relic of an ancient code.

44 Nucleosides modified by amino acids in tRNA molecules (from Grosjean et al., 2004). (a) N6-threonylcarbamoyladenosine (hn6A: N6-hydroxynorvalylcarbamoyladenosine); (b) N6-glycylcarbamoyladenosine; (c) glutamylqueuosine

45 Amino acids in tRNA modifications: at positions 34 and 37 of tRNA

46 Modified queuosine

47 ALL members of the first tetrad are metabolically important. Tetrad (anticodons, central base in parentheses): Gly GCC (C), Asp GUC (U), Ala GGC (G), Val GAC (A). RNA synthesis (Gly, Asp). Coenzyme A synthesis (Val, Asp, Ala). Asp is also catalytically important.
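
The complementarity within this tetrad can be checked directly. The short sketch below reads the four triplets on the slide as anticodons (an interpretation of the figure layout) and pairs each one with the tetrad member whose anticodon is its reverse complement; the function names are chosen for this illustration.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

# Triplets as they appear on the slide, read here as anticodons (5' to 3').
TETRAD = {"Gly": "GCC", "Asp": "GUC", "Ala": "GGC", "Val": "GAC"}


def reverse_complement(triplet):
    """Complement each base, then reverse, to get the antiparallel partner."""
    return triplet.translate(COMPLEMENT)[::-1]


def complementary_partners(anticodons):
    """Map each amino acid to the one whose anticodon is its reverse complement."""
    by_sequence = {seq: aa for aa, seq in anticodons.items()}
    return {
        aa: by_sequence.get(reverse_complement(seq))
        for aa, seq in anticodons.items()
    }


if __name__ == "__main__":
    for aa, partner in complementary_partners(TETRAD).items():
        print(aa, TETRAD[aa], "<->", partner)
```

Under this reading, Gly pairs with Ala and Asp pairs with Val, so the four members form two complementary pairs.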

48 Complementary anticodons and parallel expansion into the catalytic and structural worlds. As a rule, pairs of complementary triplets encode functionally very different amino acids, most often those with a high catalytic propensity (His, Asp, Glu, Lys, Arg) contrasted with those with a low catalytic but high structural (beta-sheet-building) propensity (Val, Ile, Leu, Phe, Ala).
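
One way to explore this rule is to list, for the standard code, every codon of a high-catalytic-propensity amino acid whose reverse complement encodes one of the structural amino acids named above. This is a minimal sketch working on codons (complementary codons imply complementary anticodons); the two amino acid sets are taken from the slide, while the function names and the choice to list pairs rather than test significance are assumptions of this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (third base varies fastest); '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGU", "UGCA")
CATALYTIC = set("HDEKR")   # His, Asp, Glu, Lys, Arg
STRUCTURAL = set("VILFA")  # Val, Ile, Leu, Phe, Ala


def reverse_complement(triplet):
    """Complement each base, then reverse."""
    return triplet.translate(COMPLEMENT)[::-1]


def contrasted_pairs(code=CODE):
    """Codons of catalytic residues whose reverse complement codes a structural one."""
    pairs = []
    for codon, aa in code.items():
        partner = reverse_complement(codon)
        if aa in CATALYTIC and code[partner] in STRUCTURAL:
            pairs.append((codon, aa, partner, code[partner]))
    return pairs


if __name__ == "__main__":
    for codon, aa, partner, partner_aa in contrasted_pairs():
        print(f"{codon} ({aa}) <-> {partner} ({partner_aa})")
```

For example, GAC (Asp) pairs with GUC (Val), and AAA (Lys) pairs with UUU (Phe).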

49 Second tetrad, catalytic expansion, and the formation of the anticodon loop? Tetrad (anticodons, central base in parentheses): Arg GCG (C), His GUG (U), Ala CGC (G), Val CAC (A).

50 Protein buildup on RNA scaffolds. Shrinking RNA cores. Selection for peptidyl-transferase activity. Initially, proteins were strongly associated with RNAs and could not fold by themselves.

51 Proteins from pieces

52 Superfolds

53 Ribosomal proteins cannot fold by themselves

54 From RNA to protein. Gradual loss of scaffold. Amino acids to peptides to proteins. Not very well worked out.

55 Thanks for your attention!