Basic concepts of molecular biology

1 Basic concepts of molecular biology Gabriella Trucco

2 Life: The main actors in the chemistry of life are molecules called proteins and nucleic acids. Proteins come in many different kinds, including structural proteins and enzymes. There are two kinds of nucleic acids: ribonucleic acid, abbreviated RNA, and deoxyribonucleic acid, or DNA.

3 Proteins: A protein is a chain of simpler molecules called amino acids. Each amino acid has one central carbon atom bonded to a hydrogen atom, an amino group (NH2), a carboxy group (COOH), and a side chain.

4 Proteins: There are 20 common amino acids (aa's); two systems of abbreviations are used: a 3-letter code and a 1-letter code. We usually use the 1-letter code.
Amino acid      3-letter  1-letter
alanine         Ala       A
arginine        Arg       R
asparagine      Asn       N
aspartic acid   Asp       D
cysteine        Cys       C
glutamine       Gln       Q
glutamic acid   Glu       E
glycine         Gly       G
histidine       His       H
isoleucine      Ile       I
leucine         Leu       L
lysine          Lys       K
methionine      Met       M
phenylalanine   Phe       F
proline         Pro       P
serine          Ser       S
threonine       Thr       T
tryptophan      Trp       W
tyrosine        Tyr       Y
valine          Val       V

5 Proteins: Amino acids are joined by peptide bonds, forming the backbone of the chain. Protein structure is described at four levels: primary, secondary, tertiary, and quaternary.

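6 DNA: nucleotides. DNA is a double chain; each single chain is called a strand. The basic unit (nucleotide) consists of a sugar, a phosphate group, and a nitrogenous base. There are 4 bases, A, C, G, and T: adenine, cytosine, guanine, and thymine. (The terms "bases" and "nucleotides" are often used interchangeably.)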

7 DNA: nucleotides
5'...AACAGTACCATGCTAGGTCAATCGA...3'
3'...TTGTCATGGTACGATCCAGTTAGCT...5'
Orientation: each strand is read from its 5' end to its 3' end. Length is measured in bp (base pairs). DNA is double stranded, and the two strands are antiparallel. A-T and C-G are complementary (Watson-Crick pairs). DNA can be viewed as a string of letters, each letter representing a base. "String view" of DNA: one of the strings on top of the other. Reverse complementation: (ACCTG)^rc = CAGGT.
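
The pairing and reverse-complement rules on this slide can be sketched in a few lines of Python. This is only an illustration: the function names `complement` and `reverse_complement` are chosen here and do not come from the slides, and the input is assumed to contain only the uppercase letters A, C, G, T.

```python
# Watson-Crick pairs from the slide: A-T and C-G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Base-by-base complement, kept in the same left-to-right order."""
    return strand.translate(COMPLEMENT)

def reverse_complement(strand: str) -> str:
    """The opposite strand, written in its own 5' to 3' direction."""
    return complement(strand)[::-1]

top = "AACAGTACCATGCTAGGTCAATCGA"
print(complement(top))              # TTGTCATGGTACGATCCAGTTAGCT (bottom strand as drawn, 3' to 5')
print(reverse_complement("ACCTG"))  # CAGGT
```

Because the two strands are antiparallel, the complement alone gives the bottom strand as it is drawn under the top one, while reversing it gives that same strand in the conventional 5' to 3' reading direction.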

8 RNA: nucleotides. Like DNA, except that the 4 characters are A, C, U, and G: adenine, cytosine, uracil, and guanine (U instead of T).

9 Genes: Certain contiguous stretches along DNA encode information for building proteins, but others do not. A stretch that encodes a protein is known as a gene. A protein is a chain of amino acids, and each amino acid is specified by a triplet of nucleotides. Each nucleotide triplet is called a codon. Genetic code: the table that gives the correspondence between each possible triplet and an amino acid.

10 The genetic code. Degeneracy of the genetic code: there are 64 codons but only 20 aa's plus the stop signal, so most amino acids are encoded by more than one codon. Silent mutations: if the third position of a codon mutates, this often does not alter the aa.
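
To make the degeneracy concrete, here is a small Python sketch that builds the standard genetic code as a dictionary and counts codons per amino acid. The table is written in DNA letters (T rather than U) and uses `*` for stop; the compact string encoding and the helper `is_silent` are conveniences chosen for this example, not something taken from the slides.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in the order TTT, TTC, TTA, TTG, TCT, ...
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

codons_per_aa = Counter(CODON_TABLE.values())
print(len(CODON_TABLE))                                           # 64
print(codons_per_aa["L"], codons_per_aa["M"], codons_per_aa["*"])  # 6 1 3

def is_silent(codon: str, new_base: str, position: int = 2) -> bool:
    """True if replacing the base at `position` leaves the amino acid unchanged."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]

print(is_silent("GGT", "A"))  # True: GGT and GGA both code for glycine
print(is_silent("ATG", "A"))  # False: ATG is methionine, ATA is isoleucine
```

The second `is_silent` call shows why the slide says "often" rather than "always": a third-position change can still alter the amino acid.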

11 The central dogma of molecular biology: how the information in the DNA results in proteins. (Figure labels: promoter, AUG.) Transcription: a copy of the gene is made on an RNA molecule (messenger RNA, or mRNA). The resulting RNA has exactly the same sequence as one of the strands of the gene, but with U substituted for T. The strand identical to the mRNA is called the coding strand. The other strand (the one used as the template for transcription) is called the template strand.
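
The relation between coding strand, template strand, and mRNA can be sketched as follows. The function names and the toy gene `ATGGCCTAA` are invented for illustration, and both strands are assumed to be written 5' to 3' using only A, C, G, T.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def transcribe_coding(coding_strand: str) -> str:
    """The mRNA has the coding strand's sequence with U in place of T."""
    return coding_strand.replace("T", "U")

def transcribe_template(template_strand: str) -> str:
    """Build the mRNA from the template strand (given 5' to 3')."""
    # The coding strand is the reverse complement of the template strand.
    coding_strand = template_strand.translate(COMPLEMENT)[::-1]
    return transcribe_coding(coding_strand)

coding = "ATGGCCTAA"                            # hypothetical toy gene
template = coding.translate(COMPLEMENT)[::-1]   # TTAGGCCAT
print(transcribe_coding(coding))                # AUGGCCUAA
print(transcribe_template(template))            # AUGGCCUAA
```

Both calls give the same mRNA, which is the point of the slide: the transcript is read off the template strand but spells out the coding strand.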

12 The central dogma of molecular biology. Protein synthesis takes place inside cellular structures called ribosomes. tRNA: transfer RNA. Translation: tRNAs make the connection between a codon and the specific amino acid this codon codes for. Each tRNA molecule has, on one side, a conformation that has high affinity for a specific codon and, on the other side, a conformation that binds easily to the corresponding amino acid. As the messenger RNA passes through the ribosome, a tRNA matching the current codon binds to it, bringing the corresponding amino acid. When a STOP codon appears, no tRNA associates with it, and the synthesis ends.
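
As a rough model of this process, translation can be written as a loop that reads the mRNA one codon at a time, looks up the amino acid, and stops at the first STOP codon. The sketch below uses the standard genetic code in RNA letters; the function name `translate` is chosen here, and for simplicity it starts reading at the first base rather than searching for a start codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ...; '*' marks stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna: str) -> str:
    """Translate an mRNA string into 1-letter amino acids, ending at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):   # step through complete codons only
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":              # no tRNA matches a stop codon: synthesis ends
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGGCCUGGUAA"))  # MAW
```

The dictionary lookup plays the role of the tRNA here: it is the only place where a codon is connected to its amino acid.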

13 Strings in molecular biology: Many other problems in molecular biology can be modelled by strings (e.g. gene order, haplotypes, ...). Strings (also called sequences) are finite sequences over an alphabet Σ. DNA (characters: nucleotides): Σ = {A,C,G,T}. RNA (characters: nucleotides): Σ = {A,C,G,U}. Proteins (characters: amino acids): Σ = {A,C,D,E,F,...,W,Y}.
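
In code, "a string over an alphabet" is simply a check that every character belongs to the alphabet's set. A minimal Python sketch follows; the names `ALPHABETS` and `is_string_over` are chosen for this example, and the protein alphabet is spelled out as the 20 one-letter amino acid codes.

```python
ALPHABETS = {
    "DNA": set("ACGT"),
    "RNA": set("ACGU"),
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),  # the 20 one-letter amino acid codes
}

def is_string_over(sequence: str, alphabet_name: str) -> bool:
    """True if every character of `sequence` belongs to the named alphabet."""
    return set(sequence) <= ALPHABETS[alphabet_name]

print(is_string_over("AACAGT", "DNA"))  # True
print(is_string_over("AUG", "DNA"))     # False: U is not a DNA character
print(is_string_over("AUG", "RNA"))     # True
```

Note that the alphabets overlap: a string such as ACGT is valid over both the DNA and the protein alphabet, so the alphabet cannot always be inferred from the string alone.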

14 Reading frames. Reading frame: one of the three possible ways of grouping bases to form codons in a DNA or RNA sequence. There are 3 different reading frames for translation: the DNA sequence 5'...TATTCGAATCGGC...3' can be translated in 3 different ways, leading to different aa sequences.
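
The three groupings can be listed with a short Python sketch. The helper `codons_in_frame` is a name chosen here; it only splits the sequence into complete codons and drops any leftover bases at the end, leaving the lookup of each codon in the genetic code to the reader.

```python
def codons_in_frame(sequence: str, frame: int) -> list[str]:
    """Group bases into complete codons, starting at offset `frame` (0, 1 or 2)."""
    return [sequence[i:i + 3] for i in range(frame, len(sequence) - 2, 3)]

seq = "TATTCGAATCGGC"
for frame in range(3):
    print(frame + 1, " ".join(codons_in_frame(seq, frame)))
# 1 TAT TCG AAT CGG
# 2 ATT CGA ATC GGC
# 3 TTC GAA TCG
```

Shifting the starting point by a single base changes every codon, which is why the three frames lead to different amino acid sequences.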

15 Exercise: Translate this DNA sequence according to the 3 different reading frames: 5'...TATTCGAATCGGC...3'