PRINCIPLES OF BIOINFORMATICS

1 PRINCIPLES OF BIOINFORMATICS. BIO540/STA569/CSI660, Fall 2010. Lecture 3 (Sep ): Primer on Molecular Biology/Genomics. Igor Kuznetsov, Department of Epidemiology & Biostatistics, Cancer Research Center, University at Albany.

2 Outline: DNA, RNA, and proteins; Replication; Transcription; Post-transcriptional processing; Genetic code, Translation. Reading: Zvelebil & Baum, Chapter 1.

3 Prokaryotic Cell

4 Eukaryotic Cell

5 Major biological macromolecules. DNA (DeoxyriboNucleic Acid): storage of genetic information. RNA (RiboNucleic Acid): transfer of genetic information. Proteins: molecular machines and structural elements that perform most functions in the cell. Lipids: the main role is to form membranes.

6 DNA (DeoxyriboNucleic Acid). DNA stores the genetic information of the cell. All the information required to make and maintain a living organism is encoded in its DNA. DNA is a linear polymer that consists of two strands bound to each other, forming what is called the DNA double helix. [Figure: DNA double helix]

7 DNA (DeoxyriboNucleic Acid). Each DNA strand is made of just four different building blocks called nucleotides. Each nucleotide is made up of a base, a sugar, and a phosphate. The bases are Adenine (A), Cytosine (C), Guanine (G), and Thymine (T). The two DNA strands are complementary and interact by base-pairing: A pairs with T, and G pairs with C. [Figure: base pair]
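The complementarity of the two strands can be illustrated with a minimal Python sketch. The function names are illustrative, not from the lecture, and the sketch assumes the standard A-T and G-C pairing with the two strands running in opposite directions, so the partner strand is read as the reverse complement.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand):
    """Return the partner strand, read in its own 5'->3' direction."""
    return complement(strand)[::-1]


print(complement("ATGCCC"))          # TACGGG
print(reverse_complement("ATGCCC"))  # GGGCAT
```

Applying `reverse_complement` twice returns the original strand, which is one way to see that each strand fully determines the other.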

8 DNA replication: converting one DNA molecule into two exact copies.

9 RNA (RiboNucleic Acid). RNA is a linear polymer similar to DNA, but RNA is usually single-stranded. RNA is made of four different nucleotide units: Adenine (A), Cytosine (C), Guanine (G), and Uracil (U). In RNA, A pairs with U and G pairs with C. RNA molecules are usually much shorter and less stable than DNA. The main role of RNA is to transfer genetic information from DNA. Transcription: the process of synthesizing a single-stranded RNA from a DNA template. [Figure: DNA as the physical template for RNA]
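As a rough sketch of what transcription does to the sequence, the Python below derives an RNA string from DNA in two equivalent ways. The function names are illustrative assumptions. The first follows the convention of the lecture's later DNA/mRNA example, where the mRNA is the coding strand with T replaced by U; the second builds the RNA by pairing against the template strand.

```python
def transcribe(coding_strand):
    """RNA equivalent of the coding (sense) DNA strand: T becomes U."""
    return coding_strand.upper().replace("T", "U")


# DNA template base -> RNA base that pairs with it.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_template(template_strand):
    """Build RNA by pairing against the template strand (given 5'->3').

    The RNA is antiparallel to the template, so the paired bases are
    reversed to report the RNA in its own 5'->3' direction.
    """
    paired = "".join(TEMPLATE_TO_RNA[b] for b in template_strand.upper())
    return paired[::-1]


print(transcribe("atgcccaag"))           # AUGCCCAAG
print(transcribe_template("CTTGGGCAT"))  # AUGCCCAAG
```

Here `CTTGGGCAT` is the reverse complement of `ATGCCCAAG`, so both calls describe the same double-stranded DNA and yield the same RNA.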

10 DNA sequence:
CCACCAGATATAATTAAGTAGATCAGAGTAGAAGAAGATGGGAACAAATGAATGGCATGTAGAAAGAAGA
GATAGCATAGGTACTGAATCTCCTGTAGCAAGAGAGGTACTTGAAACTGGCACACTCTCTATTGTTGTGC
TTGGTGCTTCTGGTGATCTTGCCAAGAAGAAGACTTTTCCTGCACTTTTTCACTTATATAAACAGGAATT
GTTGCCACCTGATGAAGTTCACATTTTTGGCTATGCAAGGTCAAAGATCTCCGATGATGAATTGAGAAAC
AAATTGCGTAGCTATCTTGTTCCAGAGAAAGGTGCTTCTCCTAAACAGTTAGATGATGTATCAAAGTTTT
TACAATTGGTTAAATATGTAAGTGGCCCTTATGATTCTGAAGATGGATTTCGCTTGTTGGATAAAGAGAT
TTCAGAGCATGAATATTTGAAAAATAGTAAAGAGGGTTCATCTCGGAGGCTTTTCTATCTTGCACTTCCT
RNA sequence:
AACAUAGAGACAUAGACAGAUAUAGCAUGCAUGGCGAUGCGCGCAUGGCGGUUUCGGCCGAUACGCUAAUAGC

11 Proteins are linear hetero-polymers composed of units called amino acids.

12 Amino acids are the building blocks of proteins. There are 20 amino acids. Amino acids can be grouped by their chemical properties. Amino acids are not planar!

13 Protein synthesis

14 Levels of protein structure

15 The Central Dogma of Molecular Biology

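16 The genetic code: instructions on how to convert an RNA sequence into a protein sequence. RNA bases are translated in non-overlapping sets of three bases called codons. For example, aug ccc aag is translated as M P K. There are three possible ways to translate any RNA sequence, depending on which base is chosen as the start. These are the three reading frames (labeled ORF 1 to ORF 3 in these slides); strictly speaking, an Open Reading Frame (ORF) is a stretch within one reading frame that is not interrupted by a stop codon.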
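The idea of non-overlapping codons and three start positions can be sketched in a few lines of Python. The function name `codons` and the 0-based `frame` parameter are illustrative choices, not part of the lecture.

```python
def codons(rna, frame=0):
    """Split an RNA string into non-overlapping triplets.

    frame is the 0-based offset of the first base (0, 1, or 2);
    trailing bases that do not fill a whole codon are dropped.
    """
    rna = rna.lower()
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


for frame in range(3):
    print(frame, codons("augcccaag", frame))
# 0 ['aug', 'ccc', 'aag']
# 1 ['ugc', 'cca']
# 2 ['gcc', 'caa']
```

Shifting the start by one base changes every codon downstream, which is why the choice of frame determines the protein sequence.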

17 The standard genetic code. The AUG codon both codes for the amino acid methionine (M) and serves as the translation initiation site.

18 DNA:
atgcccaagctgaatagcgtagaggggttttcatcatttgaggacgatgtataa
messenger RNA (mRNA):
augcccaagcugaauagcguagagggguuuucaucauuugaggacgauguauaa
Possible protein sequences:
ORF 1
aug ccc aag cug aau agc gua gag ggg uuu uca uca uuu gag gac gau gua uaa
M   P   K   L   N   S   V   E   G   F   S   S   F   E   D   D   V   *
ORF 2
ugc cca agc uga aua gcg uag agg ggu uuu cau cau uug agg acg aug uau
C   P   S   *   I   A   *   R   G   F   H   H   L   R   T   M   Y
ORF 3
gcc caa gcu gaa uag cgu aga ggg guu uuc auc auu uga gga cga ugu aua
A   Q   A   E   *   R   R   G   V   F   I   I   *   G   R   C   I
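The three translations on this slide can be reproduced with a short Python sketch. The codon table is built from the standard genetic code written in the conventional U, C, A, G order; the names `CODON_TABLE` and `translate` are illustrative, and `*` marks a stop codon as on the slide.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in U, C, A, G order
# (first base varies slowest, third base fastest). '*' denotes a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna, frame=0):
    """Translate an RNA (or DNA) string in one reading frame (0, 1, or 2).

    Stop codons are reported as '*' and translation continues past them,
    matching the layout of the slide.
    """
    rna = rna.upper().replace("T", "U")
    return "".join(
        CODON_TABLE[rna[i:i + 3]] for i in range(frame, len(rna) - 2, 3)
    )


mrna = "augcccaagcugaauagcguagagggguuuucaucauuugaggacgauguauaa"
for frame in range(3):
    print(f"ORF {frame + 1}: {translate(mrna, frame)}")
# ORF 1: MPKLNSVEGFSSFEDDV*
# ORF 2: CPS*IA*RGFHHLRTMY
# ORF 3: AQAE*RRGVFII*GRCI
```

Only the first frame reads from the AUG start codon to a stop codon without interruption; the other two frames hit stop codons early.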

19 DNA encodes three types of RNA (mRNA, rRNA, and tRNA); each plays a critical role in protein translation.

20 Transcription is performed by the RNA polymerase complex.

21 Post-transcriptional mRNA processing (splicing) in eukaryotes: exons are retained, introns are removed. Intron: non-coding region. Exon: coding region. This kind of mRNA processing generally does not occur in bacteria. [Figure: transcript with introns (I) and exons (E); labels: Nucleus, Cytoplasm]
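At the sequence level, splicing amounts to joining the exons in order and discarding the introns. The Python sketch below illustrates this on a made-up toy transcript; the sequence, the exon coordinates, and the function name `splice` are all hypothetical and chosen only for illustration.

```python
def splice(pre_mrna, exons):
    """Join exon segments of a pre-mRNA in order.

    exons is a list of (start, end) pairs, 0-based and half-open,
    so each exon is pre_mrna[start:end]. Everything else is intron.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy transcript (not a real gene): exons in uppercase, introns in lowercase.
pre_mrna = "AUGCCC" + "guaaguuuucag" + "AAGCUG" + "guaugacuuag" + "AAUUAA"
exons = [(0, 6), (18, 24), (35, 41)]

print(splice(pre_mrna, exons))  # AUGCCCAAGCUGAAUUAA
```

In a real analysis the exon coordinates would come from a genome annotation rather than being typed in by hand.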

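22 Transfer RNAs (tRNAs) read codons in mRNA and bring the appropriate amino acids. The slide shows 61 different tRNAs, one for each amino-acid-coding codon (in real cells, wobble base-pairing lets fewer distinct tRNAs read all 61 codons). [Figure: a different tRNA for each codon; charged tRNA]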
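The figure of 61 follows from simple counting: there are 4 x 4 x 4 = 64 possible codons, and 3 of them are stop codons in the standard genetic code. A small Python sketch, with illustrative names, makes the count explicit and also shows how a tRNA anticodon relates to its codon by antiparallel base-pairing.

```python
from itertools import product

STOP_CODONS = {"UAA", "UAG", "UGA"}  # stop codons of the standard code

all_codons = ["".join(c) for c in product("UCAG", repeat=3)]
sense_codons = [c for c in all_codons if c not in STOP_CODONS]

print(len(all_codons), len(sense_codons))  # 64 61

RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Anticodon that pairs perfectly with a codon, written 5'->3'."""
    return "".join(RNA_PAIRS[b] for b in reversed(codon.upper()))


print(anticodon("AUG"))  # CAU
```

The `anticodon` function assumes strict Watson-Crick pairing at all three positions, which is a simplification.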

23 Translation:
1. The ribosome associates with the mRNA; protein synthesis begins at the AUG start codon (which encodes methionine).
2. tRNA-Met brings methionine to the P site.
3. A second tRNA brings the next amino acid (AA) to the A site, which is positioned at the next codon of the mRNA.
4. A peptide bond forms between the AAs.
5. The ribosome moves one codon along the mRNA, so the tRNA that was in the A site is now in the P site. At the same time, the original tRNA-Met is released and a new tRNA moves into the A site with the third AA.
6. Translation continues until the ribosome reaches a stop codon and is released.
[Figure: ribosome with P site and A site]
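The start and stop behavior in the steps above can be mimicked in a simplified Python sketch. It assumes, as a deliberate simplification, that translation begins at the first AUG found in the sequence and ends at the first in-frame stop codon; the function name `translate_from_start` and the example sequence are illustrative.

```python
from itertools import product

# Standard genetic code with bases in U, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}


def translate_from_start(mrna):
    """Translate from the first AUG to the first in-frame stop codon.

    Returns the peptide without the stop marker, or an empty string
    if the sequence contains no AUG.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")          # step 1: locate the start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # step 5: move one codon at a time
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":                  # step 6: stop codon, release
            break
        peptide.append(amino_acid)             # steps 3-4: add the next AA
    return "".join(peptide)


print(translate_from_start("ggaugcccaaguaagg"))  # MPK
```

Bases before the AUG and after the stop codon are ignored, which loosely corresponds to untranslated regions of an mRNA.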

24 Chromosomes. In the cell, each individual DNA molecule is arranged into a compact structure called a chromosome. The set of all chromosomes of a cell makes up its genome. The human genome has approximately 3 billion base pairs of DNA per haploid set of 23 chromosomes; most human cells are diploid and carry 46 chromosomes.

25 Cell Nucleus

26 Genes. The genetic information in a genome is held within genes. A gene is a unit of heredity: a region of DNA that encodes a functional RNA (mRNA, tRNA, rRNA). All living organisms depend on genes, as they specify all functional RNAs and proteins.

27 Most genes code for mRNAs, that is, proteins:
SELLVNTKSGKVMGTRVPVLSSHISAFLGIPFAEPPVGNMRFRRPEPKKPWSGVWNASTYPNNCQQYVDE
QFPGFSGSEMWNPNREMSEDCLYLNIWVPSPRPKSTTVMVWIYGGGFYSGSSTLDVYNGKYLAYTEEVVL
VSLSYRVGAFGFLALHGSQEAPGNVGLLDQRMALQWVHDNIQFFGGDPKTVTIFGESAGGASVGMHILSP
GSRDLFRRAILQSGSPNCPWASVSVAEGRRRAVELGRNLNCNLNSDEELIHCLREKKPQELIDVEWNVLP
FDSIFRFSFVPVIDGEFFPTSLESMLNSGNFKKTQILLGVNKDEGSFFLLYGAPGFSKDSESKISREDFM
SGVKLSVPHANDLGLDAVTLQYTDWMDDNNGIKNRDGLDDIVGDHNVICPLMHFVNKYTKFGNGTYLYFF
NHRASNLVWPEWMGVIHGYEIEFVFGLPLVKELNYTAEEEALSRRIMHYWATFAKTGNPNESKWPLFTTK
EQKFIDLNTEPMKVHQRLRVQMCVFWNQFLPKLLN
Others code for RNA only (such as tRNA and rRNA):
AUACACAGAAACAAUAUAACAGAGACAGAUAUACUACUAUCUACUACUAUCAUACCCAUAUAACUCUAUCU

28 Biological pathways. A biological pathway is a series of actions among molecules in a cell that leads to a certain product or a change in a cell. Such a pathway can trigger the assembly of new molecules, such as a protein or some other molecule. Pathways can also turn genes on and off, or spur a cell to move. [Figure labels: cell membrane, G-protein complex]

29 Additional video material available at html