Basic Biology. Gina Cannarozzi. 28th October Basic Biology. Gina. Introduction DNA. Proteins. Central Dogma.

Similar documents
1. DNA, RNA structure. 2. DNA replication. 3. Transcription, translation

Biomolecules: lecture 6

Level 2 Biology, 2017

Bioinformatics CSM17 Week 6: DNA, RNA and Proteins

Biomolecules: lecture 6

iclicker Question #28B - after lecture Shown below is a diagram of a typical eukaryotic gene which encodes a protein: start codon stop codon 2 3

Fishy Amino Acid Codon. UUU Phe UCU Ser UAU Tyr UGU Cys. UUC Phe UCC Ser UAC Tyr UGC Cys. UUA Leu UCA Ser UAA Stop UGA Stop

Protein Synthesis. Application Based Questions

ANCIENT BACTERIA? 250 million years later, scientists revive life forms

CONVERGENT EVOLUTION. Def n acquisition of some biological trait but different lineages

Folding simulation: self-organization of 4-helix bundle protein. yellow = helical turns

Degenerate Code. Translation. trna. The Code is Degenerate trna / Proofreading Ribosomes Translation Mechanism

UNIT I RNA AND TYPES R.KAVITHA,M.PHARM LECTURER DEPARTMENT OF PHARMACEUTICS SRM COLLEGE OF PHARMACY KATTANKULATUR

Honors packet Instructions

Deoxyribonucleic Acid DNA. Structure of DNA. Structure of DNA. Nucleotide. Nucleotides 5/13/2013

A Zero-Knowledge Based Introduction to Biology

Protein Synthesis: Transcription and Translation

Chemistry 121 Winter 17

Human Gene,cs 06: Gene Expression. Diversity of cell types. How do cells become different? 9/19/11. neuron

Important points from last time

The combination of a phosphate, sugar and a base forms a compound called a nucleotide.

p-adic GENETIC CODE AND ULTRAMETRIC BIOINFORMATION

(a) Which enzyme(s) make 5' - 3' phosphodiester bonds? (c) Which enzyme(s) make single-strand breaks in DNA backbones?

How life. constructs itself.

Chapter 10. The Structure and Function of DNA. Lectures by Edward J. Zalisko

Just one nucleotide! Exploring the effects of random single nucleotide mutations

UNIT (12) MOLECULES OF LIFE: NUCLEIC ACIDS

7.016 Problem Set 3. 1 st Pedigree

PRINCIPLES OF BIOINFORMATICS

PROTEIN SYNTHESIS Study Guide

Enduring Understanding

Describe the features of a gene which enable it to code for a particular protein.

Codon Bias with PRISM. 2IM24/25, Fall 2007

CISC 1115 (Science Section) Brooklyn College Professor Langsam. Assignment #6. The Genetic Code 1

Station 1: DNA Structure Use the figure above to answer each of the following questions. 1.This is the subunit that DNA is composed of. 2.

Lecture 19A. DNA computing

BIOL591: Introduction to Bioinformatics Comparative genomes to look for genes responsible for pathogenesis

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Gene Prediction

DNA Begins the Process

INTRODUCTION TO THE MOLECULAR GENETICS OF THE COLOR MUTATIONS IN ROCK POCKET MICE

DNA sentences. How are proteins coded for by DNA? Materials. Teacher instructions. Student instructions. Reflection

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 1. Bioinformatics 1: Biology, Sequences, Phylogenetics

GENETICS and the DNA code NOTES

Chapter 10. The Structure and Function of DNA. Lectures by Edward J. Zalisko

Name Date Class. The Central Dogma of Biology

Chapter 13 From Genes to Proteins

Emergence of the Canonical Genetic Code

Basic concepts of molecular biology

Molecular Level of Genetics

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Mechanisms of Genetics

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

IMAGE HIDING IN DNA SEQUENCE USING ARITHMETIC ENCODING Prof. Samir Kumar Bandyopadhyay 1* and Mr. Suman Chakraborty

Chapter 3: Information Storage and Transfer in Life

If stretched out, the DNA in chromosome 1 is roughly long.

Inheritance of Traits

Bioinformation by Biomedical Informatics Publishing Group

CHAPTER 12- RISE OF GENETICS I. DISCOVERY OF DNA A. GRIFFITH (1928) 11/15/2016

7.013 Problem Set

Unit 1. DNA and the Genome

RNA and PROTEIN SYNTHESIS. Chapter 13

PROTEIN SYNTHESIS WHAT IS IT? HOW DOES IT WORK?

Flow of Genetic Information

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below).

From Gene to Protein. How Genes Work

Why are proteins important?

Basic concepts of molecular biology

UNIT (12) MOLECULES OF LIFE: NUCLEIC ACIDS

2. Examine the objects inside the box labeled #2. What is this called? nucleotide

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks

Introduction to Molecular Biology

Daily Agenda. Warm Up: Review. Translation Notes Protein Synthesis Practice. Redos

Genes & Inheritance Series: Set 1. Copyright 2005 Version: 2.0

BIOLOGY. Gene Expression. Gene to Protein. Protein Synthesis Overview. The process in which the information coded in DNA is used to make proteins

MOLECULAR EVOLUTION AND PHYLOGENETICS

Problem Set 3

Evolution of protein coding sequences

Molecular Biology. Biology Review ONE. Protein Factory. Genotype to Phenotype. From DNA to Protein. DNA à RNA à Protein. June 2016

UNIT 4. DNA, RNA, and Gene Expression

Translating the Genetic Code. DANILO V. ROGAYAN JR. Faculty, Department of Natural Sciences

The Pieces Inside of You that Make You Who You Are. Genes and DNA

CH107 Mock Exam 2. 3) Which of the following is a purine base? Purines = adenine, guanine A) Uracil B) Adenine C) Thymine D) Cysteine E) Orotate

PGRP negatively regulates NOD-mediated cytokine production in rainbow trout liver cells

Today in Astronomy 106: polymers to life

Chapter 8: DNA and RNA

Today in Astronomy 106: the important polymers and from polymers to life

From DNA to Protein. Chapter 14

CHAPTER 31 NUCLEIC ACIDS AND HEREDITY SOLUTIONS TO REVIEW QUESTIONS. ƒ H. Guanine H O. Uracil

Genes and Proteins. Objectives

FROM MOLECULES TO LIFE

FROM MOLECULES TO LIFE

Ch 10 Molecular Biology of the Gene

DNA & Protein Synthesis. Chapter 8

NUCLEIC ACIDS AND PROTEIN SYNTHESIS

ENZYMES AND METABOLIC PATHWAYS

SUPPLEMENTARY INFORMATION

6. Which nucleotide part(s) make up the rungs of the DNA ladder? Sugar Phosphate Base

Topic 2.7 Replication, Transcription & Translation

Chapter 12-3 RNA & Protein Synthesis Notes From DNA to Protein (DNA RNA Protein)

Transcription:

Cannarozzi 28th October 2005

Class Overview RNA Protein Genomics Transcriptomics Proteomics Genome wide Genome Comparison Microarrays Orthology: Families comparison and Sequencing of Transcription factor Gene Finding binding sites Single sequence Secondary Structure and Family Experimental Prediction comparison Techniques Tools: String Alignment Phylogenetic Tree Construction Multiple Sequence Al

Outline 1. biological macromolecules - proteins and /RNA 2. interaction of proteins, and RNA is described by the of Biology - is organized into genes, the genes are transcribed to RNA and translated to proteins 3., RNA 4. proteins 5. molecular evolution - continual process of random mutation and natural selction

Vocabulary Genome - The entire set of hereditary instructions for building, running and maintaining an organism. The genome encompasses the material passed on from generation to generation. (does not vary within one organism) Genomics - study of the structure and function of the genome Transcriptome - The entire set of messenger RNA expressed while building, running and maintaining an organism. The transcriptome is all the mrna transcribed from genes within a given genome. (varies depending on physiology) Transcriptomics - The genome-wide study of mrna expression levels. Proteome - The entire set of proteins expressed while building, running and maintaining an organism. The proteome is all the protein translated from mrna of a given transcriptome. (varies with physiology) Proteomics - study of the structure and function of the proteome

Biology Biology = Science of living things What is Life? - As a rule: living organisms exchange matter and energy with their environment, grow, develop, die, react to their environment and reproduce - Exceptions: seeds in vegetative state, viruses, viroids, prions... - Viruses have or RNA (genetic material) evolve can not use the genetic material themselves, they must have a host cell to function and reproduce does not use enery themselves but the cells they infect do use energy are they alive?

Computational Biology truely interdisciplinary biology is described by general rules with many many exceptions computer science is more clearly defined probability and statistics is important to describe general rules with many exceptions input data can be flawed

Composition of Organisms All organisms are based on similar biochemical processes. Main actors in these processes: Nucleic Acids (/RNA)

Vocabulary nucleotide or nucleic acid or base one unit of comprised of a base, a sugar and and a phosphate group oligonucleotide polymer of more than 1 nucleic acid purine bases adenine and guanine pyrimidine bases thymine and cytosine transition conversion between a purine and a purine or a pyrimidine and a pyrimidine transversion conversion between a purine and a pyrimidine codon 3 bases that code for 1 amino acid reading frame one of 3 possible ways to read a nucleotide sequences in one direction frameshift type of mutation in which an insertion or deletion changes the reading frame introns part of a gene that is transcribed but not present in a mature mrna after splicing exons part of a gene that is present in the mature mrna and is translated to protein

Nucleic Acids are synthesized from information encoded in genes in. Nucleic Acids (NA) 2 types of NA: 1. Deoxiribonucleic acid () 2. Ribonucleic acid (RNA)

consists of 2 strands Each strand has 3 components: 1. Phosphate residue 2. Sugar: 2 -deoxyribose 3. 4 bases: Adenine (A), Cytosine (C), Guanine (G), Thymine (T)

Bases in pyrimidines Cytosine, Thymine purines Adenine, Guanine

RNA The same as except the sugar is ribose and the base Uracil (U) replaces Thymine (T).:

RNA II single stranded folds on to itself can take many different configurations has many different functions. 2 most important: 1. mrna: messenger RNA, information transport 2. trna: transfer RNA 3. rrna: ribosomal RNA 4. ribozymes: RNA with enzymatic functions

Repair Mechanisms

Take Home Message: /RNA /RNA - strands have a repetitive pattern: [Sugar Phosphate] chain structure Orientation: 5 3 small number of bases and double strandedness of insures secure data storage

Take Home Message: /RNA II Definition of alphabets: A := {A, C, G, T } A RNA := {A, C, G, U} sequence: s = B 1 B 2... B n = n i=1 B i where n i=1b i A RNA sequence: s RNA = B 1 B 2... B n = n i=1 B i where n i=1b i A RNA

Main component of all organisms Peform most functions in the Cell 1. Structural proteins: e.g. muscle filaments in a body 2. Enzymes: catalysts for chemical reactions. Speed up of reaction and decrease in energy requirement. 3. Signalling : e.g. cell receptors for estrogen) are built of simpler building blocks called: Amino Acids (AA)

Amino Acids Common structure of Amino Acids:

Amino Acids II In nature, 20 amino acids are common Amino acids differ in their side chains (R) Side chains can be as simple as a hydrogen atom or as complex as two carbon rings. Characteristics of side change that can influence their chemistry are: hydrophobicity, charge, acidity, basicity, and size

Amino Acids II

Alle mini hydrophiles by Cannarozzi Alle mini hydrophiles KREND, KREND FAMILYVW are hydrophobes. They dont like the sea.

"My name is Bond, Peptide Bond" Question: How do amino acids form proteins? [ ] Peptide bonds

of Biology The central dogma of biology states that the coded genetic information hard-wired into is transcribed into individual transportable cassettes, composed of messenger RNA (mrna); each mrna cassette contains the program for synthesis of a particular protein (or small number of proteins).

Transcription Exon1 Intron 1 Exon 2 copy Intron 2 Exon 3 Stop precursor RNA splicing Exon1 Exon 2 Exon 3 mrna transport to ribosome Research questions: Where are the introns? Where are the coding sequences? Where are the stop and start of translation? Where are the transcription factors that control when transcription takes place?

Translation How is a mrna sequence mapped to a protein? The genetic code Groups of 3 subsequent bases are translated into 1 amino acid Codons

The Genetic Code GGG G Gly AGG R Arg CGG R Arg UGG W Trp GGA G Gly AGA R Arg CGA R Arg UGA Stop GGC G Gly AGC S Ser CGC R Arg UGC C Cys GGU G Gly AGU S Ser CGU R Arg UGU C Cys GAG E Glu AAG K Lys CAG Q Gln UAG Stop GAA E Glu AAA K Lys CAA Q Gln UAA Stop GAC D Asp AAC N Asn CAC H His UAC Y Tyr GAU D Asp AAU N Asn CAU H His UAU Y Tyr GCG A Ala ACG T Thr CCG P Pro UCG S Ser GCA A Ala ACA T Thr CCA P Pro UCA S Ser GCC A Ala ACC T Thr CCC P Pro UCC S Ser GCU A Ala ACU T Thr CCU P Pro UCU S Ser GUG V Val AUG M Met CUG L Leu UUG L Leu GUA V Val AUA I Ile CUA L Leu UUA L Leu GUC V Val AUC I Ile CUC L Leu UUC F Phe GUU V Val AUU I Ile CUU L Leu UUU F Phe

Translation in the Ribosome

Formal representation of proteins Each amino acid represented by one letter all amino acids as alphabet: A := {A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, V } Together with chain structure and orientation proteins can be represented as strings n p = a 1 a 2... a n = where n i=1a i A where denotes the concatenation operator (Diss Knecht, 1996) i=1 a i

s From a biological point of view proteins are not just strings fold into 3D-configurations each amino acid can behave differently depending on the situation Characterization of proteins with the following structures: Primary: sequence Secondary: alpha helix or beta strand which arrange into sheets Tertiary: packing of secondary structures Quaternary: higher order packing

Primary Structure the sequence of 20 amino acids KLKDYKHAYHPVDLDIKDIDYTMFHLACITKLFDGD... linear (unbranched) Repetitive motive of: [ N C (CO) ] backbone chain structure Orientation from N to C terminal end 50-3000 amino acids

Secondary Structure- alpha Helix (H) alpha helices have 3.6 residues per turn. (100 degrees separate the side chains of consecutive residues) there is a hydrogen bond between residue i and residue i+4 average length of an alpha helix is 11 amino acids although they can be as short as 5 and as long as 25

Secondary Structure- beta strands

Tertiary Structure- the "fold" of the protein

Summary

crystal structure of Triose Phosphate Isomerase Dimer (two identical proteins together) with beta-barrel fold.

Protein Vocabulary amino acid or residue building blocks of proteins backbone the N-C-C -N-C-C bonds that form the chain of the protein side chains the part of each amino acid that gives the distinctive characteristic and carries the functionality (acidic, basic, charged + or -, hydrophobic, hydrophilic) fold refers to the 3 dimensional structure - common folds are, for example, a barrel or an alpha/beta sandwich hydrophobicity characteristic of amino acid side chains that determines much of how they behave- are they compatible with water or not? hydrophobic do not like water (FAMILYVW) hydrophilic like water (KREND) hydrogen bonding a strong type of intermolecular attraction based on partial charges. Occurs between hydrogen and F, O and N. alpha helix a helix formed in a protein (not the same as double helix of )- characterized by hydrogen bonds between residue i and i+4 beta strand an extended chain conformation in a protein in which the side chains stick out on alternating sides and hydrogen bonding is between neighboring strands in sheets. Beta strands are almost always arranged into sheets.

Over time sequences change due to UV-radiation, Toxins, Radioactivity, other factors...... as a consequence, undergoes Mutations: Change of one base to another base Deletions of bases Insertions of bases Other evolutionary forces

, Phylogeny and the Alignment Rice Mosquito NGTTD!VDKIVKILNEGQIASTDVVEVVVSPPYVFLPVVKSQL...!..!.!...!.:....!.:.~ NGDKASIADLCKVLTTGPLNAD TEVVVGCPAPYLTLARSQL Rice Corn Dog Fly Mosquito alignment implies an evolutionary relationship represented by Phylogenetic Tree alignmnet shows accepted substitutions since divergence our model allows for point mutations and insertions/deletions (indels) aligns amino acids that diverged from the same residue in (hypothetical) most recent common ancestor darwinian evolution is driven by mutation and natural selection mutations may be adaptive, neutral or deleterious proteins evolve under functional constraints - mutations that destroy function do not appear in database via organism death "correct" alignment represents actual events- substitutions, indels impossible to verify -> take alignment with the highest probability that the alignment is correct under our model

Tree of Life tree courtesy of Daniel Margadant

Vocabulary homology related by common ancestry homologs biological sequences related by common ancestry orthology homology that arises via speciation paralogy homology that arises via gene duplication eukaryote one of the 3 kingdoms of life, organisms that have a nucleus prokaryote organisms that lack a nucleus archaea one of the 3 kingdoms of life, prokaryotes that often live in extreme environments (hot, salty, cold, basic, acidic) bacteria one of the 3 kingdoms of life, prokaryotes

Web Sites Course: http://www.icos.ethz.ch/education/courses/computational_biology CBRG: http://www.cbrg.ethz.ch/ BioRecipes: http://www.biorecipes.com