Primer on Genome Biology Part I: Fundamentals

Similar documents
Introduction to genome biology Sandrine Dudoit and Robert Gentleman

Introduction to Genome Biology

Introduction to genome biology. Outline. From chromosomes to proteins. A brief history

Introduction to Genome Biology

Nucleic acids deoxyribonucleic acid (DNA) ribonucleic acid (RNA) nucleotide

NUCLEIC ACID METABOLISM. Omidiwura, B.R.O

Protein Synthesis

From Gene to Protein

Chapter 2. An Introduction to Genes and Genomes

DNA - DEOXYRIBONUCLEIC ACID

DNA is the genetic material. DNA structure. Chapter 7: DNA Replication, Transcription & Translation; Mutations & Ames test

Nucleic acids and protein synthesis

Molecular Genetics. The flow of genetic information from DNA. DNA Replication. Two kinds of nucleic acids in cells: DNA and RNA.

Unit 5 DNA, RNA, and Protein Synthesis

Chapter 12: Molecular Biology of the Gene

DNA. Essential Question: How does the structure of the DNA molecule allow it to carry information?

I. Gene Expression Figure 1: Central Dogma of Molecular Biology

DNA and RNA Structure. Unit 7 Lesson 1

PROTEIN SYNTHESIS Flow of Genetic Information The flow of genetic information can be symbolized as: DNA RNA Protein

CELL BIOLOGY: DNA. Generalized nucleotide structure: NUCLEOTIDES: Each nucleotide monomer is made up of three linked molecules:

Biology 30 DNA Review: Importance of Meiosis nucleus chromosomes Genes DNA

Ch 10 Molecular Biology of the Gene

DNA REPLICATION. DNA structure. Semiconservative replication. DNA structure. Origin of replication. Replication bubbles and forks.

BIOLOGY 111. CHAPTER 6: DNA: The Molecule of Life

Biology Ch. 17- Molecular Genetics 17.1 Isolating the Genetic Material

Chapter 10 - Molecular Biology of the Gene

Unit IX Problem 3 Genetics: Basic Concepts in Molecular Biology

Bundle 5 Test Review

DNA Structure DNA Nucleotide 3 Parts: 1. Phosphate Group 2. Sugar 3. Nitrogen Base

Chapter 12-3 RNA & Protein Synthesis Notes From DNA to Protein (DNA RNA Protein)

Purines vs. Pyrimidines

Name Date Class. The Central Dogma of Biology

DNA and RNA Structure Guided Notes

Hello! Outline. Cell Biology: RNA and Protein synthesis. In all living cells, DNA molecules are the storehouses of information. 6.

Unit II Problem 3 Genetics: Summary of Basic Concepts in Molecular Biology

DNA Replication and Protein Synthesis

Unit 6 Molecular Genetics

Biology. Biology. Slide 1 of 39. End Show. Copyright Pearson Prentice Hall

Biology. Biology. Slide 1 of 39. End Show. Copyright Pearson Prentice Hall

CH 4 - DNA. DNA = deoxyribonucleic acid. DNA is the hereditary substance that is found in the nucleus of cells

Nucleic Acid Structure:

Nucleic Acid Structure:

Lesson 8. DNA: The Molecule of Heredity. Gene Expression and Regulation. Introduction to Life Processes - SCI 102 1

REVISION: DNA, RNA & MEIOSIS 13 MARCH 2013

NUCLEIC ACIDS AND PROTEIN SYNTHESIS ǁ Author: DONGO SHEMA F. ǁ JULY 2016 ǁ

A. Incorrect! This feature does help with it suitability as genetic material.

Gene Expression: Transcription, Translation, RNAs and the Genetic Code

DNA Structure and Replication, and Virus Structure and Replication Test Review

Independent Study Guide The Blueprint of Life, from DNA to Protein (Chapter 7)

(deoxyribonucleic acid)

Nucleic Acids. OpenStax College. 1 DNA and RNA

DNA. translation. base pairing rules for DNA Replication. thymine. cytosine. amino acids. The building blocks of proteins are?

Replication Transcription Translation

Chapter 12 Molecular Genetics

RNA, & PROTEIN SYNTHESIS. 7 th Grade, Week 4, Day 1 Monday, July 15, 2013

Replication Review. 1. What is DNA Replication? 2. Where does DNA Replication take place in eukaryotic cells?

Lecture Overview. Overview of the Genetic Information. Marieb s Human Anatomy and Physiology. Chapter 3 DNA & RNA Protein Synthesis Lecture 6

Human Anatomy & Physiology I Dr. Sullivan Unit IV Cellular Function Chapter 4, Chapter 27 (meiosis only)

what are proteins? what are the building blocks of proteins? what type of bond is in proteins? Molecular Biology Proteins - review Amino Acids

From DNA to Protein: Genotype to Phenotype

NUCLEIC ACIDS AND PROTEIN SYNTHESIS

Unit 1: DNA and the Genome. Sub-Topic (1.3) Gene Expression

Do you think DNA is important? T.V shows Movies Biotech Films News Cloning Genetic Engineering

1. I can describe the stages of the cell cycle.

Chapter 13 - Concept Mapping

DNA and Biotechnology Form of DNA Form of DNA Form of DNA Form of DNA Replication of DNA Replication of DNA

Name 10 Molecular Biology of the Gene Test Date Study Guide You must know: The structure of DNA. The major steps to replication.

To truly understand genetics, biologists first had to discover the chemical nature of genes

Resources. How to Use This Presentation. Chapter 10. Objectives. Table of Contents. Griffith s Discovery of Transformation. Griffith s Experiments

Lecture Overview. Overview of the Genetic Information. Chapter 3 DNA & RNA Lecture 6

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012

DNA & RNA. Chapter Twelve and Thirteen Biology One

Videos. Lesson Overview. Fermentation

How do we know what the structure and function of DNA is? - Double helix, base pairs, sugar, and phosphate - Stores genetic information

Do you remember. What is a gene? What is RNA? How does it differ from DNA? What is protein?

Key Area 1.3: Gene Expression

Semester 2: Unit 1: Molecular Genetics

Adv Biology: DNA and RNA Study Guide

3.1.5 Nucleic Acids Structure of DNA and RNA

Honors Biology Reading Guide Chapter 10 v Fredrick Griffith Ø When he killed bacteria and then mixed the bacteria remains with living harmless

II. DNA Deoxyribonucleic Acid Located in the nucleus of the cell Codes for your genes Frank Griffith- discovered DNA in 1928

The Structure of Proteins The Structure of Proteins. How Proteins are Made: Genetic Transcription, Translation, and Regulation

translation The building blocks of proteins are? amino acids nitrogen containing bases like A, G, T, C, and U Complementary base pairing links

DNA Function: Information Transmission

Chromosomes. Chromosomes. Genes. Strands of DNA that contain all of the genes an organism needs to survive and reproduce

Principle 2. Overview of Central. 3. Nucleic Acid Structure 4. The Organization of

The Structure of RNA. The Central Dogma

Lecture 8. Chromosome. The Nuclei. Two Types of Nucleic Acids. Genes. Information Contained Within Each Cell

3.A.1 DNA and RNA: Structure and Replication

DNA RNA PROTEIN SYNTHESIS -NOTES-

DNA and RNA. Chapter 12

DNA, RNA, Replication and Transcription

How can something so small cause problems so large?

The Structure and Func.on of Macromolecules Nucleic Acids

Chapter 8: DNA and RNA

C. Incorrect! Threonine is an amino acid, not a nucleotide base.

Microbiology: The Blueprint of Life, from DNA to protein

DNA, Proteins and Protein Synthesis

NUCLEIC ACIDS Genetic material of all known organisms DNA: deoxyribonucleic acid RNA: ribonucleic acid (e.g., some viruses)

From Gene to Protein. Lesson 3

Transcription:

Primer on Genome Biology Part I: Fundamentals PB HLTH C240F/STAT C245F Division of Biostatistics and Department of Statistics University of California, Berkeley www.stat.berkeley.edu/~sandrine Copyright 2017, all rights reserved

Outline Cells, Chromosomes, and Cell Division DNA Structure and Replication Proteins Central Dogma: Transcription and Translation Pathways 2

A Brief History Gregor Mendel (1823-1884) Thomas Hunt Morgan (1866-1945) Francis Crick (1916-2004) James D. Watson (1928- ) 3

Cells, Chromosomes, and Cell Division 4

From Chromosomes to Proteins www.ornl.gov/hgmis/graphics/slides/images1.html 5

Cells 6

Cells Cells. The fundamental working units of every living organism. Metazoa. Multicellular organisms. E.g. Humans: trillions of cells. Protozoa. Unicellular organisms. E.g. Yeast, bacteria. 7

Cells Each cell contains a complete copy of an organism s genome, i.e., the blueprint for all cellular structures and activities. Cells are of many different types (e.g., blood, nerve, skin cells), but all can be traced back to a single cell, the fertilized egg. 8

Cell Composition 90% water. Molecular composition excluding water (dry weight): 50% protein, 15% carbohydrate, 15% nucleic acid, 10% lipid, 10% miscellaneous. Elemental composition: 60% H, 25% O, 10%C, 5%N. 9 Spring web.mit.edu/esgbio/www/cb/cellbasics.html 2017

The Genome The genome is distributed along chromosomes, which are made of compressed and entwined DNA. A (protein-coding) gene is a segment of chromosomal DNA that directs the synthesis of a protein. 10

Eukaryotes vs. Prokaryotes 11

Eukaryotes vs. Prokaryotes Prokaryotic cells. Lack a distinct, membrane-bound nucleus. E.g. Bacteria. Eukaryotic cells. Distinct, membranebound nucleus. Larger and more complex in structure than prokaryotic cells. E.g. Mammals, yeast. 12

The Eukaryotic Cell www.accessexcellence.com/ab/gg/ 13

The Eukaryotic Cell Nucleus. Membrane-enclosed structure which contains chromosomes, i.e., DNA molecules carrying genes essential to cellular function. Cytoplasm. Material between the nuclear and cellular membranes; includes fluid (cytosol), organelles, and various membranes. Organelle. Membrane-enclosed structure found in the cytoplasm. 14

The Eukaryotic Cell Ribosome. Small particle composed of RNAs and proteins that functions in protein synthesis. Mitochondrion. Organelle found in most eukaryotic cells in which respiration and energy generation occurs. Mitochondrial DNA. Codes for ribosomal RNAs and transfer RNAs used in the mitochondrion; contains only 13 recognizable genes that code for polypeptides. Vesicle. Small cavity or sac, especially one filled with fluid. 15

The Eukaryotic Cell Centrioles. Either of a pair of cylindrical bodies composed of microtubules (spindles); determine cell polarity and used during mitosis and meiosis. Endoplasmic reticulum. Network of membranous vesicles to which ribosomes are often attached. Golgi apparatus. Network of vesicles functioning in the manufacture of proteins. Cilia. Very small hair-like projections found on certain types of cells; can be used for movement. 16

The Human Genome The human genome is distributed along 23 pairs of chromosomes 22 autosomal pairs; the sex chromosome pair, XX for females and XY for males. In each pair, one chromosome is paternally inherited, the other maternally inherited (cf. meiosis). 17

Chromosomes www.accessexcellence.com/ab/gg/ 18

Chromosome Banding Patterns www.people.virginia.edu/~rjh9u/ideo.html 19

Of Mice and Men 20

Cell Divisions Mitosis. Cellular division which produces two diploid daughter cells, identical to the parent cell. How each cell can be traced back to a single fertilized egg. Meiosis. Two successive cellular divisions which produce four haploid daughter cells, different from the original parent cell. Leads to the formation of gametes (egg/sperm). 21

Mitosis www.accessexcellence.com/ab/gg/ 22

Meiosis www.accessexcellence.com/ab/gg/ 23

Meiosis vs. Mitosis 24

Dividing Cell Image of cell at metaphase from fluorescent-light microscope. www.ornl.gov/hgmis/graphics/slides/images1.html 25

Recombination www.accessexcellence.com/ab/gg/ 26

Recombination 27

Recombination 28

Chromosomes and DNA www.accessexcellence.com/ab/gg/ 29

DNA Structure and Replication 30

DNA Structure We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest. J.D. Watson and F. H. C. Crick (1953). Molecular structure of Nucleic Acids. Nature, 171: 737-738. 31

DNA Structure A deoxyribonucleic acid or DNA molecule is a double-stranded polymer composed of four basic molecular units called nucleotides. Each nucleotide comprises a phosphate group; a deoxyribose sugar; one of four nitrogen bases: purines: adenine (A) and guanine (G), pyrimidines: cytosine (C) and thymine (T). 32

DNA Structure Base-pairing occurs according to the following rule: C pairs with G, A pairs with T. The two chains are held together by hydrogen bonds between nitrogen bases. 33

DNA Structure academy.d20.co.edu/kadets/lundberg/dnapic.html 34

DNA Structure www.accessexcellence.com/ab/gg/ 35

DNA Structure Four nucleotide bases: purines: A, G; pyrimidine: T, C. A pairs with T, 2 H bonds; C pairs with G, 3 H bonds. www.ornl.gov/hgmis/graphics/slides/images1.html 36

Nucleotides Purines Adenine (A) Pyrimidines Guanine (G) Thymine (T) Cytosine (C) Uracil (U) (RNA) 37

Nucleotide Base-Pairing G C pair 3 H bonds A T pair 2 H bonds www.accessexcellence.com/ab/gg/ 38

DNA Structure Polynucleotide chains are directional molecules, with slightly different structures marking the two ends of the chains, the socalled 3' end and 5' end. The 3' and 5' notation refers to the numbering of carbon atoms in the sugar ring. The 3' end carries a sugar group and the 5' end carries a phosphate group. The two complementary strands of DNA are antiparallel, i.e., 5' end to 3' end directions for each strand are opposite. 39

Genetic and Physical Maps Physical distance. Number of base pairs (bp) between two loci. Genetic map distance. Expected number of crossovers between two loci, per chromatid, per meiosis. Measured in Morgans (M) or centimorgans (cm). 1cM ~ 1 million bp (1Mb). 40

The Human Genome in Numbers Number of chromosomes: 23 pairs. Physical length: ~3,234,830,000 bp, i.e., 3,234.83 Mb (~ 2 meters of DNA). Genetic length: ~ 35 M, 27 M for males and 44 M for females. Number of genes: ~ 20,000. 41

DNA Replication It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material. J.D. Watson and F. H. C. Crick (1953). Molecular structure of Nucleic Acids. Nature, 171: 737-738. 42

DNA Replication Three possible models 43

DNA Replication 44

DNA Replication In the replication of a double-stranded or duplex DNA molecule, both parental (i.e., original) DNA strands are copied. The parental DNA strand that is copied to form a new strand is called a template. When copying is finished, the two new duplexes each consist of one of the original strands plus its complementary copy semiconservative replication. 45

DNA Replication Base-pairing provides the mechanism for DNA replication. www.oup.co.uk/best.textbooks/biochemistry/genesvii 46

DNA Replication Many enzymes are required to unwind the double helix and to synthesize a new strand of DNA. The unwound helix, with each strand being synthesized into a new double helix, is called the replication fork. DNA synthesis occurs in the 5' à 3' direction. 47

DNA Replication 48

DNA Replication 49

DNA Replication 50

DNA Replication 51

DNA Replication www.oup.co.uk/best.textbooks/biochemistry/genesvii 52

DNA Replication www.oup.co.uk/best.textbooks/biochemistry/genesvii 53

DNA Replication www.oup.co.uk/best.textbooks/biochemistry/genesvii 54

Enzymes in DNA Replication 1. Topoisomerase. Removes supercoils and initiates duplex unwinding. 2. Helicase. Unwinds duplex. 3. DNA polymerase. Synthesizes the new DNA strand; also performs proofreading. 4. Primase. Attaches small RNA primer to singlestranded DNA to act as a substitute 3'OH for DNA polymerase to begin synthesizing from. 5. Ligase. Catalyzes the formation of phosphodiester bonds. 6. Single-stranded binding proteins. Maintain the stability of the replication fork. 55

DNA Polymerase There are different types of polymerases, DNA polymerase III is used for synthesizing the new strand. DNA polymerase is a holoenzyme, i.e., an aggregate of several different protein subunits. DNA polymerase proceeds along the template and recruits free dntps (deoxynucleotide triphosphate) to form hydrogen bonds with their appropriate complementary dntp on the template. The energy stored in the triphosphate is used to form the covalent bonds. DNA polymerase uses a short DNA fragment or primer with a 3'OH group onto which it can attach a dntp. 56

DNA Polymerase The β-subunit of DNA polymerase III holoenzyme forms a ring that completely surrounds a DNA duplex. 57

Proteins 58

Proteins www.biochem.szote.u-szeged.hu/astrojan/protein1.htm 59

Proteins Proteins. Large molecules composed of one or more polypeptides, i.e., chains of amino acids. Amino acids. Class of 20 different organic compounds, containing a basic amino group (-NH 2 ) and an acidic carboxyl group (- COOH). The amino acid sequence is determined by the nucleotide base sequence in the gene coding for the protein. Sandrine E.g. Dudoit Antibodies, PB HLTH enzymes, C240F/STAT C245F hormones. - 60

Amino Acids 61

Amino Acids 62

Amino Acids www.accessexcellence.com/ab/gg/ 63

Proteins www.accessexcellence.com/ab/gg/ 64

Proteins www.accessexcellence.com/ab/gg/ 65

Cell Types www.accessexcellence.com/ab/gg/ 66

Differential Gene Expression Each cell contains a complete copy of the organism's genome. Cells are of many different types and states. E.g. Blood, nerve, and skin cells, dividing cells, cancerous cells, etc. What makes the cells different? Differential gene expression, i.e., when, where, and how much each gene is expressed. On average, 40% of our genes are expressed at any given time. 67

Central Dogma: Transcription and Translation 68

Central Dogma 69

Central Dogma The expression of the genetic information stored in the DNA molecule occurs in two stages: (i) transcription, during which DNA is transcribed into messenger RNA (mrna); (ii) translation, during which mrna is translated to produce a protein. DNA è mrna è protein Other important aspects of gene regulation: methylation, alternative splicing, etc. 70

Central Dogma www.accessexcellence.com/ab/gg/ 71

RNA A ribonucleic acid or RNA molecule is a nucleic acid similar to DNA, but single-stranded vs. double-stranded; ribose sugar vs. deoxyribose sugar; uracil (U) vs. thymine (T) as one of the four bases. RNA plays an important role in protein synthesis and other chemical activities of the cell. Several classes of RNA molecules, including messenger RNA (mrna), ribosomal RNA (rrna), transfer RNA (trna), and other small RNAs. 72

The Genetic Code DNA: sequence of four different nucleotides. Proteins: sequence of twenty different amino acids. The correspondence between DNA's fourletter alphabet and a protein's twenty-letter alphabet is specified by the genetic code, which relates nucleotide triplets or codons to amino acids. 73

The Genetic Code Start codon. Marks the initiation of translation (AUG, Met). Stop codons. Mark the termination of translation. The mapping between codons and amino acids is many-toone: 64 codons but only 20 a.a.. Third base in codon is often redundant, e.g., stop codons. www.accessexcellence.com/ab/gg/ 74

Protein Synthesis 75

Transcription Analogous to DNA replication: several steps and many enzymes. RNA polymerase synthesizes an RNA strand complementary to one of the two DNA strands. The RNA polymerase recruits rntps (ribonucleotide triphosphate) in the same way that DNA polymerase recruits dntps (deoxunucleotide triphospate). However, synthesis is single-stranded and only proceeds in the 5' to 3' direction of mrna (no Okazaki fragments). 76

Transcription The strand being transcribed is called the template or antisense strand; it contains anticodons. The other strand is called the sense or coding strand; it contains codons. The RNA strand newly synthesized from and complementary to the template contains the same information as the coding strand. 77

Transcription 78

Transcription Promoter. Regulatory unidirectional DNA sequence upstream of the coding region (i.e., at 5' end on sense strand) that signals the RNA polymerase both where to start and on which strand to continue synthesis. E.g. TATA box. Terminator. Regulatory DNA region at 3' end signaling the termination of transcription. Transcription factor. A protein that initiates the transcription of a gene, binds either to specific DNA sequences (e.g., promoters) or to other transcription factors. 79

Transcription www.oup.co.uk/best.textbooks/biochemistry/genesvii 80

Exons and Introns Genes comprise only about 2% of the human genome. The rest consists of non-coding regions, chromosomal structural integrity, cell division (e.g., centromere), regulatory regions: regulating when, where, and in what quantity proteins are made. The terms exon and intron refer to coding (translated into a protein) and non-coding DNA, respectively. 81

Exons and Introns www.accessexcellence.com/ab/gg/ 82

Splicing 83

Alternative Splicing There are millions of human antibodies. How is this possible with only ~20,000 genes? Alternative splicing refers to the different ways of combining a gene s exons. As a result, the same gene can produce different proteins. Alternative pre-mrna splicing is an important mechanism for regulating gene expression in higher eukaryotes. E.g. A large proportion of human genes are subject to alternative splicing: estimates ranging from 30% to 70% have been Sandrine reported. Dudoit 84

Alternative Splicing 85

Immunoglobulin B-cells produce antibody molecules called immunoglobulins (Ig) which fall in five broad classes: IgG, IgA, IgM, IgD, IgE. Diversity of Ig molecules from somatic recombination prior to transcription (VDJ recombination); alternative splicing; post-translational proteolysis, glycosylation. IgG1 86

Translation Ribosome: cellular factory responsible for protein synthesis; a large subunit and a small subunit; structural RNA and about 80 different proteins. transfer RNA (trna): adaptor molecule between mrna and protein; specific anticodon and acceptor site; specific charger protein, can only bind to that particular trna and attach the correct amino acid to the acceptor site. 87

Translation Initiation. Start codon, AUG, which codes for amino acid methionine, Met. Not every protein necessarily starts with methionine. Often this first amino acid will be removed in post-translational processing of the protein. Termination. Stop codons, UAA, UAG, or UGA. Ribosome breaks into its large and small subunits, releasing the new protein and the mrna. 88

Translation www.oup.co.uk/best.textbooks/biochemistry/genesvii 89

trna The trna has an anticodon on its mrnabinding end that is complementary to the codon on the mrna. Each trna only binds the appropriate amino acid for its anticodon. 90

Post-Translational Processing Folding. Cleavage by a proteolytic (protein-cutting) enzyme. Alteration of amino acid residues by phosphorylation, e.g., of a tyrosine residue; Glycosylation, e.g., carbohydrates covalently attached to asparagine residue; methylation, e.g., of arginine. Lipid conjugation. 91

Functional Genomics The various genome projects have yielded the complete DNA sequences of many organisms. E.g. Human, fruitfly, mouse, yeast. Human: ~ 3 billion base-pairs, ~20 25 thousand genes. Challenge: go from sequence to function, i.e., define the role of each gene and understand how the genome functions as a whole. 92

Pathways 93

Pathways On its own, the complete genome sequence does not say much about how the organism functions as a biological system. One needs to study how different gene products interact to produce various components. Most important activities are not the result of a single molecule, but depend on the coordinated effects of multiple molecules. 94

TGF-β Pathway Transforming Growth Factor beta, TGF-β, plays an essential role in the control of development and morphogenesis in multicellular organisms. The basic pathway provides a simple route for signals to pass from the extracellular environment to the nucleus, involving only four types of molecules. 95

TGF-β Pathway Four types of molecules: TGF-β, TGF-β type I receptors, TGF-β type II receptors, SMADS, a family of signal transducers and transcriptional activators. 96

TGF-β Pathway 97

TGF-β Pathway Extracellular TGF-β ligands transmit their signals to the cell's interior by binding to type II receptors, which form heterodimers with type I receptors. The receptors in turn activate the SMAD transcription factors. 98

TGF-β Pathway Phosphorylated and receptor-activated SMADs (R-SMADs) form heterodimers with common SMADs (co-smads) and translocate to the nucleus. In the nucleus, SMADs activate or inhibit the transcription of target genes, in collaboration with other factors. 99

References Access Excellence www.accessexcellence.com/ab/gg/ Genes VII www.oup.co.uk/best.textbooks/biochemistry/genesvii/ Human Genome Project Information www.ornl.gov/sci/techresources/human_genome/ home.shtml Kimball s Biology Pages www.ultranet.com/~jkimball/biologypages/ MIT Biology Hypertextbook esg-www.mit.edu:8001/ United States Air Force Academy academy.d20.co.edu/kadets/lundberg/dnapic.html 100