Chapter 6 Genomic Architecture. Molecular Structure of Genes and Chromosomes

1 Chapter 6 Genomic Architecture Molecular Structure of Genes and Chromosomes

2 Overview: structure of genes and chromosomes. Nucleosome: 147 bp of DNA helix wound about 1.65 times around 8 histone proteins (2 each of H2A, H2B, H3, and H4). Histone H1 caps the entry and exit of the DNA strand at the nucleosome. 10 nm fiber: the transcriptionally active form.

3 6 Molecular definition of a gene. A gene is the entire nucleic acid sequence that is necessary for the controlled production of its final product (RNA or protein). Figure labels: regulatory region (enhancer/repressor), basal promoter, TSS (transcription start site), exons (= coding region, ORF), introns, splice sites, poly(A) site. In eukaryotes, genes lie amid a large expanse of noncoding DNA of unknown function, and a gene may also span regions of DNA unrelated to it. A gene that is incapable of producing a final gene product is a pseudogene.

4 Bacterial operons produce polycistronic mRNAs, while most eukaryotic genes contain introns and produce monocistronic mRNAs. The operon shown is transcribed into one mRNA that is translated into 5 separate proteins.
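The polycistronic/monocistronic distinction can be made concrete with a minimal Python sketch. It assumes toy sequences and a deliberately simplified ORF definition (first AUG to the first in-frame stop codon, then continue scanning downstream); the function name `find_orfs` and both example mRNAs are hypothetical and not taken from the lecture.

```python
START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna):
    """Return non-overlapping ORFs (AUG ... in-frame stop), scanned 5' to 3'."""
    orfs = []
    pos = 0
    while pos <= len(mrna) - 3:
        if mrna[pos:pos + 3] == START:
            end = None
            # Walk codon by codon in the frame set by this AUG.
            for j in range(pos + 3, len(mrna) - 2, 3):
                if mrna[j:j + 3] in STOPS:
                    end = j + 3
                    break
            if end is not None:
                orfs.append(mrna[pos:end])
                pos = end  # keep scanning downstream of this ORF
                continue
        pos += 1
    return orfs


# Toy "operon" transcript with two coding regions, and a toy single-gene mRNA.
polycistronic = "GGAUGAAAUAACCAUGCCCGGGUAGA"
monocistronic = "GGAUGAAAUAACC"

print(len(find_orfs(polycistronic)))  # 2
print(len(find_orfs(monocistronic)))  # 1
```

One mRNA carrying several ORFs corresponds to the bacterial operon case; a single ORF per mRNA corresponds to the typical eukaryotic case.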

5 Simple and complex transcription units are found in eukaryotic genomes. Introns occur almost exclusively in eukaryotic genes. The final mRNA contains a continuous ORF flanked by a 5' UTR (incl. cap and Kozak sequence) and a 3' UTR (incl. poly(A) site).

6 Many genes encode several different variants = isoforms (~60% of all genes). Mechanisms: alternative splicing of one pre-mRNA; alternative termination giving 2 different pre-mRNAs; alternative TSS giving 2 different pre-mRNAs.
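How alternative splicing yields several isoforms from one pre-mRNA can be sketched as follows. This is an illustration under stated assumptions: only cassette-exon skipping is modeled, the exon names and sequences are made up, and `splice_isoforms` is a hypothetical helper.

```python
from itertools import product


def splice_isoforms(exons, cassette):
    """Enumerate mature mRNAs from one pre-mRNA.

    exons    -- ordered list of (name, sequence) pairs
    cassette -- set of exon names that may be skipped; all others are always kept
    """
    optional = [name for name, _ in exons if name in cassette]
    isoforms = {}
    # Each optional exon is either included or skipped: 2**n combinations.
    for choice in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choice))
        kept = [(n, s) for n, s in exons if included.get(n, True)]
        label = "-".join(n for n, _ in kept)
        isoforms[label] = "".join(s for _, s in kept)
    return isoforms


pre_mrna_exons = [("E1", "AUGGCU"), ("E2", "GAAGAA"), ("E3", "UUUUAA")]
for label, mrna in splice_isoforms(pre_mrna_exons, cassette={"E2"}).items():
    print(label, mrna)
```

With one skippable exon the gene gives two isoforms; each additional independent cassette exon doubles the number of possible mRNAs.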

7 There are several types of eukaryotic DNA, much of which is never transcribed

8 Protein-coding genes can be solitary or duplicated. Solitary genes: occur only once in the genome. Duplicated genes: * Gene families: very similar (not identical) genes (example: containing very similar globular modules in the protein structure) * Multicopy: identical (or nearly identical) copies of genes encoding products needed in high quantity (example: histones)

9 Genomes of higher eukaryotes contain much nonfunctional DNA. Among eukaryotes, cellular DNA content does not correlate with phylogeny. Terminology: gene cluster, gene desert. The beta-globins are an example of a gene family.

10 Known nonprotein-coding RNAs and their functions

11 Repetitious DNA - Simple. Microsatellites: tandem repeats of up to 150 copies of a 1-13 bp unit. The number of repeats within a microsatellite varies widely between individuals. Occurrence of microsatellites in vital genes (insertions) can be detrimental (even fatal). They are often plentiful in centromere regions (see the FISH image to the left). Microsatellites most likely arise by daughter-strand slippage during replication (giving small repeated regions). FISH = fluorescent in situ hybridization (same principle as Southern blot).
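As an illustration of what "tandem repeats of a 1-13 bp unit" means in sequence terms, here is a small regular-expression sketch. The sequences are invented, the threshold of at least 3 copies is an arbitrary assumption, and `find_microsatellites` is a hypothetical name.

```python
import re


def find_microsatellites(dna, min_unit=1, max_unit=13, min_repeats=3):
    """Return (start, unit, copies) for each tandem repeat found in dna.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    and the backreference \\1 demands further head-to-tail copies of that unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(dna):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Two hypothetical alleles of the same locus that differ only in copy number.
allele_1 = "TTCACACACACAGG"
allele_2 = "TTCACACAGG"
print(find_microsatellites(allele_1))  # [(2, 'CA', 5)]
print(find_microsatellites(allele_2))  # [(2, 'CA', 3)]
```

The two alleles share flanking sequence but carry 5 and 3 copies of the CA unit, which is the kind of length variation between individuals that the slide describes.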

12 Mobile DNA. Moderately repeated, mobile DNA sequences are interspersed throughout the genomes of prokaryotes, higher plants and animals. These sequences range in size from hundreds to a few thousand base pairs. The sequences are copied and inserted into a new site in the genome by the process of transposition. Mobile DNA was once termed selfish DNA, but it may actually have contributed to our gene diversity through exon shuffling. Classes of mobile DNA: * DNA transposons * Retrotransposons: * LTR retrotransposons (similar to a retroviral provirus, but without coat-protein coding genes; most common: ERV) * Non-LTR retrotransposons (LINEs and SINEs). LINE = long interspersed element, SINE = short interspersed element.

13 Repetitious DNA - Mobile. Two modes of transposition: cut-and-paste and copy-and-paste.

14 6-9 General structure of bacterial IS elements (cut-and-paste). Transposase = an enzyme that cuts the element out of one genome site (exit) and cuts a hole at another site (entry). The host genome insertion site (pale blue) is copied as part of the IS element insertion process (hence the repeat structure).
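The repeat structure described above can be sketched in a few lines of Python. The model is deliberately coarse: it assumes the standard textbook picture in which the target site ends up duplicated on both sides of the inserted element, it ignores the excision step at the donor site, and all sequences, lengths and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def make_is_element(inverted_repeat, core):
    """Build a toy IS element: inverted repeat, core (transposase gene), inverted repeat."""
    return inverted_repeat + core + reverse_complement(inverted_repeat)


def insert_is_element(genome, pos, target_len, element):
    """Insert element at pos; the target site is duplicated on both flanks."""
    target = genome[pos:pos + target_len]
    return genome[:pos] + target + element + target + genome[pos + target_len:]


element = make_is_element("GGTCA", "ATGAAACCC")
genome = "AAAACGTAGTTTT"
after = insert_is_element(genome, pos=4, target_len=5, element=element)

print(element)  # GGTCAATGAAACCCTGACC
print(after)    # AAAACGTAGGGTCAATGAAACCCTGACCCGTAGTTTT
```

After insertion the 5 bp target `CGTAG` appears once on each side of the element as a direct repeat, while the element itself is bounded by inverted repeats.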

15 General structure of bacterial transposons, which actively take part in gene recombination. Through genomic recombination (using the inverted repeats at both ends of the IS element), genes can become trapped within transposons and be transported to other locations or exchanged between cells (through naturally occurring plasmids).

16 6-12 Retrotransposons containing LTRs behave like retroviruses in the genome (copy-and-paste). LTR: contains a promoter region to initiate transcription of the retrotransposon genes. Reverse transcriptase + special primer (copies RNA into cDNA). Integrase (integrates the cDNA into a new genomic site).
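The copy-and-paste route (transcription, reverse transcription, integration) can be traced with a toy Python sketch. It only follows the sequence logic: LTRs, primers and enzyme details are ignored, the sequences are invented, and every function name is hypothetical.

```python
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transcribe(sense_dna):
    """RNA copy of the element (same sequence as the sense strand, U for T)."""
    return sense_dna.replace("T", "U")


def reverse_transcribe(rna):
    """First-strand cDNA, written 5'->3' (complementary to the RNA)."""
    return rna.translate(RNA_TO_CDNA)[::-1]


def second_strand(cdna):
    """Second DNA strand, which restores the sense sequence of the element."""
    return cdna.translate(DNA_COMPLEMENT)[::-1]


def copy_and_paste(genome, element, new_pos):
    """Leave the original element in place and integrate a new copy at new_pos."""
    if element not in genome:
        raise ValueError("element not present in genome")
    rna = transcribe(element)
    cdna = reverse_transcribe(rna)
    new_copy = second_strand(cdna)
    return genome[:new_pos] + new_copy + genome[new_pos:]


element = "ATGCATTAGC"
genome = "TTTT" + element + "GGGGCCCC"
after = copy_and_paste(genome, element, new_pos=18)
print(genome.count(element), after.count(element))  # 1 2
```

Unlike the cut-and-paste mechanism of IS elements, the donor copy stays where it was, so each transposition event increases the copy number.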

17 6-13 Retroviral mechanism of multiplication and integration into a genomic site (copy-and-paste). Reverse transcriptase + special primer: makes a cDNA copy of the RNA. Integrase mediates integration of the copy into a new genome location.

18 6-13 Retroviral mechanism of multiplication and integration into a genomic site (copy-and-paste). Most endogenous LTR retrotransposons have lost their genes and consist of only LTR sequences. Therefore they can only move through recombination. Endogenous = occurring within a cell (as opposed to entering through viral infection).

19 6-16 Structure of non-LTR retrotransposons (a different copy-and-paste mechanism). SINEs (~100-400 bp) are similar to LINEs but have lost the ORFs and can only transpose if they can use the enzymes of LINEs. There are ~1.6 million SINEs in the human genome; ~1.1 million are Alu elements (named for the AluI restriction enzyme site they contain).

20 6-17 Non-LTR retrotransposons (LINEs and SINEs) move by an unusual mechanism Different Copy-paste

21 Mobile DNA elements probably had a significant influence on evolution. Spontaneous mutations may result from the insertion of a mobile DNA element into or near a transcription unit. Homologous recombination between mobile DNA elements may contribute to gene duplication and other rearrangements, including duplication of exons (generating gene families), recombination of exons to create new genes (exon shuffling), and altered control of gene expression (copying gene-regulatory elements between different promoters). "Biotechnical evolution": mobile DNA elements are a possible tool for inserting therapeutic genes into patients (gene therapy). It has recently been discovered that some of these mobile elements still play an active role in the human genome.

22 6.6 Structural genome organization. Prokaryotic: most bacterial genomes are carried in one circular chromosome. Stable replication requires one replication origin (ORI). The genome is packed with polyamines (small, positively charged stabilizing molecules). Eukaryotic: the genome is distributed over several linear chromosomes. Stable replication occurs from several replication origins within each chromosome, and additionally requires: centromeres (for equal distribution between daughter cells during mitosis) and telomeres (to protect the chromosome ends against shortening during replication). Eukaryotic DNA associates with many different proteins to form chromatin.

23 6.28 Chromatin exists in extended and condensed forms 10 nm fiber (beads-on-a-string) 30 nm fiber (condensed bead structure) Nucleosome

24 9.5 Nucleosomes are complexes of histones

25 9.5 The solenoid model of condensed chromatin

26 Morphology and functional elements of eukaryotic chromosomes. Half the mass of chromatin is made up of proteins, which determine the condensation (-> heterochromatin) and opening (-> euchromatin) of the chromatin structure. Microscopic observations of the number and size of chromosomes and their staining patterns have revealed important aspects of chromosome structure.

27 Chromosome number, size and shape at metaphase are species specific. Two species of Indian deer... At metaphase the chromosomes become very condensed. This is required for proper separation by the mitotic spindle during anaphase of mitosis.

28 Heterochromatin consists of chromosome regions that do not uncoil Electron micrograph: the heterochromatin is very condensed (dark appearance)

29 Nonhistone proteins provide a structural scaffold for long chromatin loops Figure 9-34

30 A model for chromatin packing in metaphase chromosomes Metaphase chromosome (electron micrograph)

31 6-41 Stained chromosomes have characteristic banding patterns. Giemsa staining of metaphase chromosomes reveals the DNA-dense areas as dark bands.

32 Stained chromosomes have characteristic banding patterns Figure 9-38

33 6-36 Experimental demonstration of chromatin loops in interphase chromosomes. Newest research: "chromosome kissing" = the scaffold brings two gene areas from different chromosomes into proximity (so that they can be co-regulated). SARs = scaffold-associated regions. Sites A-F are separated by millions of bp of linear sequence but are physically close to each other in interphase nuclei. Special DNA sequences (the SARs) associate with the scaffold proteins in an ordered fashion.

34 9.6 Chromosome painting distinguishes each homologous pair by color

35 Reading the histone code. The histones can be modified (like many other proteins). The histone tails are basic (positively charged) and link the nucleosomes together in a tighter structure. Acetylation neutralizes the tails -> looser structure -> gene activation. Methylation -> prohibits acetylation -> gene silencing. Additionally, histones can be phosphorylated or ubiquitinated. The study of the histone code and how it determines which genes are actively transcribed in a cell is called epigenetics.

36 Organelle genomes: the power plants of eukaryotic cells carry their own genomes

37 Dual staining shows multiple mtDNA molecules in a eukaryotic cell. Red = ethidium bromide staining, which binds dsDNA and emits light. Green = DiOC6, which is incorporated into mitochondria.

38 Human mtDNA. Mitochondrial DNA is circular, multicopy, and uses its own variant of the triplet genetic code. It encodes a fraction of the genes needed specifically for mitochondrial functions. Inheritance is almost 100% maternal (!)
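What a separate mitochondrial triplet code means in practice can be shown with a short Python sketch. The standard table and the four vertebrate mitochondrial reassignments (UGA = Trp, AUA = Met, AGA and AGG = stop) are well established; the test mRNA is made up, and `translate` is a hypothetical helper that simply stops at the first stop codon.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    b1 + b2 + b3: STANDARD_AA[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Vertebrate mitochondrial code: four codons are read differently.
VERTEBRATE_MITO_CODE = dict(STANDARD_CODE, UGA="W", AUA="M", AGA="*", AGG="*")


def translate(mrna, code):
    """Translate from position 0, codon by codon, until a stop codon ('*')."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = "AUGUGAAUAAGA"
print(translate(mrna, STANDARD_CODE))         # M
print(translate(mrna, VERTEBRATE_MITO_CODE))  # MWM
```

The same message is read as a one-residue peptide by the standard code (UGA is a stop) and as a three-residue peptide by the mitochondrial code, which is one reason mitochondrial genes cannot simply be expressed from the nuclear genome unchanged.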

39 DNA fingerprint = DNA profiling. For the coding part of genes, a limited number of alleles exist (high conservation). The remaining ~98% of the human genome has very high allelic variation -> each human has a unique genomic composition (although monozygotic twins are almost identical). Examples of variable regions: number of repeats in repeated DNA, single-nucleotide polymorphisms (SNPs), variable number of gene copies. Identifying these classes of differences makes it possible to develop techniques for determining individual DNA profiles.

40 All organisms can be DNA profiled. DNA profiling: * Determines relationships between the sources of two samples (identical? closely related? distantly related?) (forensics, paternity tests, phylogenetic studies, identification of infections, dinosaur studies, etc.) * Determines the presence/absence of disease genes (diagnostics)

41 Most commonly used techniques are based on:
* RFLP (restriction fragment length polymorphism): genetic variation causes restriction sites to disappear and appear at different distances in the genome
* VNTR (variable number of tandem repeats): a variable number of minisatellites (10-100 bp sequences that are present in a variable number of head-to-tail copies, i.e. repeats)
* STR (short tandem repeats): a variable number of microsatellites (4-10 bp)
* SNPs (single nucleotide polymorphisms)
* Mitochondrial DNA sequencing (when very little non-degraded cellular material is available)
Detection methods:
* Southern blotting, detection with radioactive probes
* PCR, detection with gel electrophoresis or capillary electrophoresis (very specific, and very little original input DNA required)
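The RFLP principle from the list above can be illustrated with an in-silico digest. This is a sketch under assumptions: the two alleles are invented, only one strand is considered, and `digest` is a hypothetical helper. EcoRI, which recognizes GAATTC and cuts after the G, is used as the example enzyme.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site and return the fragment lengths.

    cut_offset is the cut position within the site (1 = after the first base,
    as for EcoRI: G^AATTC).
    """
    lengths = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        lengths.append(cut - start)
        start = cut
        pos = dna.find(site, pos + 1)
    lengths.append(len(dna) - start)
    return lengths


# Allele B carries a single-base change (GAATTC -> GAGTTC) that destroys the second site.
allele_a = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
allele_b = "A" * 10 + "GAATTC" + "C" * 20 + "GAGTTC" + "T" * 5

print(digest(allele_a))  # [11, 26, 10]
print(digest(allele_b))  # [11, 36]
```

A single nucleotide difference changes the fragment pattern from three fragments to two, and it is this difference in fragment lengths that shows up as different bands on a Southern blot.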

42 Simple-sequence DNAs are concentrated in specific chromosomal locations. DNA fingerprinting depends on differences in length of simple-sequence DNAs.

43 PCR-based STR analysis. How? DNA is isolated and purified. PCR is performed on the DNA with primers flanking known microsatellite areas (so that only these areas with a high size variability are amplified). The amplified DNA is size-separated by gel electrophoresis. The resulting pattern of size-fractionated DNA bands is compared to other patterns to determine similarity. Usually each area is chosen on a separate chromosome. 13 primer pairs (microsatellite areas) are enough to ensure that the probability that 2 random persons have the same STR pattern is only about 1 in 3 trillion.
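The order of magnitude of the "1 in 3 trillion" figure can be reproduced with a short calculation. This is a rough sketch resting on two assumptions that the slide does not state: the 13 loci are inherited independently (plausible when each lies on a separate chromosome, so the product rule applies), and genotype frequencies follow Hardy-Weinberg proportions. The allele and genotype frequencies below are invented for illustration.

```python
import math


def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p*p for a homozygote, 2*p*q for a heterozygote."""
    if q is None:
        return p * p
    return 2 * p * q


def random_match_probability(genotype_freqs):
    """Product rule: multiply the genotype frequencies of independent loci."""
    return math.prod(genotype_freqs)


# One locus: heterozygote for two alleles with (made-up) frequencies 0.1 and 0.2.
print(genotype_frequency(0.1, 0.2))

# A full profile: 13 loci, each with an illustrative genotype frequency of 0.11.
rmp = random_match_probability([0.11] * 13)
print(f"about 1 in {1 / rmp:.2e}")
```

With a per-locus genotype frequency around 0.11, thirteen loci already give a match probability of roughly 1 in a few trillion, which is why so few markers suffice for identification, and why a profile based on only two or three markers is far less conclusive.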

44 Examples

45 A modern family tree. How are these 2 parents and 4 children related? NB: a classic example of a cheap paternity test! Far too few markers have been applied for it to be significant.

46 Next lecture: Regulation of Transcription Initiation; RNA Processing, Nuclear Transport, and Post-Transcriptional Control

47 After completion of this lecture you should be able to:
* Know how eukaryotic DNA is packaged in the nucleus: double helix -> nucleosomes (beads on a string) -> condensation into 30 nm fibers (euchromatin) -> further condensed fibers (heterochromatin)
* Define nucleosomes and mention their components
* Know that prokaryotic DNA is circular and simply packed with polyamines (whereas eukaryotic DNA is packed with many different proteins, including various histones)
* Know the major types of classification of DNA sequences in the genome
* Know the definition of a gene and be able to sketch the major structure and place the various functional elements (introns, exons, TSS, enhancers, etc.)
* Know the definition of a pseudogene
* Know that prokaryotic genes differ from eukaryotic genes in that they contain no introns and can be polycistronic
* Define the terms polycistronic and monocistronic
* Define the term protein isoform, and describe how several isoforms can arise from one gene
* Know that coding sequences (exons) constitute only about 1.1% of the entire human genome, and be able to mention some of the other major classes of nuclear eukaryotic DNA
* Understand the term nonprotein-coding RNA, and give examples
* Define the term mobile DNA, and briefly describe the two classes: DNA transposons and retrotransposons
* Understand why mobile DNA is important for continued evolutionary development
* Know the basics of what LINEs and SINEs are
* Know that eukaryotic chromatin is organised on scaffolds, and define what SARs are
* Define the term epigenetics
* Give examples of the histone code and its purposes
* Know that mitochondria have their own circular DNA, reminiscent of ancient eubacterial DNA
* Describe the purpose of DNA fingerprinting (DNA profiling) and give examples of the techniques used for this