


Developmental Biology - Biology 4361
Lecture 8 - Differential Gene Expression
October 13, 2005

The principle of genomic equivalence states that all cells in a developing organism have the same genetic constitution.
- therefore, differences between cells (e.g. determination, bias, differentiation) must arise from differential gene expression
- all cells express a large number of the same genes: housekeeping or common genes
- many cells also express cell-type-specific genes
- cell-type-specific gene products include:
  - bulk products (e.g. ovalbumin, globins); usually from terminally differentiated cells
  - smaller amounts of regulatory proteins, e.g. transcription factors; crucial in the stepwise determination process
- cell-type-specific gene expression depends on stage and body position
- selective gene transcription depends on regulatory signals
- where do regulatory signals come from?
  - from the egg: maternal stores (e.g. mRNAs in the egg)
  - zygotic de novo expression (i.e. activation of genes expressing regulatory proteins) of zygotic messages

Transcriptional control
- dependent on regulatory DNA sequences (cis-acting)
- and on regulatory proteins (trans-acting)
- regulatory proteins can enhance or inhibit transcription
- regulatory proteins are (sometimes) controlled by intra- and intercellular molecular signals

Evidence for transcriptional control
- polytene chromosome puffs: variable among different cell types
- cDNA libraries from different cells or developmental periods: widely varying suites of messages in different tissues
- in situ hybridization: method (see e.g. Kalthoff, method 8.1, p. 179)

DNA Sequences Controlling Transcription
- gene anatomy (vertebrate):
  [general convention: on the template strand, 3' - 5' = upstream - downstream; RNA polymerase reads the template strand in the 3' to 5' direction and produces RNA in a 5' to 3' orientation, matching the sequence of the non-template (coding) strand; transcribed genes are referred to in the 5' - 3' = upstream - downstream orientation]
  - promoter region: binding and positioning RNA polymerase
  - transcription initiation site (contains cap sequence; will receive 5' cap) [Note: by convention, referred to as +1 base pair (bp) in the gene sequence]
    - 5' cap = methylated guanosine in opposite polarity to the RNA
    - necessary for ribosome binding
    - may protect mRNA from exonucleases
  - translation initiation site (ATG = AUG in RNA)
  - 5' untranslated region (5' UTR): space between transcription and translation initiation sites (50 bp in human globin gene); can determine the rate at which translation is initiated
  - exons (RNA which exits nucleus): coding sequence
  - introns: intervening sequences; non-coding sequence
  - translation termination codon (TAA = UAA in RNA); ribosome dissociates
  - 3' untranslated region (3' UTR)
    - includes sequence AATAAA: needed for polyadenylation
    - poly(A) tail ~ A
      - gives stability to mRNA
      - allows exit from nucleus
      - permits translation
      - may protect mRNA from exonucleases
  - regulatory region: cis-acting sequences upstream of transcription start site
    - regulatory DNA sequences function to accelerate or inhibit binding of RNA polymerase II to the transcription start site
    - regulation is indirect, i.e. sequences are recognized by and bind trans-acting transcription factors
    - these regulate assembly of a large protein complex (basic transcription factors) which determines activity of RNA polymerase II

Promoter
- binds RNA polymerase
  [RNA polymerases:
  - RNA polymerase I: rRNAs
  - RNA polymerase II: mRNA
  - RNA polymerase III: tRNA, 5S rRNA, other small nuclear & cytosolic RNAs
  - others in mitochondria, chloroplasts
  - RNA polymerase II produces pre-messenger RNA (aka nuclear RNA, nRNA; heterogeneous nuclear RNA), or just mRNA prior to RNA processing]
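
The gene anatomy landmarks listed above (ATG = AUG translation start, TAA = UAA stop, AATAAA polyadenylation signal) can be located in a sequence with a few lines of Python. This is only a minimal sketch: the toy sequence, the function names and the choice to ignore introns are assumptions made for illustration, not part of the lecture.

```python
# Toy illustration of gene landmarks on an intron-free transcript.
# The sequence and all names here are hypothetical, chosen for illustration.

STOP_CODONS = {"UAA", "UAG", "UGA"}
POLYA_SIGNAL = "AAUAAA"  # RNA form of the AATAAA sequence in the 3' UTR


def transcribe(coding_strand):
    """Return the RNA whose sequence matches the coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def find_coding_region(mrna):
    """Return (start, end) of the first AUG...stop reading frame, or None."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Step through the message one codon (three bases) at a time.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None


# 5' UTR + ATG ... TAA + 3' UTR containing AATAAA (all invented)
toy_gene = "GGCACCATGGCTTGGTAACCGAATAAAGC"

mrna = transcribe(toy_gene)
region = find_coding_region(mrna)
five_prime_utr = mrna[:region[0]]          # between +1 and the AUG
polya_signal_at = mrna.find(POLYA_SIGNAL, region[1])  # searched in the 3' UTR
```

Here `five_prime_utr` is the stretch between the transcription and translation initiation sites, and `polya_signal_at` is the index of the polyadenylation signal downstream of the stop codon.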

- promoter necessary for all transcription
  - known as core promoter or basal promoter or proximal promoter
  - most located directly upstream (5') of transcription start site
  - several consensus sequences (shared sequences) among species
    - e.g. TATA box; A/T base pairs; usu. located -25 to
    - sequence binds RNA polymerase II
    - NOTE: no binding without additional transcription factors (eukaryotes); at least 6 transcription factors necessary
    - e.g. IIB recognition element: binds transcription factor IIB
    - others, e.g. in TATA-less promoters: contain initiator element, downstream promoter element; located ~

Basal (general) transcription factors
- promoter attracts and binds basal transcription factors (general transcription factors)
- general transcription factors bind to core promoters of most genes
- ~ 20 general transcription factors
  - TFIID: recognizes TATA box; subunit (TATA-binding protein = TBP) recognizes TATA sequence
  - position RNA polymerase molecule near transcription start site
  - e.g. TFIIA, TFIIB, etc.
  - TATA box-binding protein (TBP)
  - TBP-associated factors (TAFs); at least 8
- some genes operate with multiple core promoters
  - can be regulated by multiple trans-elements
  - multiple promoters can operate at different levels, e.g. weak or strong promoters
  - slow or rapid RNA polymerase binding = slow or rapid transcription
- different combinations of general transcription factors are able to stimulate transcription to variable degrees
- distribution of general transcription factors fairly equal among cell types
- concentration of general transcription factors probably not limiting in any cell

Enhancers
- increase or inhibit (= silencer) activity of core promoters
- distinguished from promoters:
  1. enhancers need a promoter to work; a promoter works independently
  2. enhancers are effective at a distance from the transcribed gene; promoters are always in proximity
  3. enhancers are effective in reverse orientation; promoters are not
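
The third distinction above (enhancers work in either orientation, promoters do not) can be pictured as a motif search that either does or does not also accept the reverse complement. The sketch below assumes a simplified TATA-like motif (`TATAAA`) and an invented sequence; real binding sites are degenerate and are not matched as exact strings.

```python
# Orientation-dependent vs orientation-independent motif search.
# Motif and sequence are hypothetical; exact string matching is a simplification.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def _positions(seq, motif):
    """All start indices at which motif occurs in seq (overlaps allowed)."""
    found = []
    pos = seq.find(motif)
    while pos != -1:
        found.append(pos)
        pos = seq.find(motif, pos + 1)
    return found


def find_sites(seq, motif, either_orientation):
    """Return sorted (index, strand) hits; '-' marks a reverse-orientation hit."""
    hits = [(i, "+") for i in _positions(seq, motif)]
    reverse = reverse_complement(motif)
    if either_orientation and reverse != motif:
        hits += [(i, "-") for i in _positions(seq, reverse)]
    return sorted(hits)


toy_seq = "GGTTTATACCGGTATAAAGG"
promoter_like = find_sites(toy_seq, "TATAAA", either_orientation=False)
enhancer_like = find_sites(toy_seq, "TATAAA", either_orientation=True)
```

With this toy sequence the promoter-style search reports only the forward site, while the enhancer-style search also reports the copy lying in reverse orientation.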

- promoters and enhancers are equivalent among all cells in an organism
- it follows that selective gene activity must be controlled by differential trans-acting elements

Transcription Factors
- functional domains
  - DNA binding domain: binds to promoter or enhancer
  - activation domain: interacts with other transcription factors and/or RNA polymerase
  - many are dimeric; may be homodimers or heterodimers
  - dimerization domain
  - ligand-binding domain (e.g. estrogen, retinoic acid)
- enhancer may bind multiple transcription factors
  - each will have a different effect on assembly or activity of the transcription complex
  - net outcome depends on relative concentration of each competing factor
- transcriptional activators and repressors
  - bind to upstream promoter elements and enhancers
  - these sequences are limited to certain genes
  - concentration of specific activators and repressors highly variable among cell types
  - thus, highly variable transcriptional activity between cells, i.e. differential gene expression
  - activators/repressors bind to enhancer elements far away from the transcribed sequence
  - spacer elements loop; bring activator/repressor into contact with general TFs
  - many activators/repressors act on TAFs associated with TBP
  - TFIID (containing TBP + associated TAFs) can be viewed as a central processing unit that integrates modifying signals received from several enhancers simultaneously
  - repressors/activators can work combinatorially: a relatively small number can control many different genes
- composite response elements: closely spaced or overlapping recognition sequences
  - some bound competitively by activators or repressors that mutually displace each other
  - some bind two or more activators/repressors simultaneously
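
The argument above (the same cis-elements in every cell, but different trans-acting factors in each cell type, acting combinatorially) can be illustrated with a toy rule table. Everything named below (`TF_A`, `REP_X`, the gene labels, the all-or-nothing logic) is a hypothetical simplification; real regulation depends on relative concentrations rather than simple presence or absence.

```python
# Toy model: identical regulatory rules in every cell, different factor sets.
# All factor and gene names are invented for illustration.

GENE_RULES = {
    "housekeeping":    {"needs": set(),            "blocked_by": set()},
    "globin_like":     {"needs": {"TF_A", "TF_B"}, "blocked_by": {"REP_X"}},
    "crystallin_like": {"needs": {"TF_B", "TF_C"}, "blocked_by": set()},
}

# Trans-acting factors present in each (hypothetical) cell type.
CELL_FACTORS = {
    "cell_1": {"TF_A", "TF_B"},
    "cell_2": {"TF_B", "TF_C", "REP_X"},
    "cell_3": {"TF_A", "TF_B", "REP_X"},
}


def is_expressed(rule, factors):
    """Gene is on if all required activators are present and no repressor is."""
    return rule["needs"] <= factors and not (rule["blocked_by"] & factors)


def expression_profile(factors):
    """Sorted list of genes expressed given the factors present in a cell."""
    return sorted(g for g, rule in GENE_RULES.items() if is_expressed(rule, factors))


profiles = {cell: expression_profile(f) for cell, f in CELL_FACTORS.items()}
```

Four factors are enough to give the three cell types three different expression profiles, while the housekeeping gene stays on everywhere.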

- insulators: keep action of enhancers restricted to nearest promoter
- transcription factors are gene products
  - synthesis is controlled by other transcription factors, or the same ones
  - Note that Pax1 acts as a transcription factor for its own gene

Transcription factor families
- based on structure of functional domains, esp. DNA-binding domains; also multimerization domains
- e.g.:
  - helix-turn-helix
  - homeodomain
  - zinc finger: ~30 amino acids folded around a zinc atom
  - leucine zipper: common dimerization domain
  - basic helix-loop-helix: dimerization
- functional domains well conserved

Transcription factor regulation
- expression of some tied to environmental conditions; e.g.
  - heat shock genes
  - metallothionein
  - P450 genes (detoxification)
- some produced as inactive precursors; activation = regulation
- multimerization; note homodimers and heterodimers (RAR-RXR; TR-RXR)
- transport from cytoplasm
- activation by phosphorylation/dephosphorylation
  - dependent on activity of kinases/phosphatases
  - regulation of activity state of kinases/phosphatases can produce regulatory cascades
- ligand binding (i.e. receptors; e.g. estrogen receptor, thyroid, glucocorticoid)
- hormone-activated receptors (transcription factors)
  - steroid hormones: small, lipid soluble
  - receptors in cytoplasm (e.g. testosterone, progesterone) or in nucleus (estrogen)
  - only target cells have the hormone receptor
  - receptor binds steroid, releases inhibitory protein
  - receptor-ligand complex binds steroid response element (enhancer)
  - note: most hormone receptors dimerize

- some homodimers
- some heterodimers
- most hormone receptors with ligand activate; some repress activity
- some non-ligand-bound receptors sit on the response element and repress transcription
- nuclear hormone receptor family: ~150 members, including orphan receptors (no known ligand)

Chromatin structure and transcription
- DNA contained in chromatin; associated with histones and other proteins
- structure affects accessibility of genes for transcription
  - condensed chromatin: not readily expressed
  - DNA in diffuse state (interphase) = euchromatin (eu = good) = transcribed
  - DNA in condensed state (heterochromatin; hetero = other) = not transcribed
    - constitutive heterochromatin: present in all individuals and species
    - facultative heterochromatin: only for one sex or at a certain stage of development; e.g. X-chromosome condensation
  - polytene chromosome puffs: active transcription
- basic packaging unit of chromatin: nucleosome
  - ~150 bp DNA wrapped around a core of eight histone proteins (octamer); 2 each: H2A, H2B, H3, H4
  - connected by DNA linker to next nucleosome; linker associated with histone H1
  - nucleosomes form beaded string configuration: decondensed model; fairly accessible to transcription factors and polymerase
  - nucleosomes further packaged as helix: 6 nucleosomes/turn; 30 nm diameter
  - helix forms solenoid configuration: less accessible for transcription factors, etc.
- tests/assays for transcription in different chromatin configurations using nucleases
  - genes most actively transcribed are also most vulnerable to DNase I
  - decondensed regions often ~100 kb+
  - within decondensed regions: DNase I-hypersensitive sites
    - may reflect disruption of nucleosome structure
    - coincide with promoter/enhancer sites

- histone acetylation/deacetylation
  - eight core histones positioned with NH3+ (amino) termini facing out
  - accessible to acetyltransferases and deacetylases: add or remove acetyl groups (CH3CO)
  - lysines acetylated: shifts charge (positive to neutral)
  - acetylation of lysine = histone-DNA binding relaxed
  - regulation of histone acetyltransferases
    - some interact with TAFs
    - some act as transcriptional coregulators

RNA processing
- 5'-capping: guanosine in reverse orientation
  - transport out of nucleus
  - protects against exonucleases
  - ribosome binding and translation initiation
- 3'-polyadenylation
  - transport
  - number of translation rounds
- alternative splicing
  - average vertebrate nRNA consists of relatively numerous short exons; avg exon ~140 bp
  - introns much longer (generally)
  - splice different exons = different mRNAs = different proteins
  - exon/intron recognition critical
  - exons/introns may be exchangeable in different nuclei
  - most genes contain consensus sequences at 5' and 3' ends of introns = splice sites
  - spliceosomes: small nuclear RNAs (snRNAs) plus proteins = snRNPs (particles); RNAs + up to 10 proteins
  - proteins: splicing factors; bind to splice sites or adjacent areas
  - different splicing factors = different spliceosomes = alternative splicing
  - regulatory proteins may block splice sites or make weak splice sites stronger
  - regulated splicing pattern: depends on one or more regulatory proteins
  - default splicing pattern: occurs in absence of regulatory proteins
  - differential RNA splicing found to control alternative forms of expression of genes encoding over 100 proteins
  - estimated that ~35% of all human genes produce alternatively spliced RNAs
  - one gene - one protein (polypeptide)? actually: one gene - one family of proteins = splicing isoforms = splice variants

Translational control
- differential mRNA longevity
  - stability often dependent on poly(A) tail length
  - poly(A) length depends on sequences in 3' UTR
  - longer half-life = more translation
  - some messages stabilized at certain times and places
- selective inhibition of mRNA translation
  - e.g. in egg: many messages not translated until ovulation/fertilization
  - stabilized by 5' cap and 3' UTR
  - regulate accessibility of mRNA to ribosomes, i.e. no cap/no poly(A) tail = no translation
  - some non-capped messages capped after fertilization (e.g. tobacco hornworm)
  - short poly(A) = no degradation, no translation; add poly(A) (Drosophila)
  - amphibian oocytes: 5' and 3' ends tethered by a protein called maskin
    - forms mRNA circle
    - must be cleaved prior to translation
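
The statement "longer half-life = more translation" under differential mRNA longevity can be made concrete with simple exponential decay. The sketch below rests on two assumptions that are not in the notes: that a message decays with first-order kinetics, and that translation output is proportional to the amount of message present over time. The half-life values are arbitrary.

```python
# Exponential-decay sketch of mRNA longevity (assumed first-order kinetics).
import math


def fraction_remaining(t, half_life):
    """Fraction of the original message still intact after time t."""
    return 0.5 ** (t / half_life)


def message_exposure(half_life, duration):
    """Integral of fraction_remaining from 0 to duration.

    Under the assumption that translation is proportional to the message
    present, this is a relative measure of total protein output.
    """
    return (half_life / math.log(2)) * (1.0 - fraction_remaining(duration, half_life))


# Two hypothetical messages observed over the same 120-minute window.
short_lived = message_exposure(half_life=30.0, duration=120.0)
long_lived = message_exposure(half_life=60.0, duration=120.0)
ratio = long_lived / short_lived
```

Under these assumptions, doubling the half-life gives `ratio` greater than 1, i.e. more total translation from the same starting amount of message.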

Method - in situ hybridization

Method - gene fusion
- reporter genes: composites of regulatory region from eukaryotic gene with bacterial reporter
  - gene transcription will be carried out in accordance with normal regulatory influences
  - reporter will be transcribed
  - chloramphenicol acetyltransferase (CAT)
  - β-galactosidase (lacZ)
  - green fluorescent protein (GFP; coelenterate)
- fusion genes can be used for precise functional tests of promoter and enhancer sequences, e.g.
  - deletion mapping
  - point mutagenesis
  - techniques change sequences one base-pair at a time; reveal crucial sequences, e.g.
    - boxes
    - sites
    - cis-elements
    - motifs
  - common regulatory sequences in multiple genes
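
The logic of deletion mapping and point mutagenesis with a reporter gene can be mimicked in a few lines. This is a toy sketch only: the regulatory sequence is invented, and the rule that the reporter is "on" whenever an intact `TATAAA` motif is present stands in for a real reporter assay.

```python
# Toy deletion mapping and point mutagenesis against a simulated reporter.
# The motif, sequence and on/off rule are hypothetical stand-ins for an assay.

CRUCIAL_MOTIF = "TATAAA"


def reporter_active(regulatory_seq):
    """Simulated assay: reporter is expressed only if the motif is intact."""
    return CRUCIAL_MOTIF in regulatory_seq


def deletion_scan(seq, window):
    """Start indices of window-sized deletions that abolish reporter activity."""
    return [
        start
        for start in range(len(seq) - window + 1)
        if not reporter_active(seq[:start] + seq[start + window:])
    ]


def point_scan(seq):
    """Indices where changing a single base abolishes reporter activity."""
    essential = []
    for i, base in enumerate(seq):
        substitute = "G" if base == "C" else "C"  # any different base will do
        mutant = seq[:i] + substitute + seq[i + 1:]
        if not reporter_active(mutant):
            essential.append(i)
    return essential


regulatory_region = "CCCCCCTATAAACCCCCC"
inactivating_deletions = deletion_scan(regulatory_region, window=4)
essential_bases = point_scan(regulatory_region)
```

The deletion scan narrows the crucial region coarsely (any deletion overlapping the motif kills the reporter), while the base-by-base scan pins down the motif itself, matching the way the notes describe these techniques as revealing crucial sequences.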