Genetic code Codon: triple base pairs defining each amino acid.

1 Genetic code
Codon: a triplet of bases specifying each amino acid.
Why is the genetic code a triplet code?
- doublet code: 4^2 = 16 combinations (too few for 20 amino acids)
- triplet code: 4^3 = 64 combinations (more than enough to represent 20 amino acids)
Degenerate site
- twofold degenerate site
- fourfold degenerate site
Synonymous codon
Nonsynonymous codon
Initiation codon: AUG
Stop codons: UAA, UAG, UGA
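The counting argument and the idea of a degenerate site can be sketched in a few lines of Python. This is only an illustration: it assumes the standard nuclear code (NCBI translation table 1) written in the DNA alphabet (T in place of U), and the names `code_capacity` and `third_position_degeneracy` are made up for this sketch.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (assumed: NCBI translation table 1), with codons
# enumerated in TCAG order at each position; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def code_capacity(word_length, alphabet_size=4):
    """Number of distinct words of a given length over the base alphabet."""
    return alphabet_size ** word_length


def third_position_degeneracy(codon):
    """Count how many bases at the third position keep the same amino acid."""
    amino_acid = CODON_TABLE[codon]
    return sum(CODON_TABLE[codon[:2] + base] == amino_acid for base in BASES)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
```

With this table, `third_position_degeneracy("GGT")` counts all four glycine codons GGN (a fourfold degenerate site), while `third_position_degeneracy("TTT")` counts only TTT and TTC for phenylalanine (a twofold degenerate site).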

2 The genetic code is NOT universal!

3 MITOCHONDRIA
- The first global endosymbiosis
- Size: 6-2,000 kb; 16 kb in human
- Maternal inheritance: no recombination
- In mammals, mtDNA evolves 10 times faster than nuclear DNA: a nice tool for population genetics and systematics in zoology
- In plants, very conserved, but with various rearrangements
- Self-splicing introns
- Genes are distributed on BOTH strands.

4 CHLOROPLAST
- The second global endosymbiosis
- LSC + SSC + two IRs
- Genes are distributed on BOTH strands.
- Size: kb
- Maternal inheritance: no recombination
- Introns are also found, as in mitochondria
- Generally evolves 4-5 times slower than nuclear genes
- Composed of 1) rRNA genes 2) tRNA genes 3) genes related to photosynthesis
- Frequently used gene for plant molecular phylogenetics: ribulose-1,5-bisphosphate carboxylase (RUBISCO) large subunit (rbcL)

5 Chase et al. (42 authors). Phylogeny of seed plants: an analysis of nucleotide sequences from the plastid gene rbcL. Annals of the Missouri Botanical Garden 80:
- The first molecular analysis of an intensive sample of angiosperms using the rbcL gene
- Includes about 500 representatives of angiosperms
- Changed the concept of major groupings in angiosperms
c.f. Dr. Mark Chase FRS: Keeper of the Jodrell Laboratory, Royal Botanic Gardens, Kew

6 THE STRUCTURE OF GENES
- Structural genes: tRNA, rRNA, and structural protein producing genes
- Regulatory genes: control expression of other genes
- Housekeeping genes: expressed in all cell types
- Tissue-specific genes: e.g. insulin (pancreatic beta-cells)
- Making proteins: DNA → (transcription) → mRNA → (translation) → protein
- Promoter regions: include the RNA polymerase binding site

7 Sequence motifs in the promoter region
In bacteria
- -10 site: TATAAT
- -35 site: TTGACA
- Shine-Dalgarno box: AGGAGG (ribosome binding site)
- A single promoter controls a set of structural genes: operon theory
In eukaryotes
- TATA box
- CAAT box (in mammals): ~-40. Binds transcription factors (TF)
- GC box (GGGCGG): ~-110. Binds transcription factors (TF)
- Polyadenylation signal: AATAAA (3' end)
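As a rough illustration of how such consensus motifs are located in a sequence, the sketch below scans for exact matches only. Real promoters usually deviate from the consensus, so this is a simplification; the function name `find_motifs` and the toy sequence are invented for the example.

```python
import re

# Bacterial consensus motifs listed on the slide.
BACTERIAL_MOTIFS = {
    "-35 site": "TTGACA",
    "-10 site": "TATAAT",
    "Shine-Dalgarno": "AGGAGG",
}


def find_motifs(sequence, motifs=BACTERIAL_MOTIFS):
    """Return 0-based start positions of exact motif matches in the sequence."""
    sequence = sequence.upper()
    hits = {}
    for name, motif in motifs.items():
        # A lookahead is used so that overlapping occurrences are also reported.
        hits[name] = [m.start() for m in re.finditer(f"(?={motif})", sequence)]
    return hits


# Toy sequence (hypothetical): -35 site, a 17-base spacer, -10 site,
# then a ribosome binding site shortly before the ATG start codon.
toy = "GG" + "TTGACA" + "ACGTACGTACGTACGTA" + "TATAAT" + "CCCC" + "AGGAGG" + "AAATG"
toy_hits = find_motifs(toy)
```

A position weight matrix would be the more usual tool for real data; exact matching is used here only to keep the example short.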

8 Reading frame

9 Ethidium bromide (EtBr) causes frameshift mutations (carcinogen)!
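The effect of the reading frame, and of a frameshift, can be seen by translating the same sequence from different starting offsets. The sketch below assumes the standard genetic code in the DNA alphabet; `translate` and the example sequences are illustrative names and data, not part of the lecture.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (assumed: NCBI translation table 1); "*" is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA sequence codon by codon, starting at offset `frame`.

    Trailing bases that do not fill a complete codon are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


original = "ATGGCCTAA"
three_frames = [translate(original, frame) for frame in range(3)]

# Inserting a single base after the start codon shifts every downstream codon.
frameshifted = original[:3] + "A" + original[3:]
```

Here frame 0 of `original` reads ATG GCC TAA (Met, Ala, stop), whereas the single-base insertion in `frameshifted` changes every codon after ATG and removes the in-frame stop.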

10 Splicing
Introns: sequences removed from the transcript before translation
- nearly always have a GT-----AG structure
Exons: encode the finished protein
Number, size, and organization of introns vary greatly from gene to gene
- histone: no introns; pro-α2-collagen: over 50
- virus SV40: 31 bp; human dystrophin gene: 210,000 bp
- introns can contain genes (e.g. Drosophila Adh gene) and can also contain introns: twintrons
Spliceosomal introns (nuclear introns): spliced by the spliceosome (RNA + proteins)
Protein-spliced introns: spliced by proteins; found in some tRNA and rRNA genes
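The GT-----AG rule and the removal of introns can be sketched as simple string operations. This is a toy model that assumes intron coordinates are already known; the function names and the example gene are hypothetical.

```python
def follows_gt_ag_rule(intron):
    """Check whether an intron (DNA alphabet) starts with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(sequence, introns):
    """Remove introns given as (start, end) half-open intervals and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(sequence[position:start])
        position = end
    exons.append(sequence[position:])
    return "".join(exons)


# Toy gene (hypothetical): exon 1, one intron, exon 2.
exon1, intron, exon2 = "ATGGCC", "GTAAGTCCCTTTAG", "GAATAA"
gene = exon1 + intron + exon2
intron_coordinates = [(len(exon1), len(exon1) + len(intron))]
mature = splice(gene, intron_coordinates)
```

In this example `mature` is simply the two exons joined, which is what the spliceosome (or a self-splicing intron) achieves on the RNA transcript.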

11 Group I introns: self-splicing introns (without the aid of proteins)
- many of them are mobile because they encode a DNA endonuclease (transposon)
- found in mitochondrial and chloroplast genes, rRNA genes of some eukaryotes, and T4 bacteriophage
Group II introns: self-splicing, but by a different mechanism
- contain sequences similar to those of reverse transcriptase
Group III introns: similar to group II but with the central portion removed (in Euglena)
Alternative splicing: gives diversity in functional proteins.

12 Kim et al., Sequence and expression studies of A-, B-, and E-CLASS MADS-box homologues in Eupomatia (Eupomatiaceae): support for the bracteate origin of the calyptra. International Journal of Plant Sciences 166:

13

14 THE FIRST ASSIGNMENT
- One page (A4)
- Definition and meaning
- Due date: Oct. 8.
1) The Evolution of Introns: Introns-early theory vs. Introns-late theory
2) Concerted evolution (p77)

15 MULTIGENE FAMILIES
Many similar genes exist in an organism: multigene family
- important evolutionary innovations are based on gene duplications
- e.g. β-globin gene family in human, MADS-box gene family in plants
Pseudogene: non-functional gene (normally because of a premature stop codon)
CpG islands: regions with a high concentration of G and C bases
- "p" means the phosphate that links the two bases
- normally the CpG dinucleotide is highly susceptible to mutation through methylation, but CpG islands are protected from methylation by a special DNA repair system
- useful in mapping the position of gene loci

16 Gene duplications in MADS-box family Nam et al. (2004)

17 Oxidative deamination (figure labels: thymine, cytosine)

18

19 Haploid Diploid Homologous pair Non-homologous pair

20 Transcription
- by RNA polymerase II: mRNA (c.f. RNA polymerase I: rRNA; RNA polymerase III: tRNA)
- Transcription factor (TF)
- Sense strand
- Antisense strand = complementary DNA = cDNA
mRNA processing
1) splicing
2) 5' capping
3) 3' poly(A) tailing
Translation
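The relationship between the sense strand, the antisense strand, and the mRNA can be written out as string operations. This is a minimal sketch: the function names are illustrative, and both strands are written 5' to 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def transcribe(sense_strand):
    """The mRNA matches the sense strand, with U in place of T."""
    return sense_strand.upper().replace("T", "U")


sense = "ATGGCC"
antisense = reverse_complement(sense)  # the strand that RNA polymerase reads
mrna = transcribe(sense)
```

RNA polymerase reads the antisense strand as its template, which is why the resulting mRNA carries the same sequence as the sense strand.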

21 RNA editing: In some genes (especially in angiosperm chloroplasts and mitochondria), nucleotides in the mRNA molecule can be changed into different nucleotides, so mRNA sequences can differ from the genomic DNA sequences. Care must therefore be taken if mRNA and DNA sequences are combined in a phylogenetic analysis.

22 Central Dogma

23 MUTATION
Point mutation = single substitution
- transition
- transversion
- synonymous (= silent) substitution
- non-synonymous substitution
Insertion and deletion (indels): sometimes cause frameshifts
Chromosomal mutations
- polyploidy
- aneuploidy (e.g. trisomy 21)
- inversion
- translocation
- duplication
- deletion
DNA REPAIR
- proofreading (Pfu enzyme)
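The distinction between transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or the reverse) is easy to express in code. The sketch below assumes two aligned sequences of equal length with no gaps; the function names are made up for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(base_a, base_b):
    """Classify a single-base substitution as a transition or a transversion."""
    if base_a == base_b:
        raise ValueError("identical bases are not a substitution")
    pair = {base_a, base_b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_substitutions(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a != b:
            counts[classify_substitution(a, b)] += 1
    return counts


example_counts = count_substitutions("ACGT", "GCTT")
```

In the example, A to G is a transition and G to T is a transversion, so each class is counted once.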

24 Recombination: based on crossing-over

25 Unequal crossing-over: non-homologous pairing

26 Gene conversion
Chi site: DNA regions showing a high rate of recombination

27 GENOME ORGANIZATION AND EVOLUTION

28 Cot (mol·sec per liter)
C-value: single copy genes / total genes
C-value paradox: genome size ≠ morphological complexity of organism

29 23,000 protein coding genes!

30 THE EVOLUTION OF MULTIGENE FAMILIES

31 Tandemly repeated genes: ribosomal DNAs
- 18S: component of the ribosome small subunit
- 28S + 5.8S: components of the large subunit
- ITS regions (ITS1 and ITS2) are frequently used for phylogenetic analysis
Pros:
- abundant copies of the target sequence
- placed between two highly conserved regions
- primers are universal
Cons:
- secondary structure formation
- polymorphism

32 Stem-and-loop structure

33 NON-CODING REPETITIVE DNA
Satellite DNA
- heterochromatin region
- assumed to be junk DNA
- 60% in Drosophila
Minisatellites (= VNTR loci; variable number of tandem repeats)
- 11~60 bp repeats
- powerful molecular markers for Mendelian inheritance
Microsatellites
- 2-5 bp repeats
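A microsatellite, as defined above, can be detected with a back-referencing regular expression. The sketch below is a simplified search: the minimum of four tandem copies is an arbitrary threshold chosen for illustration, and the function name is hypothetical.

```python
import re

# A unit of 2-5 bases followed by at least three more identical copies
# (four or more tandem copies in total). The lazy quantifier tries the
# shortest repeat unit first.
MICROSATELLITE = re.compile(r"([ACGT]{2,5}?)\1{3,}")


def find_microsatellites(sequence):
    """Return (start, repeat_unit, copy_number) for each tandem repeat found."""
    results = []
    for match in MICROSATELLITE.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        results.append((match.start(), unit, copies))
    return results


ca_repeat = find_microsatellites("GG" + "CA" * 5 + "TT")
```

Note that this pattern also reports homopolymer runs such as AAAAAAAA as a repeat of the unit AA; a stricter search would filter those out.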

34

35 TRANSPOSABLE ELEMENTS: increase their copy number by jumping around the genome
- more than 50% of the maize genome
- some transpose through an intermediate RNA stage (retrotransposition)
- others transpose directly from DNA to DNA

36 Gene cloning
- Complementary DNA (cDNA)
- DNA library
- Cloning vectors: plasmids, cosmids, YACs
- Recombinant DNA
PCR
- Primers
- Taq polymerase
Restriction enzymes
Electrophoresis
Restriction Fragment Length Polymorphism (RFLP)
Randomly Amplified Polymorphic DNA (RAPD)
DNA sequencing
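The idea behind RFLP can be illustrated with an in-silico digest: a point mutation that destroys a restriction site changes the set of fragment lengths seen on a gel. The sketch assumes a linear molecule and uses EcoRI (recognition site GAATTC, cutting after the first G) as the example enzyme; the function name and the two toy alleles are invented.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of a restriction site.

    `cut_offset` is the cut position within the site (1 for EcoRI, G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = 0
    while (hit := sequence.find(site, start)) != -1:
        cuts.append(hit + cut_offset)
        start = hit + 1
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Two toy alleles (hypothetical): allele B has lost the second EcoRI site
# through a single A-to-G substitution.
allele_a = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCC" + "GAGTTC" + "TTTT"
lengths_a = [len(fragment) for fragment in digest(allele_a)]
lengths_b = [len(fragment) for fragment in digest(allele_b)]
```

Allele A yields three fragments and allele B only two, which is the kind of length difference that electrophoresis makes visible.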

37 NEXT TWO WEEKS: short lecture and lab in 412 Science Hall for DNA extraction, PCR, gel electrophoresis, and restriction site analysis