Intercalated cytosine motif and novel adenine clusters in the crystal structure of the Tetrahymena telomere

Nucleic Acids Research, 1998, Vol. 26, No. 20, 4696–4705. © 1998 Oxford University Press

Li Cai, Liqing Chen, Sridharan Raghavan, Robert Ratliff¹, Robert Moyzis¹ and Alexander Rich*

Department of Biology, Room 68-233, Massachusetts Institute of Technology, Cambridge, MA 02139, USA and ¹Center for Human Genome Studies, Los Alamos National Laboratories, Los Alamos, NM 87545, USA

Received June 12, 1998; Revised and Accepted August 9, 1998. PDB accession no. 294D

ABSTRACT

The cytosine-rich strand of the Tetrahymena telomere consists of multiple repeats of the sequence d(AACCCC). We have solved the crystal structure of the crystalline repeat sequence at 2.5 Å resolution. The adenines form two different and previously unknown clusters (A clusters) in orthogonal directions with their counterparts from other strands, each containing a total of eight adenines. The clusters appear to be stable aggregates held together by base stacking and three different base-pairing modes. Two different types of cytosine tetraplexes are found in the crystal. Each four-stranded complex is composed of two intercalated parallel-stranded duplexes pointing in opposite directions, with hemiprotonated cytosine-cytosine (C·C⁺) base pairs. The outermost C·C⁺ base pairs are from the 5′-end of each strand in one cytosine tetraplex and from the 3′-end of each strand in the other. The A clusters and the cytosine tetraplexes form two alternating stacking patterns, creating continuous base stacking in two perpendicular directions along the x- and z-axes. The adenine clusters could be organizational motifs for macromolecular RNA.

INTRODUCTION

Telomere DNA, located at chromosome ends and containing many repeating sequences, plays a vital role in chromosomal stability (1,2). It is important in both the normal control of cell proliferation and the abnormal growth of cancer (3).
The first telomere DNA was isolated from the ciliate Tetrahymena thermophila in the early 1970s (4). Its G-rich strand contains repeats of a short sequence, d(GGGGTT), and its complementary C-rich strand contains repeated d(AACCCC). Both of these repeating segments can exist as four-stranded molecules as well as in DNA duplex form.

It has long been known that polymers containing cytosine can form three hydrogen bonds with another cytosine if they are hemiprotonated (5–8). More recent NMR experiments on d(TC₅) and related sequences yielded an unusual structural motif: an intercalated tetraplex (I motif), in which the same C·C⁺ pairings were seen in two parallel-stranded duplexes intercalated into each other in an antiparallel fashion (9). The first crystal structure of a C-rich sequence, d(C₄), confirmed the novel I motif and revealed more detailed structural information (10). Subsequently, several additional crystal studies of sequences with cytosine stretches have also revealed the I motif and showed structural variation among different sequences (11–13).

In these sequences, the bases attached to the cytosine tetraplex have shown a great degree of structural variability. In the metazoan telomeric sequence d(TAACCC), a stabilized loop was formed by TAA. However, in the Tetrahymena telomeric sequence, d(AACCCC), the structure displays a novel structural motif: the adenine cluster (A cluster). The adenines located at the 5′-end of each strand form two different types of A clusters, with three stacking base pairs in one and four stacking base pairs in the other. Three different base pairing modes are involved. The stacked A·A base pairs in each A cluster also stack upon the two different types of cytosine tetraplexes in orthogonal directions to form alternating A cluster–C tetraplex base stacking continuously along the x- and z-axes.
These features have some similarities with another recently solved structure, d(AACCC) (L.Chen, L.Cai, Q.Gao and A.Rich, in preparation). There are two cytosine tetraplexes in an asymmetric unit; however, there are significant differences in their geometries.

*To whom correspondence should be addressed. Tel: +1 617 253 4715; Fax: +1 617 253 8699; Email: cbeckman@mit.edu

MATERIALS AND METHODS

The oligodeoxyribonucleotide d(AACCCC) was synthesized on an Applied Biosystems DNA synthesizer. It was then purified by HPLC with a linear gradient of 5–40% acetonitrile in 0.1 M triethylammonium acetate buffer, pH 7.0. Crystals were grown at room temperature by vapor diffusion using the sitting drop method from solutions containing 2.0 mM d(AACCCC) and 100 mM sodium cacodylate buffer adjusted to various pH values and equilibrated with a reservoir of 70% ammonium sulfate. The best crystal, measuring 0.3 × 0.2 × 0.1 mm, was obtained with buffer at pH 7.5. The crystal diffracted to 2.5 Å resolution. It crystallizes in space group P22₁2₁ with cell dimensions a = 35.93, b = 52.33, c = 76.94 Å.

All diffraction data were collected on a Rigaku R-AXIS II imaging plate system at 4°C and processed with the PROCESS program provided by the Molecular Structure Corporation. The data set was collected to 2.5 Å resolution, with 64 frames at a crystal-to-plate distance of 120 mm using 4° oscillations. There were 4628 independent reflections above the 1σ(I) level from 20 to 2.5 Å. Seventy-five percent of the reflections were observed in the resolution shell between 2.75 and 2.5 Å. Overall completeness from 20 to 2.5 Å is 86.5%. See Table 1 for a summary of crystal data and data collection statistics.

Table 1. Crystallographic data

Crystal data for d(AACCCC)
  Space group                          P22₁2₁
  Unit cell                            a = 35.93 Å, b = 52.33 Å, c = 76.94 Å
  Strands per unit cell                32
  Strands per asymmetric unit          8

Summary of data collection statistics
  Resolution                           20–2.5 Å
  Number of observations               33 551
  Number of unique reflections         4628
  Overall completeness                 86.5%
  Outermost shell                      2.75–2.5 Å
  Outermost shell completeness         75%
  R-merge                              6%

Refinement statistics
  Resolution                           10–2.5 Å
  Number of reflections                4628
  Number of non-hydrogen DNA atoms     836
  Number of water molecules            61
  RMS bond length                      0.016 Å
  RMS bond angle                       3.7°
  R-factor                             0.21
  Free R-factor                        0.29

Several I motif crystal structures have been solved using molecular replacement techniques (11–13). This structure was also solved by that method using XPLOR (14). The starting model used the I motifs from the crystal structure of d(AACCC), which was solved by the single isomorphous replacement and single anomalous scattering method, as the crystal soaked with HgCl₂ was isomorphous to the native crystal (L.Chen, L.Cai, A.Gao and A.Rich, in preparation). Rotation and translation searches with that model at various resolution ranges of the d(AACCCC) diffraction data always led to the same orientation of the molecule in the lattice. This clearly showed that the asymmetric unit contained eight independent strands of d(AACCCC), enough to form two independent cytosine tetraplexes. The position of the molecule showed that the helical axis of one tetraplex was parallel to the x-axis and the helical axis of the other was parallel to the z-axis. This stacking pattern is in agreement with the native Patterson map of the molecule.
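
The crystal data in Table 1 can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes an orthorhombic cell (all angles 90°, so the volume is a·b·c) and four asymmetric units per unit cell, which is what the 32 strands per cell and 8 strands per asymmetric unit in Table 1 imply. The variable names are illustrative.

```python
# Cross-check of Table 1 values (numbers copied from the table).
a, b, c = 35.93, 52.33, 76.94  # unit cell edges in angstroms


def orthorhombic_volume(a, b, c):
    """Volume of an orthorhombic cell; all angles are 90 degrees, so V = a*b*c."""
    return a * b * c


# Assumption: four asymmetric units per cell, consistent with 32 / 8 in Table 1.
ASYMMETRIC_UNITS_PER_CELL = 4
STRANDS_PER_ASYMMETRIC_UNIT = 8

volume = orthorhombic_volume(a, b, c)                      # cubic angstroms
strands_per_cell = ASYMMETRIC_UNITS_PER_CELL * STRANDS_PER_ASYMMETRIC_UNIT
volume_per_strand = volume / strands_per_cell              # cubic angstroms per strand

# Average number of measurements per unique reflection (data redundancy).
observations = 33551
unique_reflections = 4628
redundancy = observations / unique_reflections

print(f"cell volume       : {volume:.0f} A^3")
print(f"strands per cell  : {strands_per_cell}")
print(f"volume per strand : {volume_per_strand:.0f} A^3")
print(f"redundancy        : {redundancy:.2f}")
```

The strand count reproduces the 32 strands per unit cell listed in Table 1, and the redundancy shows that each unique reflection was measured roughly seven times on average.
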
After several cycles of rigid body refinement using 10–2.5 Å data, the difference map allowed us to identify the missing adenines and the extra cytosines. We then carried out simulated annealing refinement, leading to an R-factor of 25.2%. Twenty cycles of restrained individual isotropic B-factor refinement followed. Well-ordered water molecules were then located from the difference Fourier map (Fo − Fc) and added as oxygen atoms to the model only if they had a peak height of >3σ in the difference density map. A total of 61 water molecules were found in this way. A final round of refinement completed the structural determination with an R-factor of 0.213 and root mean square (RMS) deviations from ideal bond lengths and angles of 0.016 Å and 3.744°, respectively. The free R-factor (15) based on a random subset of 10% of the reflections is 29%. The refinement statistics are listed in Table 1. The coordinates have been deposited in the Brookhaven Protein Data Bank (accession no. 294D).

RESULTS

Two different cytosine tetraplexes

The oligonucleotide d(AACCCC) crystallizes in the orthorhombic space group P22₁2₁. There are eight strands in the asymmetric unit, enough to form two cytosine tetraplexes. Figure 1a and b shows tetraplexes 1 and 2, respectively, together with the adenines. The center of each figure shows the four cytosines from four different chains organized into an intercalation motif. In Figure 1a the cytosine bases stack along the x-axis and in Figure 1b they stack along the z-axis. There is an average stacking distance of 3.2 Å between adjacent cytosines from different strands. The stacking distance of 3.2 Å is in agreement with those of previously solved C tetraplex structures and it occurs when stacking is limited to the exocyclic amino and carbonyl groups and does not involve the pyrimidine rings.
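
For readers unfamiliar with the refinement statistics quoted in Materials and Methods, the sketch below illustrates the conventional crystallographic residual, R = Σ||Fo| − |Fc|| / Σ|Fo|, and the idea of a free R-factor computed on a random subset of reflections set aside from refinement. It is an illustration under stated assumptions, not the X-PLOR procedure used here: the amplitudes are invented toy values, the function names are hypothetical, and the observed and calculated amplitudes are assumed to be on the same scale already.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional residual: sum(||Fo| - |Fc||) / sum(|Fo|)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.10, seed=0):
    """Randomly flag a fraction of reflection indices as the free (test) set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


# Toy structure-factor amplitudes, invented for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 90.0, 70.0, 50.0, 30.0, 10.0]
f_calc = [90.0, 85.0, 50.0, 45.0, 25.0, 80.0, 75.0, 40.0, 35.0, 15.0]

work, free = split_free_set(len(f_obs), free_fraction=0.10)
r_all = r_factor(f_obs, f_calc)
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])

print(f"R (all reflections): {r_all:.3f}")
print(f"R (working set)    : {r_work:.3f}")
print(f"R (free set)       : {r_free:.3f}")
```

In a real refinement the free set is excluded from the target function throughout, so a free R-factor noticeably higher than the working R-factor (here 0.29 versus 0.21) is expected.
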

Figure 1. Structure of cytosine tetraplexes. (a) Cytosine tetraplex 1 of structure d(AACCCC). (Left) A schematic diagram illustrating the overall configuration of tetraplex 1. The two strands that are parallel and form hydrogen bonds between their cytosine bases are colored black, while the other two are colored white. (Right) View into the major groove of tetraplex 1. The major groove is wide and open. The center of the molecule is composed of intercalating cytosine residues held together by C·C⁺ base pairs. Note that there are two adenine residues at the 5′-end of each strand and that they project away from the center of the molecule. The outermost C·C⁺ base pairs of the tetraplex are from the 3′-end of each strand. (b) Cytosine tetraplex 2 of structure d(AACCCC). (Left) A schematic diagram illustrating the overall configuration of tetraplex 2. The two strands that are parallel and form hydrogen bonds between cytosine bases are colored black, while the other two are colored white. Residues with asterisks represent symmetry-related residues (equivalently, we could have chosen the asymmetric unit in such a way that four strands in the asymmetric unit would form tetraplex 2). (Right) View into the major groove of tetraplex 2. The intercalating motif here is very similar to that of tetraplex 1. However, the outermost C·C⁺ base pairs of the tetraplex are from the 5′-end of each strand.

The adenines from each strand project out at the top and bottom, with the planes of the adenine bases nearly perpendicular to those containing the cytosine bases (with the exception of A52 and A71). Thus the adenines in Figure 1a are perpendicular to the z-axis and the adenines in Figure 1b are perpendicular to the x-axis. Careful inspection clearly shows that the configurations of the two cytosine tetraplexes differ in a subtle way. Tetraplex 1 (Fig. 1a), which is oriented along the x-axis, has the outermost C·C⁺ layers coming from the 3′-end of the strands. However, tetraplex 2 (Fig. 1b), which is oriented along the z-axis, has the outermost C·C⁺ layers coming from the 5′-end of the strands. Thus there is a significant variation between the two C tetraplexes.

In each tetraplex, the interaction of two parallel duplexes yields a quadruplex with two wide and two narrow grooves which, as in d(C₄), are largely symmetrical about the helical axis. The narrow groove is made up of two closely packed strands in antiparallel orientation. The two backbone chains fit into each other remarkably well in a zig-zag fashion. They are so close to each other that some interchain P–P distances are even shorter than intrachain ones. In tetraplex 1, the average intrachain P–P distance is 6.33 Å. The average interchain P–P distance across the minor groove is 6.36 Å, with the shortest being 5.62 Å. The average interchain P–P distance across the minor groove for tetraplex 2 is comparable at 6.81 Å. The minor groove is so narrow that there is little room left to trap anything. Indeed, we find no water molecules inside the minor groove. In contrast, the major grooves are very wide. The average interchain P–P distances across the major grooves of tetramers 1 and 2 are 16.09 and 15.19 Å, respectively. This symmetric feature of two broad grooves is very different from that seen in the metazoan telomeric structure d(TAACCC) (12), where one broad groove is very flat and the phosphate groups in the other broad groove are rotated away from the center and bend over towards each other, stabilized by the bridging water molecules between phosphate oxygens and cytosine N4 groups. Both major grooves in d(AACCCC) are very flat. Figure 2 shows the flat nature of the broad grooves with two stacked C·C⁺ base pairs from tetraplex 2 of d(AACCCC), together with two water molecules that are within 3.3 Å of the base pairs.
Figure 2 also shows the two very wide, flat grooves. The heavy hydration with bridging water molecules between phosphate oxygens and cytosine N4 groups seen in other structures (11–13) is clearly absent here. In both tetraplexes 1 and 2, the molecules twist slowly in a right-handed manner. The average twist in both tetraplexes is 16.6°, with a standard deviation of 3.4°. Thus, one cytosine base pair is on average twisted 16.6° relative to its covalent neighbor. This is somewhat larger than the twist value in d(C₄), which is 12.4° (10).

Figure 2. Two adjacent layers of C·C⁺ base pairs from tetraplex 2 along with two water molecules that are within 3.5 Å of the base pairs. The view is down the axis of the molecule, which is the z-axis. Unlike structures such as d(AACCC) and d(TAACCC), the broad grooves of this structure are essentially flat and the phosphates are not bent over. There is also no water molecule bridging the cytosine N4 amino group with the phosphate oxygens on the opposite side of the groove. The absence of this feature shows the variability of cytosine tetraplexes.

Two adenine clusters

A novel feature of this structure is the presence of two groupings containing only adenine residues. They provide the interactions which hold the lattice together. The adenine bases, as shown in Figure 1, project away from the direction of the cytosine tetraplex. They form base pairs with adenine residues of neighboring strands (some of them are symmetry-related), creating two different kinds of adenine clusters in two orthogonal directions.

Figure 3. Adenine clusters of d(AACCCC). (a) A schematic diagram of adenine cluster 1 illustrating the formation of A cluster 1 and its relation to the cytosine residues of the strands. There are two parallel backbone A·A base pairs, A20*-A30* and A21*-A31*. The other two A·A base pairs, A2*-A11 and A1*-A12, have antiparallel backbones. Every cytosine portion of the four strands combines with three other symmetry-related cytosine strands (not shown) to form tetraplex 1. Thus, there are four cytosine tetraplexes 1 connected by A cluster 1. (b) Skeletal view of A cluster 1 connecting four cytosine strands which belong to four different cytosine tetraplexes. It consists of four stacking A·A base pairs. It stacks on two cytosine tetraplexes 2 (not shown), at both the top and bottom, forming a continuous stacking along the z-axis. (c) A schematic diagram of A cluster 2. Note there are only three stacking base pairs. The other two bases stack on each other, tilted 38° from the other three base pairs. Like A cluster 1, A cluster 2 connects four cytosine tetraplexes 2. Of the three A·A base pairs, A61-A72 and A41-A51 are parallel while A42-A62 is antiparallel. (d) Skeletal view of A cluster 2 connecting four cytosine strands which belong to four different cytosine tetraplexes. It has three stacking A·A base pairs shown at the top. It stacks on two cytosine tetraplexes 1 (not shown), at both the top and bottom, forming continuous stacking along the x-axis. At the lower right, two stacking bases A52 and A71 are shown.

Figure 3a is a schematic diagram illustrating the origin of the eight adenines in A cluster 1, which has bases perpendicular to the z-axis. There are a total of four base pairs in the cluster, shown in skeletal models in Figure 3b, stacking on top of each other with an average stacking distance of 3.5 Å. The increase in stacking distance from 3.2 Å in C·C⁺ base pair stacking to 3.5 Å when A·A base pairs are involved is due to the involvement of aromatic rings in stacking. When stacking interactions involve the aromatic rings, such as is the case in B-DNA, the stacking distance is generally 3.4–3.5 Å. These four base pairs are in turn sandwiched between two symmetry-related cytosine tetraplexes running along the z-axis, from the top to the bottom of the unit cell.
The adenine and cytosine bases effectively show continuous stacking along the z-axis.

Another adenine cluster, A cluster 2, also made up of eight adenine residues, has most of the bases perpendicular to the x-axis. As shown in the schematic diagram of Figure 3c, there are three A·A base pairs, stacking along the x-axis. The other two adenine bases, A52 and A71, loosely stack upon each other and are tilted 38° from the three paired adenines in the cluster. Figure 3d shows a skeletal view. In a similar way, the three stacking base pairs in this cluster are also sandwiched between two symmetry-related cytosine tetraplexes along the x-axis and this alternating C tetraplex–A cluster stacking pattern creates continuous stacking along the x-axis.

Three modes of base pairing

Close inspection of Figure 3b and d reveals that the polarities of the backbones holding the seven base pairs are not the same. In Figure 3b, base pairs A1*-A12* and A2*-A11* are antiparallel, while base pairs A20*-A30* and A21*-A31* are parallel. In Figure 3d, base pairs A51-A41 and A61-A72 are parallel, while A62-A42 is antiparallel. Among the four parallel A·A base pairs, there exist all three possible different A·A base pairing modes. Figure 4a shows base pair A20*-A30*. It is a symmetric A·A N7–amino group base pairing of the type found in poly(A) fibers (16) and in yeast tRNA^Phe (17). Figure 4b shows base pair A21*-A31*. It is a symmetric A·A N1–amino group base pairing. Base pairs A51-A41 and A61-A72 adopt another pairing mode, which is asymmetric N1–amino group, N7–amino group, as shown in Figure 4c. All the antiparallel base pairs adopt the asymmetric N1–amino group, N7–amino group base pairing mode, as illustrated in Figure 4d.

The glycosyl conformations in the structure are anti for all the cytosine residues. Out of the 16 adenine residues, eight are anti, five are syn and three have almost clinal conformations. Several modes of sugar pucker are present in the structure. For cytosines, the most frequent one is C4′-exo, in nine residues, followed by C2′-endo and C3′-endo, with six residues apiece. For adenines, the most frequent puckers are C3′-exo and C2′-endo. Further details will be published elsewhere.

Figure 4. Various A·A base pairs are shown in an electron density map plotted at 2σ. (a) Base pair A20*-A30* with parallel backbones. It is a symmetric A·A N7–amino group base pairing. (b) Base pair A21*-A31* with parallel backbones. It is a symmetric A·A N1–amino group base pairing. (c) Base pair A61-A72 with parallel backbones. It is an asymmetric A·A N1–amino group, N7–amino group base pairing. Note that A61 is in the syn conformation. (d) Base pair A42-A62 with antiparallel backbones. It is an asymmetric A·A N1–amino group, N7–amino group base pairing.

DISCUSSION

Comparison with previous results

Compared with the previously solved C-rich crystal structures, d(AACCCC) reveals many interesting features. The four-stranded, intercalated cytosine segment is an extremely stable and predominant feature of the structure. It is interesting to note that the crystals were grown over a wide range of pH, ranging from pH 5.0 to 8.0. The formation of C·C⁺ base pairs depends on hemiprotonation of the cytosines (18,6–8). In poly[d(C)], the hemi-protonated structure was stable up to pH 7 (7). The fact that crystals of d(AACCCC) can grow at pH 7.5 and 8.0 indicates that the stable nature of the tetraplex and the packing forces raised the pK for hemi-protonation to an even higher value. This reinforces the possibility that the Tetrahymena telomere could adopt the intercalation motif in vivo at physiological pH, possibly in the presence of binding proteins.
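
The argument that the pK for hemi-protonation must be raised in the crystal can be made concrete with the Henderson–Hasselbalch relation, under which the protonated fraction of a titratable site is 1 / (1 + 10^(pH − pK)). The sketch below is only an illustration of that relation: the pK of about 4.3 used for an isolated cytosine N3 is an approximate, assumed textbook-style value and is not taken from this paper, and the function name is hypothetical. A hemiprotonated C·C⁺ pair corresponds to a protonated fraction of one half.

```python
def fraction_protonated(ph, pk):
    """Henderson-Hasselbalch: fraction of sites carrying a proton at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pk))


# Assumption for illustration only: approximate pK of N3 in an isolated cytosine.
ASSUMED_FREE_CYTOSINE_PK = 4.3

# pH values spanning the range over which crystals were reported to grow.
for ph in (5.0, 7.0, 7.5, 8.0):
    fraction = fraction_protonated(ph, ASSUMED_FREE_CYTOSINE_PK)
    print(f"pH {ph:.1f}: protonated fraction with unshifted pK = {fraction:.5f}")

# Half of the cytosines are protonated exactly when pH equals the effective pK,
# so hemiprotonation at pH 7.5 would require an effective pK near 7.5.
print(fraction_protonated(7.5, 7.5))
```

With the assumed unshifted pK, well under 1% of cytosines would be protonated at pH 7.5, which is why crystal growth at that pH points to a substantially raised effective pK within the tetraplex.
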
Aside from the general structural similarity in I motifs, we have found many variations. One notable difference is the presence of two different conformations of cytosine tetraplexes. In one tetraplex, as in all previously reported C tetraplex crystals, the outermost base pairs are from the 5′-end of each strand; in the other, however, the outermost base pairs are from the 3′-end of each strand. This suggests that the two conformations are comparably favorable energetically, leaving open the possibility that telomere sequences might adopt either one of the two conformations, depending on the contributions of the non-cytosine residues.

Despite the apparent similarity of all cytosine tetraplex conformations, each individual strand varies considerably from structure to structure. The average twists between covalently linked cytosines vary from 12.4° for d(C₄) to 16.6° for d(AACCCC). When the common I motif portions of the structures are superimposed, the RMS differences are quite considerable, especially among the sugar–phosphate backbones. For example, the RMS difference between tetraplex 1 in d(AACCCC) and tetraplex 1 in d(C₄) is 1.26 Å, with the cytosine bases having an RMS difference of 0.45 Å while the backbones have one of 1.55 Å. In all the structures solved, the tetraplexes show considerable differences from structure to structure and the differences are mainly due to those between the sugar–phosphate backbones. The positions of the cytosine bases are relatively stable and often almost superimposable. This might be expected, given the less flexible nature of the C·C⁺ base pairing associated with three strong planar hydrogen bonds. In contrast, the sugar–phosphate backbones are intrinsically more flexible, partly due to their lack of torsional restraints and partly due to strong electrostatic repulsion between phosphate groups in the narrow grooves.
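
The RMS differences quoted above compare equivalent atoms after the common I motif portions have been superimposed. As a minimal sketch of that final step, the function below computes the root mean square distance between two equally ordered coordinate lists. It assumes the superposition has already been done and does not itself find the optimal rotation and translation; the coordinates are invented toy values and the function name is hypothetical.

```python
import math


def rms_difference(coords_a, coords_b):
    """RMS distance between paired atoms of two already-superimposed structures."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstroms, invented for illustration only.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]

print(rms_difference(structure_a, structure_b))
```

Applying the same function separately to base atoms and to backbone atoms is how a comparison such as 0.45 Å for the bases versus 1.55 Å for the backbones would be broken down.
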
Where the backbones are close together, they may be stabilized by C–H···O hydrogen bonds as well as van der Waals interactions (19). These variable aspects of the cytosine tetraplex might be important if telomere sequences adopt different conformations under differing biological conditions.

The bridging adenine clusters

Even though there is some variability in the cytosine tetraplex among different structures, the major variation is seen in the non-cytosine part of the structure. Unlike the other telomeric sequence solved, namely the metazoan telomere d(TAACCC) (12), where the adenine/thymine segment of the structure folds back on itself to form a stable loop, the adenines in this structure adopt an entirely different conformation. In this case, the adenines adopt three different kinds of A·A base pairs and are an essential lattice building block. There are two adenine residues per strand. In cytosine tetraplex 1, which points along the x-axis, there are four stacked pairs of adenine residues. As seen in Figure 1a, these four pairs project away from the central C tetraplex and the planes of the bases are perpendicular to the z-axis. Each pair forms an A cluster along with six other symmetry-related adenine residues, stacking along the z-axis, connecting two symmetry-related cytosine tetraplexes 2. A rather interesting three-dimensional network is formed, in which the adenine clusters play a key role in assembling the complex (Fig. 5a). C tetraplex 2 and A cluster 1 form continuous stacking along the z-axis, as illustrated in Figure 5b. The original four pairs of adenine residues from tetraplex 1 are thus involved in four A clusters at different symmetry-related locations, creating four continuous columns of z-axis stacking. In a similar manner, the four adenine pairs from cytosine tetramer 2 join cytosine tetramer 1 along the x-axis, forming four continuous stacking columns along the x-axis, as seen in Figure 5c.
In contrast to the rather rigid cytosine tetraplex, the non-cytosine part of the telomeric sequences clearly shows a great deal of variability and versatility in forming different structural conformations. Sequences containing stretches of cytosines and adenines are found in telomeres (1) and also occur in segments scattered throughout the genome. They may also exist in large RNAs such as group I and group II introns and ribosomal and spliceosomal RNAs. The recent crystal structure of the P4–P6 domain of the T. thermophila intron (20,21) revealed adenosine platforms in which two adjacent adenine residues contribute to key components of the domain tertiary structure. The crystal structure of the Tetrahymena telomeric sequence d(AACCCC) shows two different novel adenine clusters that play a key role in building the crystal lattice and stabilizing the structure. The abundance of adenosine residues in internal loops of many RNAs and the ability of the A clusters observed in this structure to form stabilized tertiary structures suggest the possibility that A clusters, like the adenosine platforms observed in a group I intron fragment, could be a motif present in large RNAs to facilitate folding and be responsible for long range tertiary interactions.

This crystal structure shows that the telomeric sequence can adopt a very different structural conformation from standard B-DNA. Does this structural conformation occur in vivo? We do not have the answer yet. The fact that both C-rich sequences and complementary G-rich sequences can form tetraplexes (22–24) makes it possible that the two structures could act in concert or one could promote formation of the other. Such an event could play an important role in DNA self-recognition, which is essential in many biological systems (10).

Nucleic Acids Research, 1998, Vol. 26, No. 20

Figure 5. The organization of adenine clusters and cytosine tetraplexes. (a) Stereo view of the three-dimensional network formed by continuous stacking along the x- and z-axes. The box shown is the unit cell of the crystal. (b) A cluster 1 stacks on two symmetry-related cytosine tetraplexes 2, at both the top and bottom, creating continuous stacking along the z-axis. (c) A cluster 2 stacks on two symmetry-related cytosine tetraplexes 1, at both the top and bottom, creating continuous stacking along the x-axis.

ACKNOWLEDGEMENTS

This research was supported by grants from the National Institutes of Health, the National Science Foundation and the Department of Energy through Los Alamos National Laboratories.

REFERENCES

1 Blackburn,E.H. (1991) Nature, 350, 569–573.
2 Zakian,V.A. (1989) Annu. Rev. Genet., 23, 579–604.
3 Marx,J. (1994) Science, 265, 1656–1658.
4 Blackburn,E.H. (1990) Science, 249, 489–490.
5 Akinrimisi,E.O., Sander,C. and Ts'o,P.O.P. (1963) Biochemistry, 2, 340–344.

6 Langridge,R. and Rich,A. (1963) Nature, 198, 725–728.
7 Inman,R.B. (1964) J. Mol. Biol., 9, 624–637.
8 Hartman,K.A.,Jr and Rich,A. (1965) J. Am. Chem. Soc., 87, 2033–2039.
9 Gehring,K., Leroy,J.-L. and Gueron,M. (1993) Nature, 363, 561–565.
10 Chen,L., Cai,L., Zhang,X. and Rich,A. (1994) Biochemistry, 33, 13540–13546.
11 Kang,C.H., Berger,I., Lockshin,C., Moyzis,R., Ratliff,R. and Rich,A. (1994) Proc. Natl Acad. Sci. USA, 91, 11636–11640.
12 Kang,C.H., Berger,I., Lockshin,C., Ratliff,R., Moyzis,R. and Rich,A. (1995) Proc. Natl Acad. Sci. USA, 92, 3874–3878.
13 Berger,I., Kang,C.H., Fredian,A., Moyzis,R., Ratliff,R. and Rich,A. (1995) Nature Struct. Biol., 2, 416–425.
14 Brunger,A.T. (1992) X-PLOR: A System for Crystallography and NMR. Yale University, New Haven, CT.
15 Brunger,A.T. (1992) Nature, 355, 472–474.
16 Rich,A., Davies,D.R., Crick,F.H.C. and Watson,J.D. (1961) J. Mol. Biol., 3, 71–86.
17 Kim,S.H., Suddath,F.L., Quigley,G.J., McPherson,A., Kim,J.J., Sussman,J.L., Wang,A.H.-J., Seeman,N.C. and Rich,A. (1974) Science, 185, 435–439.
18 Marsh,R.E., Bierstedt,R. and Eichhorn,E. (1962) Acta Crystallogr., 15, 310.
19 Berger,I., Egli,M. and Rich,A. (1996) Proc. Natl Acad. Sci. USA, 93, 12116–12121.
20 Cate,J.H., Gooding,A.R., Podell,E., Zhou,K., Golden,B.L., Kundrot,C.E., Cech,T.R. and Doudna,J.A. (1996) Science, 273, 1678–1685.
21 Cate,J.H., Gooding,A.R., Podell,E., Zhou,K., Golden,B.L., Szewczak,A.A., Kundrot,C.E., Cech,T.R. and Doudna,J.A. (1996) Science, 273, 1696–1699.
22 Kang,C.H., Zhang,X., Ratliff,R., Moyzis,R. and Rich,A. (1992) Nature, 356, 126–131.
23 Smith,F.W. and Feigon,J. (1992) Nature, 356, 164–168.
24 Laughlan,G., Murchie,A., Norman,D., Moore,M., Moody,P., Lilley,D. and Luisi,B. (1994) Science, 265, 520–524.