The oestrogen receptor recognizes an imperfectly palindromic response element through an alternative side-chain conformation

Similar documents
Suppl. Figure 1: RCC1 sequence and sequence alignments. (a) Amino acid

Hmwk # 8 : DNA-Binding Proteins : Part II

Structural Bioinformatics (C3210) DNA and RNA Structure

1.1 Chemical structure and conformational flexibility of single-stranded DNA

Nucleic acids. How DNA works. DNA RNA Protein. DNA (deoxyribonucleic acid) RNA (ribonucleic acid) Central Dogma of Molecular Biology

Chapter 14 Regulation of Transcription

Gene and DNA structure. Dr Saeb Aliwaini

MCB 110:Biochemistry of the Central Dogma of MB. MCB 110:Biochemistry of the Central Dogma of MB

Solution Structure of the DNA-binding Domain of GAL4 from Saccharomyces cerevisiae

Molecular design principles underlying β-strand swapping. in the adhesive dimerization of cadherins

6-Foot Mini Toober Activity

Chapter 25: Regulating Eukaryotic Transcription The Ligand Responsive Activators

Conformational changes in IgE contribute to its. uniquely slow dissociation rate from receptor FcεRI

Basic concepts of molecular biology

Not The Real Exam Just for Fun! Chemistry 391 Fall 2018

Enhancers. Activators and repressors of transcription

Assembly and Characteristics of Nucleic Acid Double Helices

Packing of Secondary Structures

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012

DNA Structures. Biochemistry 201 Molecular Biology January 5, 2000 Doug Brutlag. The Structural Conformations of DNA

Chapter 8 DNA Recognition in Prokaryotes by Helix-Turn-Helix Motifs

Supplementary Note 1. Enzymatic properties of the purified Syn BVR

The Cys 2. His 2. Research Article 451

Nucleic Acids. Information specifying protein structure

Information specifying protein structure. Chapter 19 Nucleic Acids Nucleotides Are the Building Blocks of Nucleic Acids

BASIC MOLECULAR GENETIC MECHANISMS Introduction:

All Rights Reserved. U.S. Patents 6,471,520B1; 5,498,190; 5,916, North Market Street, Suite CC130A, Milwaukee, WI 53202

Non-standard base pairs Non-standard base pairs play critical roles in the varied structures observed in DNA and RNA.

Basic concepts of molecular biology

Drug DNA interaction. Modeling DNA ligand interaction of intercalating ligands

38. Inter-basepair Hydrogen Bonds in DNA

X-ray structures of fructosyl peptide oxidases revealing residues responsible for gating oxygen access in the oxidative half reaction

Algorithms in Bioinformatics ONE Transcription Translation

What Are the Chemical Structures and Functions of Nucleic Acids?

AP Biology Book Notes Chapter 3 v Nucleic acids Ø Polymers specialized for the storage transmission and use of genetic information Ø Two types DNA

Structure formation and association of biomolecules. Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München

Supplementary Information. Structural basis for duplex RNA recognition and cleavage by A.

Diversity in DNA recognition by p53 revealed by crystal structures with Hoogsteen base pairs

RNA does not adopt the classic B-DNA helix conformation when it forms a self-complementary double helix

Hmwk 6. Nucleic Acids

A. Incorrect! A sugar residue is only part of a nucleotide. Go back and review the structure of nucleotides.

G. DNA-BINDING PROTEINS & REGULATION

has only one nucleotide, U20, between G19 and A21, while trna Glu CUC has two

What is necessary for life?

Hydrogen bonding in yeast phenylalanine transfer RNA (electron density map/base stacking/base pairing/ion interactions)

Dina Al-Tamimi. Faisal Nimri. Ma amoun Ahram. 1 P a g e

DNA and RNA are both made of nucleotides. Proteins are made of amino acids. Transcription can be reversed but translation cannot.

Sequence Determinants of a Conformational Switch in a Protein

DNA & DNA : Protein Interactions BIBC 100

The Double Helix. DNA and RNA, part 2. Part A. Hint 1. The difference between purines and pyrimidines. Hint 2. Distinguish purines from pyrimidines

TALENs (Transcription Activator-Like Effector Nucleases)

Structure of nucleic acids II Biochemistry 302. January 20, 2006

DNA and RNA Structure Guided Notes

The nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA).

Supplementary Fig. 1. Initial electron density maps for the NOX-D20:mC5a complex obtained after SAD-phasing. (a) Initial experimental electron

Ch 10 Molecular Biology of the Gene

Supplementary materials for Structure of an open clamp type II topoisomerase-dna complex provides a mechanism for DNA capture and transport

Binding Studies with Mutants of Zif268

Molecular Biology (1)

DNA Glycosylase Exercise


2. True or False? The sequence of nucleotides in the human genome is 90.9% identical from one person to the next.

Nature Structural & Molecular Biology: doi: /nsmb.3428

MOLECULAR STRUCTURE OF DNA

SUPPLEMENTARY INFORMATION

CSE : Computational Issues in Molecular Biology. Lecture 19. Spring 2004

DNA and RNA Structure. Unit 7 Lesson 1

Proteins: Wide range of func2ons. Polypep2des. Amino Acid Monomers

Supplementary Information for. Structure of human tyrosylprotein sulfotransferase-2 reveals the mechanism of protein tyrosine sulfation reaction

Supplementary Fig. S1. SAMHD1c has a more potent dntpase activity than. SAMHD1c. Purified recombinant SAMHD1c and SAMHD1c proteins (with

Specificity: Induced Fit

Your Name: MID TERM ANSWER SHEET SIN: ( )

Molecular biology WID Masters of Science in Tropical and Infectious Diseases-Transcription Lecture Series RNA I. Introduction and Background:

What is necessary for life?

BETA STRAND Prof. Alejandro Hochkoeppler Department of Pharmaceutical Sciences and Biotechnology University of Bologna

Biochemistry Prof. S. Dasgupta Department of Chemistry. Indian Institute of Technology Kharagpur. Lecture - 16 Nucleic Acids - I

Protocol S1: Supporting Information

NUCLEIC ACID. Subtitle

Antibody-Antigen recognition. Structural Biology Weekend Seminar Annegret Kramer

Nucleic Acids: How Structure Conveys Information 1. What Is the Structure of DNA? 2. What Are the Levels of Structure in Nucleic Acids? 3.

Nucleic acids. What important polymer is located in the nucleus? is the instructions for making a cell's.

Learning to Use PyMOL (includes instructions for PS #2)

Solutions to 7.02 Quiz II 10/27/05

Bi 8 Lecture 7. Ellen Rothenberg 26 January Reading: Ch. 3, pp ; panel 3-1

Polymerase chain reaction

The Skap-hom Dimerization and PH Domains Comprise

Biochemistry 674, Fall, 1995: Nucleic Acids Exam II: November 16, 1995 Your Name Here: PCR A C

1) The penicillin family of antibiotics, discovered by Alexander Fleming in 1928, has the following general structure: O O

Appendix A DNA and PCR in detail DNA: A Detailed Look

Clamping down on pathogenic bacteria how to shut down a key DNA polymerase complex

DNA Repair Protein Exercise

Synergy between adjacent zinc fingers in sequence-specific DNA recognition

DNA AND CHROMOSOMES. Genetica per Scienze Naturali a.a prof S. Presciuttini

Globins. The Backbone structure of Myoglobin 2. The Heme complex in myoglobin. Lecture 10/01/2009. Role of the Globin.

Lecture 2. Crystals: Theory and Practice. Dr. Susan Yates Wednesday, February 2, 2011

Structure/function relationship in DNA-binding proteins

Comparative Modeling Part 1. Jaroslaw Pillardy Computational Biology Service Unit Cornell Theory Center

Molecular Biology (1)

Pre-Lab: Molecular Biology

SUPPLEMENTARY INFORMATION

Transcription:

The oestrogen receptor recognizes an imperfectly palindromic response element through an alternative side-chain conformation John WR Schwabet*, Lynda Chapman and Daniela Rhodes Medical Research Council, Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK Background: Structural studies of protein-dna complexes have tended to give the impression that DNA recognition requires a unique molecular interface. However, many proteins recognize DNA targets that differ from what is thought to be their ideal target sequence. The steroid hormone receptors illustrate this problem in recognition rather well, since consensus DNA targets are rare. Results: Here we describe the structure, at 2.6 A resolution, of a complex between a dimer of the DNA-binding domain from the human oestrogen receptor (ERDBD) and a non-consensus DNA target site in which there is a single base substitution in one half of the palindromic binding site. This substitution results in a 10-fold increase in the dissociation constant of the ERDBD-DNA complex. Comparison of this structure with a structure containing a consensus DNA-binding site determined previously, shows that recognition of the non-consensus sequence is achieved by the rearrangement of a lysine side chain so as to make an alternative base contact. Conclusions: This study suggests that proteins adapt to recognize different DNA sequences by rearranging side chains at the protein-dna interface so as to form alternative patterns of intermolecular contacts. Structure 15 February 1995, 3:201-213 Key words: protein-dna recognition, steroid hormone receptor, X-ray crystallography Introduction Structural studies of protein-dna complexes have greatly enhanced our understanding of how proteins can recognize specific DNA sequences [1]. However, in the majority of cases only a single structure has been determined with the protein in complex with one particular sequence (normally its ideal or consensus binding site). This has tended to give the impression that specific protein-dna interactions involve a unique stereochemistry at the protein-dna interface. In vivo, however, most DNA-binding transcription factors are able to promote or repress transcription through binding to target sites with sequences that deviate from the consensus. This raises important questions as to how proteins are able to recognize a number of related sequences. These questions may be experimentally addressed through determining the structures of several complexes in which one protein is bound to different DNA targets. By relating differences in these complexes with measurements of DNA-binding affinities we may hope to gain a thermodynamic understanding of the interactions at the protein-dna interface. In only one case, to date, have we obtained real insights into how proteins interact with non-consensus, yet specific, DNA targets. Structural analyses of the phage 434 repressor DNA-binding domain in complex with three different DNA targets have revealed surprisingly extensive differences in the protein-dna interface at the consensus and non-consensus sequences, showing a rearrangement of several amino acid side chains, a displacement of the phosphate backbone of the DNA, as well as a small global movement of the protein [2-4]. The steroid hormone receptors bind as dimers to palindromic DNA targets. In vivo, these targets rarely match the consensus exactly and consequently these proteins provide a good system for understanding how proteins can recognize more than one DNA sequence [5]. Fig. la shows a selection of biologically active DNA targets for the oestrogen receptor (known as oestrogen response elements, EREs) [6-13]. These illustrate that in most cases one half-site matches the consensus, whereas the other contains base substitutions. Indeed, in the promoters of some oestrogen-responsive genes, it is only possible to identify single isolated half-sites since the deviation from the consensus is so great. The properties of the two non-consensus response elements in the Xenopus vitellogenin gene B1 [VitBl(1) and VitBl(2), see Fig. 1] have been studied extensively [8,14,15]. In vitro, each of these elements binds the oestrogen receptor considerably more weakly than the consensus response element and in vivo, the isolated response elements only weakly activate transcription in response to oestrogen. However, two dimers of the oestrogen receptor bind cooperatively to the two response elements as arranged in the natural enhancer (Fig. lb), and activate transcription to approximately the same extent as a single VitA2 consensus response element. Previous work has provided a good understanding of how the oestrogen receptor recognizes its consensus response element. This recognition is mediated by an 84 residue DNA-binding domain (DBD) containing two zinc-binding motifs (which appear to have arisen as an *Corresponding author. t Present address: The Salk Institute for Biological Studies, P.O. Box 85800, San Diego, CA 92186-5800, USA. Current Biology Ltd ISSN 0969-2126 201

202 Structure 1995, Vol 3 No 2 evolutionary duplication of a single domain). This DBD is highly conserved across the superfamily of nuclear hormone receptors (Fig. 2a) [16,17]. In solution, when isolated from the rest of the protein, the DBD is monomeric (the whole receptor is a dimer, held together by a strong dimerization domain elsewhere in the protein [18]). However, on binding to the palindromic response element two monomers interact highly cooperatively to form a dimeric complex. NMR structural analyses of the monomeric domain from the oestrogen receptor (the ERDBD) [19], and those from several other nuclear receptors [20-22], have shown that each of the two zinc-binding motifs contains an a-helix nucleated at its N terminus through binding a zinc ion. The two helices are oriented perpendicularly to each other and cross at their mid-points with an extensive hydrophobic core between them. Whereas the whole of the first zinc-binding motif is well ordered in solution, a region within the second zinc-binding motif was found to be comparatively poorly ordered [23]. Fig. 1. (a) A selection of biologically active oestrogen response elements. The sequences are derived from the promoters of the Xenopus vitellogenin A2 gene [6], the Xenopus vitellogenin B1 gene [71, the chicken vitellogenin gene 9], the human PS2 gene [11], and the human 1121 and mouse [131 c-fos genes. The five base pair half-sites are shown in yellow arrows. Bases that differ from the consensus sequence are shown in red. Note that one half of these response elements (here shown on the right) always matches the consensus. (b) The oestrogen responsive unit from the Xenopus vitellogenin B1 gene. The right hand, proximal response element is the ERE-VitB1(1), the left hand response element is the ERE-VitBi(2), which is used in this study. In the crystal structure of an ERDBD dimer bound to a consensus response element [24] (and analogously in a structure of the complex between the glucocorticoid receptor DNA-binding domain (GRDBD) and DNA [25]), two monomers of the ERDBD bind to adjacent major grooves from one side of the DNA double helix. Each monomer makes phosphate contacts on both sides of the major groove thus positioning the helix in the first of the two zinc-binding motifs in the major groove, such that exposed side chains directly contact the DNA bases (see below). The region within the DBD that was found to be disordered in solution, forms a tight dimer interface between the two monomers in the protein-dna complex [24,25]. Some residues in this region also contact the phosphate backbone of the DNA. The formation of this dimer interface determines the strict requirement for a three base pair (bp) spacing between the half-sites and also explains the cooperative binding characteristics Fig. 2. (a) The amino acid sequence of the ERDBD shown in the classical 'zincfinger' representation. Helical regions are boxed and shaded yellow. Phosphate contacts are indicated by filled squares, base contacts by asterisks. (Note that Ala29 is also in van der Waals contact with an adenine but this interaction is not shown.) Residues making direct inter-protein contacts at the dimer interface are indicated by filled circles. Those making indirect dimer contacts mediated by ordered water molecules are indicated by open circles. (b) The consensus and variant ERE DNA targets with which the ERDBD has been crystallized. Those base pairs in the ERE-VitB1 (2) that differ from those in the consensus response element (ERE-CON) are shown coloured red. Large letters indicate the two half-sites (here shown as 6 bp so as to be comparable to the glucocorticoid response element).

Recognizing a non-consensus DNA target Schwabe, Chapman and Rhodes 203 observed in binding assays. Cooperativity between adjacent DBDs assists the receptors when binding to imperfectly palindromic response elements. This is strikingly demonstrated in the crystal structure of a complex between the related GRDBD and a target site with a 4 bp spacing between the half-sites [25]. (The GR half-site sequence differs from that of the ER in only 2 bp.) Because there is a strict requirement for a three base pair spacing, only one half of the dimer binds to a consensus sequence. The other half of the dimer (assisted by contacts with the first half) is contacting what is effectively non-specific DNA, making many fewer contacts to the DNA bases. This 'half-specific' GRDBD structure gives some insight into how receptors can interact with targets containing only one recognizable half-site sequence. More commonly, however, in natural response elements the base substitutions are more conservative and footprinting assays show strong protection at both the consensus and non-consensus halves of the response element. Here we report the characterization of the binding of the ERDBD to a non-consensus ERE (derived from the VitB 1(2) site, hereafter termed ERE-VitB 1(2); Figs 1 and 2) which has a single guanine to adenine substitution in one of the two half-site sequences that comprise the binding site. We also report the crystal structure, at 2.6 A resolution, of an ERDBD dimer bound to this site. A comparison of this structure with the consensus structure shows how the ERDBD adapts to recognize the altered half-site sequence. Furthermore, since these crystals grow in a different space group, we have been able to observe the complex in a different packing environment. Results and discussion Binding assays Differential binding affinities for the ERDBD binding to consensus and variant DNA target sites were measured using gel-retardation assays with three different target sites. These contain the following half-sites: AGGTCA (consensus ER half-site), AAGTCA [VitB1(2) variant ER half-site] and AGAACA (consensus GR half-site, as a negative control). These three half-sites were embedded in the sequence: CCAAAGTCxxxxxxCAGTGACC- TGATCAAA. Two different types of binding assay were employed (described below). These were analyzed using non-denaturing gel electrophoresis and dissociation constants were derived from the intensity of bands in these gels (measured using a Molecular Dynamics PhosphorImager). In the dilution experiments (Fig. 3a) the protein:dna ratio was constant at 2:1 and the initial DNA concentration was 10-7 M. For the consensus ERE at 10-7 M, more than 90% of the DNA is in a dimer complex and at 4x10-9 M about 50% is in the dimer complex. These ratios yield an approximate overall dissociation constant of 10-9 M for the dimer (see graph of theoretical values in the Materials and methods section). A similar analysis for the ERE-VitB1(2) complex suggests that the overall dissociation constant for the dimer is approximately 10-8M. For the ERE-GRE the smearing in the gel makes it difficult to assess the dissociation constant, but the intensity measurements suggest that -30% of the DNA is in the dimer complex at 10-7 M DNA. This suggests an overall dissociation constant in the region of 10-7 M. Although these values are inherently approximate, especially because the rapid off-rate of the protein makes the complex very sensitive to electrophoresis conditions, they give a clear picture of the relative binding affinities for these three DNA-binding sites. In conclusion, the G to A substitution results in a 10-fold increase in the overall dissociation constant. Fig. 3b shows binding experiments in which increasing amounts of protein were added to the target site at 10-8 M (note the protein:dna ratios are different for the different DNA targets). The proportions of complex seen in these titrations are consistent with the estimated dissociation constants discussed above. At this DNA concentration, the DNA in complex with a single ERDBD (the monomeric species) becomes measurable, and using the analysis detailed in the Materials and methods section it is possible to estimate the ratio between the dissociation constants of the first and second ERDBD molecules. However, the estimation of monomer concentration in Fig. 3. The ERDBD binds more weakly to non-consensus response elements. Gel-retardation analyses of the ERDBD binding to the consensus (ERE-CON), variant [ERE-VitB (2)] and half- ERE/half-GRE (ERE-GRE) DNA targets. (a) Dilution experiments carried out at a constant protein:dna ratio of 2:1, with increasing DNA concentrations of 4x10-9 M, 2x10-8 M and x10-7 M (from left to right). (b) Binding titrations carried out at a DNA concentration of 1xl 0-8 M and increasing concentrations of protein (protein:dna ratios are indicated above each lane). The analyses were performed using 6-18% gradient non-denaturing polyacrylamide gels.

204 Structure 1995, Vol 3 No 2 the gels can only be approximate because of the smearing which results from the high off-rate. Our best estimate, based on these experiments and others, is that with the consensus ERE the dissociation constant for the second DBD is between 10 and 30 times lower than that of the first. On the ERE-VitB1(2) target, the dissociation constant for the second protein is between two and four times lower than the first. On the ERE-GRE target site, the dissociation constants for the first and second proteins are approximately equal. These experiments bring into focus the importance of the cooperative interaction between monomers in assisting the receptors to bind to non-consensus response elements. Whilst it is difficult to obtain precise DNA-binding constants, these gel-retardation assays set the scene for the interpretation of the structural work described below. Crystallization strategy and structure determination In many of the protein-dna complexes crystallized to date, the DNA forms pseudo-continuous fibres within the crystal lattice. Following this observation, it has become common practice to design oligonucleotides for crystallization such that they can pack in this way [25-28]. We followed this strategy previously when preparing the crystals of the ERDBD bound to the consensus response element [24]. The sequence of the 18 bp consensus DNA target was symmetrical (with the exception of the central base pair) such that it could fit in the crystal lattice in either orientation, making identical Hoogsteen base-pairing interactions with the two neighbouring oligonucleotides (Fig. 2b). This symmetry of sequence was reflected in a symmetrical structure with essentially identical protein:dna contacts made at each half-site. One of the consequences of employing the non-consensus VitB1(2) DNA target is that this introduces asymmetry into the complex. In order to restrict the oligonucleotide to a single orientation in the crystal lattice the ends of the DNA were designed to be asymmetrical (Fig. 2b) such that the oligonucleotide can only interact with its two neighbouring oligonucleotides in one orientation, with head-to-tail packing. Crystals of the ERDBD-ERE-VitB1(2) complex were obtained under similar conditions to those which yielded crystals of the consensus complex in an orthorhombic space group. However, the new crystals of the non-consensus complex have a monoclinic space group (C2). The structure of the ERDBD bound to the ERE-VitB 1(2) site in the monoclinic crystal form was solved to 2.6 A resolution by molecular replacement using dimer A (see below) from the consensus structure. (For full details, see Materials and methods.) In both the orthorhombic and monoclinic crystal forms [consensus ERE versus ERE-VitB1(2)] there are two complexes in the asymmetric unit (termed consensus and variant dimers A and B). Fig. 4 shows that the end-to-end packing of the DNA is rather similar in the two crystal forms, with pseudo-continuous DNA fibres along the two-fold screw axes in the unit cell (successive molecules in each fibre are crystallographically related). The side-by-side DNA packing, and protein packing, in the monoclinic crystal form is somewhat looser than that in the orthorhombic form. Thus, the monoclinic crystals Fig. 4. Orthogonal views of the crystal packing in the orthorhombic and monoclinic crystals. All atoms in the DNA are coloured magenta. Ca atoms in the protein are coloured yellow. The unit cell boundaries are shown in red. (a) The P2,2 1 2, cell viewed down the c axis. (b) The C2 cell viewed down the b axis. (c) The P2 1 2 1 2 1 cell viewed down the a axis. (d) The C2 cell viewed down the c axis.

Recognizing a non-consensus DNA target Schwabe, Chapman and Rhodes 205 are less dense (3.1 A 3 Da- 1 versus 2.1 A 3 Da - ') with solvent channels perpendicular to the DNA fibres (Fig. 4d). The refinement of the structure is described in the Materials and methods section. At each stage of the refinement process efforts were made to assign the directionality of the oligonucleotide, so that the appropriate asymmetry could be introduced into the model (which was based on the symmetrical consensus complex). At no stage could the directionality be unambiguously assigned. Refinement with asymmetric models for the two molecules in the asymmetric unit, in each possible arrangement, yielded no favoured orientation. Consequently, we have to conclude that despite the designed asymmetry of the oligonucleotide, the orientation of the DNA is to a large extent averaged in both complexes in the asymmetric unit. Indeed, simulated annealing omit maps calculated in X-PLOR [29], omitting the central base pair, revealed clearly averaged density for this base pair (shown in Fig. 5 for one of the oligonucleotides). This is the most diagnostic base pair to examine, since the different orientations switch the positions of the purine and pyrimidine bases. The geometry of the Hoogsteen base pairs (at the ends of the binding site) is good, implying that the averaging does not result from the oligonucleotides within one fibre packing in random orientations. Rather, it seems that symmetry-related pseudo-continuous fibres can pack in either orientation with respect to each other. It is important to note that the evidence for averaging in these crystals derives from looking at regions of the structure that one expects to be chemically distinct. There is no indication that there is a general averaging of two globally distinct structures (indeed one might expect such crystals to diffract very poorly). This provides strong evidence that the structures of both the protein and DNA (and their interaction) at the consensus and variant Fig. 5. Electron density for the central base pair of the DNA-binding site indicates that it is an average between an adeninethymine and a thymine-adenine base pair. The map shown in blue is a 2F 0 -F c simulated annealing omit map 129] (omitting the central base pair), contoured at 1.2(. Note that the density encloses a similar volume for each base. half-sites are extremely similar. Indeed, where they differ we can clearly see electron density for alternative sidechain conformations (discussed below). It is clear that the base substitution does not result in the type of loose complex seen in the 'half-specific' GRDBD structure [25] described above. In the discussion presented below the reader should appreciate that each of the four half-site complexes (there are two dimer complexes in the asymmetric unit) is an average structure of the consensus and variant half-site complexes. This does not mean that the two halves of each dimer complex need be the same, since each is in a unique environment in terms of crystal packing. Thus differences between the two halves of the averaged complexes arise from crystal packing. Differences resulting from the two alternative half-site sequences appear in all four half-site complexes as averaged electron density. Overall structure The structures of the ERDBD bound to the consensus ERE [24] and to the ERE-VitB1(2) are very similar, both in overall structure and in many of the details. Fig. 6a shows a superposition of the two consensus dimers and the two variant dimers of the ERDBD bound to DNA. Some of the details of the protein:dna interface are shown in Fig. 6b. The eight half-site complexes (from two crystal structures, each containing two dimer complexes, each with two half-site complexes) are superimposed and shown with the 2Fo-F c electron density for one complex from the variant crystals. Even at this level of detail, the structures of all the half-site complexes are extremely similar (the small differences are discussed below). Most of the contacts with the phosphate backbone are present in all of the structures such that the disposition of the protein with respect to the DNA is the same in all complexes. We have analyzed the DNA in both dimers in both the consensus and variant structures, using the program CURVES [30]. It is clear that where there are local distortions in the DNA (e.g. large propeller twist, buckle, roll etc.), these are present in all the structures. Interestingly near the ends of the DNA, which are not in contact with protein, there are some adjustments in the monoclinic crystals that result in a slight (1-2 A) reduction in the length of the DNA (see Fig. 6a). It is significant that the ends of the DNA can adjust to accommodate the different crystal packing (possibly due to freezing), yet the DNA structure in contact with the protein remains the same. This suggests that the protein requires a specific structure of the DNA at its recognition sites and that this structure is not perturbed by crystal packing. 'Recognition' of a non-ideal half-site sequence It is easiest to understand the consequences of the base substitution in the non-ideal half-site by first reviewing the contacts seen in the consensus structure [24]. In this structure, the pattern of protein-dna contacts is the same (except for two residues contacting the phosphate backbone) for all four proteins in the asymmetric unit of

206 Structure 1995, Vol 3 No 2 Fig. 6. Superposition of the consensus and variant structures of the ERDBD. Red indicates dimer A in the consensus structure; magenta indicates dimer B in the consensus structure; purple indicates dimer A in the variant structure and green indicates dimer B in the variant structure. This colour code is maintained in many of the following figures. (a) The four dimer complexes are shown superimposed on the Cao atoms of the two recognition helices in dimer A from the consensus structure (RMSDs for the superpositions: 0.299 A, 0.345 A and 0.445 A). The structure is oriented such that the viewer is looking down the recognition helices lying in adjacent major grooves, with the dimer interface between monomers above the minor groove of the DNA. (b) Eight half-site complexes are shown superimposed on Coa atoms 7-35 and 59-72 from protein 2 of dimer A in the variant structure (RMSDs 0.315 A, 0.292 A, 0.223 A, 0.239 A, 0.249 A, 0.310 A and 0.353 A). The 2FO-Fc electron density map for protein 2 in variant dimer A is shown contoured at 0.4 e A- 3. The view is looking down the axis of the recognition helix from the C terminus with DNA bases at the bottom and the second zinc site near the dimer interface at the top left. Four of the conserved water molecules (w) at the protein-dna interface are labelled. the crystal (Fig. 7). Four amino acids make direct hydrogen-bonding contacts with the base pairs; Glu25, Lys28, Lys32 and Arg33 (which also makes a phosphate contact). In addition, Ala29 is in van der Waals contact with an adenine base and there are numerous contacts to the phosphate backbone of the DNA, mainly from residues conserved across the nuclear receptor super family. Eight ordered water molecules are also present at the interface between each ERDBD monomer and DNA, making hydrogen bonds to both protein and DNA. The role of Glu25 seems remarkable in that the two carboxylic oxygens are buttressed by other amino acids. One of the buttressing residues is Serl5, which also donates a hydrogen bond to Hisl8, which in turn donates a hydrogen bond to a phosphate. The other is Lys28 which, in addition to making a salt bridge to Glu25, also donates a hydrogen bond to the 06 of a guanine (Figs 7 and 8a). It is this guanine that is substituted by an adenine in the ERE-VitBi(2) response element. As it was clear from the consensus structure that the position of Lys28 would result in a steric clash with the N6 amine of the adenine, it was predicted that in the ERE- VitB1(2) complex the position of this residue would necessarily be perturbed. In order to obtain an unbiased view of the role of Lys28 in each protein in the variant structure, simulated annealing maps were calculated in X-PLOR omitting these lysine residues (Figs 8b-e). In all four ERDBD molecules, the density observed for Lys28 can best be interpreted as arising from an average of two orientations for Lys28 (see Fig. 8). This is consistent with the interpretation of the averaging discussed above. One position allows Lys28 to hydrogen bond to the 06 of the guanine and form a salt bridge with Glu25 (see Fig. 8a). As this position was observed for all four proteins-in the consensus structure, it is presumably correlated with the consensus half-site. The second position is a new orientation and is presumably correlated with the nonideal half-site. In this position, the important salt bridge to Glu25 and the hydrogen bond to the guanine 06 (in this case adenine NH 2 ) are broken. However, in this new orientation, Lys28 is positioned so as to make hydrogen bonds to both the protein and the DNA. It donates a hydrogen bond to the hydroxyl group of Tyrl9 and to the N7 of the substituted adenine. However, these hydrogen bonds have less than optimal geometry. Thus it appears that the ability of Lys28 to adopt this alternative side chain arrangement is part of the design of the protein. Interestingly, a crystal structure of the GRDBD, mutated to resemble the ERDBD, in complex with a consensus ERE, shows one half of the dimer has Lys28 with the side chain in a similar 'alternative' conformation [31]. Of the other residues making base contacts, Glu25 and Arg33 have the same conformation as in the consensus structure. Lys32 however, shows some slight variability in its contacts, but is always close to two hydrogen bond accepting groups (the 04 of a thymine and the N7 of a

Recognizing a non-consensus DNA target Schwabe, Chapman and Rhodes 207 dimer A in the variant structure, the conformation of these residues is similar to that in the two dimers in the consensus structure. However, in dimer B, although the alanine contact is maintained, the two proline rings are splayed apart and a water molecule (with an exceptionally low temperature factor) is trapped in the hydrophobic pocket between the two rings below the two alanines. The deformation does not perturb the hydrogen bond between the backbone amide nitrogen of Tyr5O in one monomer and the carbonyl oxygen of Pro44 in the other. Fig. 7. Schematic view of the protein-dna contacts in the consensus and variant complexes. The DNA is shown as an opened-out helix with the base pairs in the major groove labelled. Bases that are in contact with the protein are shown in cyan and the side chains contacting them are indicated in yellow. Phosphates in contact with protein are shown in red, the corresponding amino acids are shown without highlight. Ordered water molecules at the protein-dna interface are shown as purple circles. Arrows indicate the direction of hydrogen bonds. The interpretation of the two alternative roles for the side chain of Lys28 are highlighted in green (consensus half-site) and pink (variant half-site). guanine). At two of the half-sites it appears to interact with both accepting groups, as it did in the consensus structure. At the third half-site it seems to have moved very slightly so that it can interact only with the thymine. At the fourth half-site there is no electron density beyond the Cy atom suggesting that it is mobile and switches between the different acceptors. Finally, the eight water molecules seen to bridge the protein-dna interface in the consensus complex, are essentially unperturbed by the change in the ERE-VitBl(2) sequence. Dimer interface The consensus structure showed that the dimer interface comprises two parts. The first of these is a loop between the zinc ligands, Cys43 and Cys49. This is exposed at the top of the structure, away from the DNA. Pro44 and Ala45 in this loop in one monomer make van der Waals contacts with the same residues in the other monomer (Fig. 9a). These contacts are the same in both dimers of the consensus structure, although in dimer B the temperature factors for these residues are rather high (data were collected at room temperature). In contrast, in both dimers of the variant structure the temperature factors for these residues are extremely low (even taking into account that the crystals were frozen; see below). In The second part of the dimer interface involves residues between the zinc ligands Cys49 and Cys59. The conformation of this region of the protein differed in the two dimers in the consensus structure (Fig. 9b). In both protein monomers in dimer A, residues from Asn54-Cys59 are folded to form a small helical segment and conserved residues Arg56 and Lys57 make several phosphate contacts. A number of ordered water molecules bridge the residues Ser58 and Met42 in the two monomers. In both protein monomers in dimer B (in the consensus structure), the small helix is absent and residues Cys49-Cys59 form an extended loop. As this loop is involved in crystal packing contacts in only one of the monomers, it was thought that the structures of the small helical regions in both monomers were interdependent and induced on DNA binding (note this region of the protein was disordered in the NMR structure of the monomer). Dimer A in the variant structure is similar to dimer A in the consensus structure - both proteins have the small helix. One protein in dimer B in the variant structure, has an extended loop analogous to the proteins in dimer B of the consensus structure. In the other protein in dimer B in the variant structure, there are clearly two or more alternative conformations for residues in this region. The electron density was not readily interpretable, but a mixture of the two conformations seen in the other dimers probably exists. It seems remarkable that in two very different packing environments, two distinct conformations should be present for the residues Asn54-Ser58. This is even more remarkable when one considers the importance of the role of Arg56, which not only contacts two phosphate groups, but also hydrogen bonds to one of the cysteine ligands in the first zinc-binding site, thus linking the structural core of the protein to the phosphate backbone. When the helix is not formed, this residue makes none of these contacts. This suggests that this region only forms a helical structure upon DNA binding, and that crystallization has trapped intermediates in this folding process. C-terminal tail We have found from binding analyses that the 10 C-terminal residues of the ERDBD, shown to be flexible in the NMR structure, play an important role in DNA binding, since their removal dramatically increases the

208 Structure 1995, Vol 3 No 2 Fig. 8. Recognition of the guanine to adenine substitution is achieved through a rearrangement of a lysine side chain. (a) The interaction of Lys28 with DNA in the consensus structure. The 2Fo-F electron density map is contoured at 0.4 e A-3. Hydrogen bonds are indicated by dotted lines. (b-e) Interaction of Lys28 with DNA in the four ERDBD molecules in the asymmetric unit of the variant structure. 2Fo-FC simulated annealing omit maps [29] (omitting Lys28) are contoured at 1.2a. Although adenines or guanines are shown at position 4, we believe this base is in fact an average of the two (see text). (f) Interpretation of the maps shown in parts (b-e). The two alternative conformations for the side chain of Lys28 when interacting at the variant and consensus halfsites (averaged in the map) are modelled. The base shown is a guanine, as found in the consensus half-site. The 06 group on the guanine is substituted by an NH 2 group in adenine. The hydrogen bonds coloured white can be formed at both half-sites, those in green at the consensus half-site and those in magenta at the variant half-site. dissociation constant [23]. However, to our surprise, in the crystal structure of the consensus complex [24], these residues were apparently disordered and made no contact with the DNA. One possible explanation for this disorder was that the tight side-by-side packing of the DNA fibres perturbed the structure of this region. However, the packing in new crystals is considerably looser and yet in all four proteins in the asymmetric unit there is no interpretable electron density for these 10 C-terminal residues. It therefore seems most likely that these residues contribute to the binding affinity by increasing the 'onrate', through sme electrostatic influence (the 'tail' includes several charged residues), rather than by decreasing the 'off-rate', through forming specific interactions with the DNA. Temperature factors The observation that temperature factors for residues at the dimer interface differed markedly in the consensus and variant structures (data collected at room temperature and 95 K respectively), prompted us to analyze the temperature factors in more detail. Fig. 10 shows temperature factors for protein and DNA in the four complexes (two from the consensus structure and two from the variant structure). As one might have expected, molecules in the frozen crystals have generally lower temperature factors. The temperature factors for phosphorus atoms in the DNA (Fig. 10a, top panel) follow a very similar pattern in all eight DNA strands. Although the protein makes phosphate contacts on both sides of the major groove, it is striking that the three phosphates

Recognizing a non-consensus DNA target Schwabe, Chapman and Rhodes 209 This contains the two regions of the dimer interface (blue block and bold-outlined yellow block). Freezing results in a dramatic lowering of temperature factors with the general exception (although not for one of the frozen proteins) of the lower part of the dimer interface, where there is a helix in some of the proteins. This supports the notion that this region is less stably structured than other parts of the protein. Conclusions At first sight the averaging in these crystals would appear to pose a problem in the interpretation of the structure. In fact, quite the opposite is true. Had the crystal not been averaged we would have analyzed all the small differences between the two halves and puzzled over which were the result of the altered DNA sequence. However, the fact that consensus and variant half-site complexes are so similar that they can be averaged, gives us a very strong indication that most features of the consensus and variant half-site complexes are identical. Fig. 9. Structural heterogeneity at the dimer interface. The colour scheme follows that described in Fig. 6. The structures were superimposed as described in Fig. 6a. (a) The top portion of the dimer interface showing the interactions of Pro44 and Ala45. The van der Waals distances between the interacting groups are listed. Note that for dimer B in the variant structure the prolines are splayed aart and a water molecule is positioned between the two rings. (b) The smoothed C traces for residues from Cys43 to Cys59 are shown. This view shows the two conformations of the backbone for residues Asn54 to Ser58 (see text for discussion). contacted by the side of the protein near the dimer interface have by far the lowest temperature factors. As expected, the temperature factors of the bases (bottom panel Fig. 10a) are generally lower than those of the phosphate backbone (note the temperature factor of the C2 atom is taken as being representative of the whole base). The pattern of temperatures factors is more or less conserved in all eight strands of DNA. It is interesting that unlike the temperature factors of the phosphate backbone, the bases with the lowest temperature factors are not those in most intimate contact with the protein. Indeed, the base with the lowest temperature factor is one of the three in the spacer between the two half-sites. This would seem to suggest that these bases are physically constrained and it is likely that the identity of these constrained bases may indirectly contribute to specificity. The temperature factors of the protein are shown in Fig. 10b. In the N-terminal zinc-binding motif, including the recognition helix, the temperature factors of all eight proteins are generally similar - freezing makes rather little difference. Furthermore, the linker region Ile35-Tyr41 is generally poorly ordered in all the structures. The second zinc-binding motif is more interesting. The primary objective of this study was to gain a structural understanding of the change in binding affinity for the variant binding site. In other words, does the structure of the protein-dna interface at the variant half-site account for the reduced DNA-binding affinity observed in the gel-retardation assays (Fig. 3)? There are essentially three possible causes of the reduced DNA-binding affinity: the first and simplest of these is a loss of interaction energy between protein and DNA through there being fewer, or less favourable, intermolecular hydrogen bonds, or electrostatic interactions. Although the new position occupied by Lys28 still allows it to make an equivalent number of hydrogen bonds, it is quite likely that these are less favourable than the buttressing salt bridge to Glu25 and the hydrogen bond to the guanine 06 at the consensus half-site. The second possible cause of the reduced DNA-binding affinity is a poorer complementarity of shape at the interface, leading to fewer waters being excluded, and a correspondingly reduced entropic contribution from this water displacement. Indeed, it is becoming accepted that a large proportion of the force driving the formation of protein-dna complexes is derived from the entropic gain in releasing water molecules from the surface of the isolated protein and DNA molecules on complex formation. This is clearly one of the causes of the reduced binding affinity of the GRDBD for the four-spaced target site, but does not seem important for the ERDBD in this case, because the nature of the complex is so similar and the same number of water molecules are present at the protein-dna interface. The third possible cause of reduced binding affinity is a 'hidden' enthalpic term. We have suggested (see above) that the protein seems to prefer a particular DNA conformation. If the change in half-site sequence were to

210 Structure 1995, Vol 3 No 2 Fig. 10. The refined temperature factors for the complexes in the frozen and room temperature crystals. The colour scheme follows that described in Fig. 6: red indicates dimer A in the consensus structure; magenta indicates dimer B in the consensus structure; purple indicates dimer A in the variant structure and green indicates dimer B in the variant structure. (a) Temperature factors for the phosphate backbone (top) and bases (bottom) in the DNA targets. The base sequence of the DNA is shown with the half-sites highlighted. Note that there are eight DNA strands (two strands in each dimer complex and two complexes in each of the two structures). (b) Temperature factors for the Ca atoms in the ERDBD in the different structures. The sequence is shown below the figure. Zinc ligands are indicated by dots; helices by yellow blocks; the top part of dimer interface by a blue block and the helix in the lower part of dimer interface by a yellow block with a bold outline. Note that there are eight ERDBD molecules (two in each dimer complex and two complexes in each of the two structures). increase the energy cost in adopting this favoured structure, this would detract from the binding affinity. In conclusion, the most likely cause of the reduced DNA-binding affinity appears to be a reduced interaction energy resulting from the loss of a salt bridge and the formation of hydrogen bonds with less than optimal geometry. In addition, there may be an energy cost in achieving the required DNA structure. Biological implications The regulation of gene expression requires that transcription factors are able to recognize specific DNA-binding sites in the promoters of the genes that they regulate. Traditionally, this protein-dna recognition has been viewed as being highly specific, yet most transcription factors are able to recognize and promote transcription from a number of DNA targets that deviate from the consensus sequence. So how does a transcription factor recognize target sites that have different sequences? Understanding how this is achieved at a molecular level is an important part of understanding protein-dna recognition as a whole. However, to date, most structural work has focused on proteins in complex with a single DNA sequence, often the ideal or consensus sequence. The present study addresses this issue by comparing the structure of the oestrogen receptor DNA-binding domain (ERDBD) in complex with two different naturally occurring DNA targets. We have also measured the

Recognizing a non-consensus DNA target Schwabe, Chapman and Rhodes 211 DNA-binding affinity for these different DNA targets. By relating changes in affinity to structural changes at the protein-dna interface, we hope to begin to understand the thermodynamics of the interactions at protein-dna interfaces. The oestrogen receptor is one member of a large family of transcription factors involved in a wide range of cellular processes [16]. It regulates transcription of its target genes by binding as a dimer to a palindromic binding site consisting of two half-sites separated by three base pairs [6]. It is striking that, in vivo, many response elements are composed of one half-site matching the consensus and one half-site containing one or more base substitutions. We have previously determined the structure of an ERDBD dimer bound to a consensus response element [24]. This structure showed how recognition is achieved through a network of hydrogen bonds between protein, DNA and ordered water molecules at the protein-dna interface. In the structure described here the ERDBD is bound to a non-consensus response element from the Xenopus vitellogenin gene. This response element has a single base substitution (guanine to adenine) in one of its two half-sites. Comparison of the complexes containing consensus and non-consensus DNA shows that the protein accommodates this base change through the rearrangement of a lysine side chain (Lys28). When bound to a consensus half-site, this residue donates a hydrogen bond to the 06 of the guanine. On the variant half-site, Lys28 cannot accept a hydrogen bond from the equivalent NH 2 group of the adenine, and thus it moves so as to donate a hydrogen bond to the N7 of the adenine. This rearrangement leads to the loss of a salt bridge to a glutamic acid (Glu25) that also makes a base contact. All the other features of the protein-dna interface are essentially unperturbed. However, our binding studies show that these small changes result in a significant loss of DNA-binding affinity. It would appear that some natural targets have evolved to bind protein with less than maximal affinity and as such are presumably optimally suited to their biological function. Materials and methods DNA-binding assays Oligonucleotides used for the analysis of DNA-binding activity were synthesized and purified as described previously [23]. To visualize the DNA in the binding experiments the oligonucleotides were 5' end-labelled using [y 32 P]ATP and T4 polynucleotide kinase. Gel retardation assays were performed using 6-18% gradient acrylamide mini-gels (30% acrylamide: 0.8% bis-acrylamide) with 0.5XTB (45 mm Tris Borate ph 8.4) gel electrophoresis buffer. The binding buffer contained 20 mm MES ph 6.0, 2 mm MgC1 2, 3% glycerol and 0.05% NP40. The gels were visualized and the intensity of bands quantitated using a Molecular Dynamics Phosphorlmager. Overall dimer binding constants (ignoring the presence of the monomer species) were estimated by comparing the observed percentage complex with the theoretical plot in Fig. 11 - assuming a 1:1 ratio of dimer:dna. The ratio of the dissociation constants for the first and second protein monomers to bind to the response element was estimated assuming the following equilibria: KP P+D K3 PDon COD + P PD + P K2 P2D K4 We can assume from the overall dissociation constants that: K3>>KI therefore [PDvar]<[PDcon] and thus K - [PD] 2 K2 - [P2D][DI This analysis has the advantage that it involves just the ratio of quantities that can be directly measured from the gels and is independent of the absolute concentrations of protein and DNA. It also gives a clear measure of the cooperativity of binding, i.e. how much more strongly the protein binds to the second site compared with the first. Preparation of crystals Protein and oligonucleotides for crystallization were purified and mixed to prepare complex in the same manner as previously described [23,24]. Crystals grow over a range of conditions. The single crystal used in this study was grown at 20 C U [DNA]/Kd 00 Fig. 11. Plot of the theoretical relationship between the percentage of DNA in the form of a complex versus the ratio between DNA concentration and dissociation constant, assuming a 1:1 protein:dna ratio in the complex and first-order binding kinetics.

212 Structure 1995, Vol 3 No 2 using a direct addition method in a volume of 20 p,1 in a sealed tube. The crystallization mix contained -70 p.m complex, 20 mm MES at ph 5.75, 1.8 mm spermine, 2 M ZnC1 2, 30 mm NaCl, 12 mm CaC12 and 10% 2-methyl-2,4-pentanediol (MPD). Crystals grown in this way are monoclinic (space group C2 with a=121.67 A, b=113.09 A, c=62.36 A, 3=117.45 ) and when frozen (95 K) diffract to beyond 3 A. If the crystals are not frozen the diffraction falls back to 5 A after only 30 min exposure to X-rays from a CuKax source. The crystal used to collect the data described here diffracted anisotropically to 2.5 A. Data collection Prior to data collection the crystal was stabilized by transferring to an artificial mother liquor containing a higher concentration of MPD (28%). The crystal was then mounted in a nylon loop and frozen in a stream of nitrogen gas at 95 K (Oxford Cryostream apparatus). 205 of data were collected in 1 images with a CuKoa source and a 30 cm MAR image plate detector. The data were processed using the program IPX- MOSFLM [32]. As a consequence of the relatively high mosaic spread, 0.90 (itself a consequence of freezing), there were no fully recorded reflections. The data were scaled and merged using the CCP4 suite of programs (SCALA, AGROVATA and TRUNCATE) [33]. The final dataset of 85446 reflections (22960 unique, multiplicity 3.7) is 99.5% complete to 2.6 A with an overall Rmerge of 7.0%. Molecular replacement solution and structure refinement The key to the molecular replacement solution was that the molecular packing could be readily solved by inspection. DNA arcs in the diffraction pattern at -3.4 A indicated that the DNA is oriented with its axis parallel to the b axis of the crystal. Since this axis (113.09 A) is approximately twice the expected length of the 17 bp oligonucleotide (2x17x3.4=115.6 A), it seemed likely that the DNA is arranged in pseudo-continuous helices oriented down the b axis of the crystal. The unit cell volume is consistent with two or three complexes in the asymmetric unit (3.1 A 3 Da-' or 2.1 A 3 Da-'). A very strong peak in the self-patterson function (38.5% of the origin peak) suggested that there were in fact two complexes in the asymmetric unit related by the translation (a=0, b=0.143, c=0.5). Given that the translation in b is not close to 0.5 it was clear that successive molecules in the pseudo-continuous DNA helix must be crystallographically related and hence approximately positioned on the two-fold screw axes. Thus for the molecular replacement solution the only unknown parameter was the rotation of the complex about the axis of the DNA. This was found using a one-dimensional direct rotation search in X-PLOR [29] using dimer A in the orthorhombic crystals as the search molecule [24]. An unambiguous solution was found. The orientation was then refined in three dimensions in X- PLOR. Two complexes in this orientation were approximately positioned on adjacent screw axes of the unit cell with a relative translation in the b axis according to the self-patterson function. This packing is compared with the packing in the ERDBD/ERE-CON crystals in Fig. 4. A local translation search in x and z directions was used to refine the position of the two complexes. Finally, a rigid-body refinement, using data from 12.5 A to 3.2 A and with the four protein molecules and four DNA half-sites grouped separately, reduced the R-factor from 50.1% to 36.3%. Several rounds of rebuilding, positional refinement, simulated annealing and restrained B-factor refinement, followed by the addition of 134 water molecules, reduced the R-factor from 41.2% to 22.2% (7 A to 2.6 A). Note that sugar puckers were unrestrained during the refinement. The rms deviations from ideal bond lengths/angles are 0.007 A/1.193 for the protein and 0.017 A/3.548 for the DNA. Atomic coordinates for the ERDBD-VitB1(2) complex structure will be deposited with the Brookhaven Protein databank. Acknowledgements: We thank Phil Evans, John Finch, Nobutoshi Ito, Aaron Klug and Jade Li for helpful discussions. We are also grateful to Louise Fairall for assistance with the preparation of figures. References 1. Pabo, C. & Sauer, R. (1992). Transcription factors: structural families and principles of DNA recognition. Annu. Rev. Biochem. 61, 1053-1095. 2. Aggarwal, A.K., Rodgers, D.W., Drottar, M., Ptashne, M. & Harrison, S.C. (1988). Recognition of a DNA operator by the repressor of phage 434: a view at high resolution. Science 242, 899-907. 3. Shimon, L.J.W. & Harrison, S.C. (1993). The phage 434 OR2/R1-69 complex at 2.5 A resolution. J. Mol. Biol. 232, 826-838. 4. Rodgers, D.W. & Harrison, S.C. (1993). The complex between the phage 434 repressor DNA-binding domain and operator site OR3: structural differences between consensus and non-consensus halfsites. Structure 1, 227-240. 5. Martinez, E. & Wahli, W. (1991). Characterization of hormone response elements. In Nuclear Hormone Receptors: Molecular Mechanisms, Cellular Functions, Clinical Abnormalities. (Parker, M.G., ed), pp. 125-153, Academic Press, London. 6. Klein-Hitpag, L., Schorpp, M., Wagner, U. & Ryffel, G.U. (1986). An estrogen-responsive element derived from the 5' flanking region of the Xenopus vitellogenin A2 gene functions in transfected human cells. Cell 46, 1053-1061. 7. Martinez, E., Givel, F. & Wahli, W. (1987). The oestrogen-responsive element as an inducible enhancer: DNA sequence requirements and conversion to a glucocorticoid-responsive element. EMBO J. 6, 3719-3727. 8. Klein-Hitpag, L., Ryffel, G.U., Heitlinger, E. & Cato, A.C.B. (1988). A 13 bp palindrome is a functional estrogen responsive element and interacts specifically with estrogen receptor. Nucleic Acids Res. 16, 647-663. 9. Burch, J.B.E., Evans, M.I., Friedman, T.M. & O'Malley, P.J. (1988). Two functional oestrogen response elements are located upstream of the major chicken vitellogenin gene. Mol. Cell. Biol. 8, 1123-1131. 10. Klein-Hitpag, L., Tsai, S.Y., Greene, G.L., Clark, J.H., Tsai, M.J. & O'Malley, B.W. (1989). Specific binding of estrogen receptor to the estrogen response element. Mol. Cell. Biol. 9, 43-49. 11. Berry, M., Nunez, A.-M. & Chambon, P. (1989). Oestrogen-responsive element of the human PS2 gene is an imperfectly palindromic sequence. Proc. Natl. Acad. Sci. USA 86, 1218-1222. 12. Weisz, A. & Rosales, R. (1990). Identification of an estrogen response element upstream of the human c-fos gene that binds the estrogen receptor and the AP-1 transcription factor. Nucleic Acids Res. 18, 5097-5106. 13. Hyder, S.M., Cram, L.F. & Loose-Mitchell, D.S. (1991). Sequence of a 1.4-kb region in the 3'-flanking region of the murine c-fos protooncogene which contains an estrogen-response element. Gene 105, 281-282. 14. Klein-Hitpag, L., Kaling, M. & Ryffel, G.U. (1988). Synergism of closely adjacent oestrogen-responsive elements increases their regulatory potential. J. Mol. Biol. 201, 537-544. 15. Martinez, E. & Wahli, W. (1989). Cooperative binding of estrogen receptor to imperfect ERE's correlates with their synergistic hormone-dependent enhancer activity. EMBOJ. 8, 3781-3791. 16. Evans, R.M. (1988). The steroid and thyroid hormone receptor super family. Science 240, 889-895. 17. Luisi, B.F., Schwabe, J.W.R. & Freedman, L. (1994). The steroid/nuclear receptors: from three-dimensional structure to complex function. In Steroids. (Litwack, G., ed), pp. 1-47, Academic Press, London. 18. Kumar, V. & Chambon, P. (1988). The estrogen receptor binds tightly to its responsive element as a ligand-induced homodimer. Cell 55, 145-156. 19. Schwabe, J.W.R., Neuhaus, D. & Rhodes, D. (1990). Solution structure of the DNA-binding domain of the oestrogen receptor. Nature 348, 458-461.

Recognizing a non-consensus DNA target Schwabe, Chapman and Rhodes 213 20. Hard, T., et al., & Kaptein, R. (1990). Solution structure of the glucocorticoid receptor DNA-binding domain. Science 249, 157-160. 21. Knegtel, R.M.A., et al., & Kaptein, R. (1993). The solution structure of the human retinoic acid receptor-s DNA-binding domain. J. Biomol. NMR 3, 1-17. 22. Lee, M.S., Kliewer, S.A., Provencal, J., Wright, P. & Evans, R.M. (1993). Structure of the retinoid X receptor a DNA binding domain: a helix required for homodimeric DNA binding. Science 260, 1117-1121. 23. Schwabe, J.W.R., Chapman, L., Finch, J.T., Rhodes, D. & Neuhaus, D. (1993). DNA recognition by the oestrogen receptor: from solution to the crystal. Structure 1, 187-204. 24. Schwabe, J.W.R., Chapman, L., Finch, J.T. & Rhodes, D. (1993). The crystal structure of the estrogen receptor DNA-binding domain bound to DNA: how receptors discriminate between their response elements. Cell 75, 567-578. 25. Luisi, B.F., Xu, W.X., Otwinowski, Z., Freedman, L.P., Yamamoto, K.R. & Sigler, P.B. (1991). Crystallographic analysis of the interaction of the glucocorticoid receptor with DNA. Nature 352, 497-505. 26. Schultz, S.C., Shields, G.C. & Steitz, T.A. (1991). Crystal structure of a CAP-DNA complex: the DNA is bent by 90. Science 253, 1001-1007. 27. Somers, W.S. & Phillips, S.E.V. (1992). Crystal structure of the met repressor-operator complex at 2.8 A resolution reveals DNA recognition by 1-strands. Nature 359, 387-393. 28. Fairall, L., Schwabe, J.W.R., Chapman, L., Finch, J.T. & Rhodes, D. (1993). The crystal structure of a two zinc-finger peptide reveals an extension to the rules for zinc-finger/dna recognition. Nature 366, 483-487. 29. Bringer, A.T. (1992). X-PLOR Manual Version 3.0. Yale University, New Haven, CT. 30. Lavery, R. & Sklenar, H. (1989). Defining the structure of irregular nucleic acids: conventions and principles. J. Biomol. Struct. Dynam. 6, 655-667. 31. Xu, W., Allroy, I., Freedman, L.F. & Sigler, P.B. (1993). Stereochemistry of specific steroid receptor-dna interactions. Cold Spring Harbor Symp. Quant. Biol. 58, 133-139. 32. Leslie, A.G.W. (1992). Recent changes to the MOSFLM package for processing film and image plate data. CCP4 and ESF-EACMB Newsletter on Protein Crystallography, 26. 33. Collaborative Computational Project Number 4 (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760-763. Received: 8 Dec 1994; revisions requested: 30 Dec 1994; revisions received: 13 an 1995. Accepted: 19 an 1995.