SUPPLEMENTARY DATA. DNAproDB: an interactive tool for structural analysis of DNA-protein complexes

Similar documents
Nucleic acids. How DNA works. DNA RNA Protein. DNA (deoxyribonucleic acid) RNA (ribonucleic acid) Central Dogma of Molecular Biology

Structural Bioinformatics (C3210) DNA and RNA Structure

Hmwk # 8 : DNA-Binding Proteins : Part II

Molecular Biology (1)

DNA & DNA : Protein Interactions BIBC 100

RNA does not adopt the classic B-DNA helix conformation when it forms a self-complementary double helix

BCH222 - Greek Key β Barrels

Chapter 8 DNA Recognition in Prokaryotes by Helix-Turn-Helix Motifs

CSE : Computational Issues in Molecular Biology. Lecture 19. Spring 2004

Canonical B-DNA CGCGTTGACAACTGCAGAATC GC AT CG TA AT GC TA TA CG AT 20 Å. Minor Groove 34 Å. Major Groove 3.4 Å. Strands are antiparallel

Components of DNA. Components of DNA. Aim: What is the structure of DNA? February 15, DNA_Structure_2011.notebook. Do Now.

Diversity in DNA recognition by p53 revealed by crystal structures with Hoogsteen base pairs

Structure/function relationship in DNA-binding proteins

MCB 110:Biochemistry of the Central Dogma of MB. MCB 110:Biochemistry of the Central Dogma of MB

CMPS 6630: Introduction to Computational Biology and Bioinformatics. Secondary Structure Prediction

Nucleic Acid Structure. Nucleic Acid Sequence Abbreviations. Sequence Abbreviations, con t.

All Rights Reserved. U.S. Patents 6,471,520B1; 5,498,190; 5,916, North Market Street, Suite CC130A, Milwaukee, WI 53202

BETA STRAND Prof. Alejandro Hochkoeppler Department of Pharmaceutical Sciences and Biotechnology University of Bologna

Chapter 5: Nucleic Acids, etc.

Sequence Analysis '17 -- lecture Secondary structure 3. Sequence similarity and homology 2. Secondary structure prediction

Protein Folding Problem I400: Introduction to Bioinformatics

STRUCTURAL BIOLOGY. α/β structures Closed barrels Open twisted sheets Horseshoe folds

Name 10 Molecular Biology of the Gene Test Date Study Guide You must know: The structure of DNA. The major steps to replication.

Globins. The Backbone structure of Myoglobin 2. The Heme complex in myoglobin. Lecture 10/01/2009. Role of the Globin.

Chapter Fundamental Molecular Genetic Mechanisms

Packing of Secondary Structures

Understanding DNA Structure

Specificity: Induced Fit

Bi 8 Lecture 7. Ellen Rothenberg 26 January Reading: Ch. 3, pp ; panel 3-1

BIOLOGICAL SCIENCE. Lecture Presentation by Cindy S. Malone, PhD, California State University Northridge. FIFTH EDITION Freeman Quillin Allison

RNA Part I: Chemical Structure of RNA

Paper : 03 Structure and Function of Biomolecules II Module: 02 Nucleosides, Nucleotides and type of Nucleic Acids

Molecular Genetics Quiz #1 SBI4U K T/I A C TOTAL

Supplementary Figure 1 Two distinct conformational states of the HNH domain in crystal structures. a

Differential Gene Expression

PrimePCR Assay Validation Report

BIRKBECK COLLEGE (University of London)

Part IV => DNA and RNA. 4.1 Nucleotide Properties 4.1a Nucleotide Nomenclature 4.1b DNA Sequencing

What Are the Chemical Structures and Functions of Nucleic Acids?

Proteins Higher Order Structures

The Double Helix. DNA and RNA, part 2. Part A. Hint 1. The difference between purines and pyrimidines. Hint 2. Distinguish purines from pyrimidines

Nucleotides: structure and functions. Prof. Dalė Vieželienė Biochemistry department Room No

Nucleic Acids, Proteins, and Enzymes

Protein Synthesis Notes

Answers to Module 1. An obligate aerobe is an organism that has an absolute requirement of oxygen for growth.

Super Models. Nucleotides Molecular Model Kit Copyright 2015 Ryler Enterprises, Inc. Recommended for ages 10 - adult

Multiple choice questions (numbers in brackets indicate the number of correct answers)

38. Inter-basepair Hydrogen Bonds in DNA

Nature Structural & Molecular Biology: doi: /nsmb.3175

Protocol S1: Supporting Information

MOLECULAR STRUCTURE OF DNA

NUCLEIC ACIDS Genetic material of all known organisms DNA: deoxyribonucleic acid RNA: ribonucleic acid (e.g., some viruses)

Chapter 12 Packet DNA 1. What did Griffith conclude from his experiment? 2. Describe the process of transformation.

Retrieving and Viewing Protein Structures from the Protein Data Base

Final exam. Please write your name on the exam and keep an ID card ready.

Structure formation and association of biomolecules. Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München

Gene Expression - Transcription

Syllabus for GUTS Lecture on DNA and Nucleotides

Lesson Overview DNA Replication

The Genetic Code and Transcription. Chapter 12 Honors Genetics Ms. Susan Chabot

Supplementary Fig. S1. SAMHD1c has a more potent dntpase activity than. SAMHD1c. Purified recombinant SAMHD1c and SAMHD1c proteins (with

Ch Molecular Biology of the Gene

DNA AND CHROMOSOMES. Genetica per Scienze Naturali a.a prof S. Presciuttini

Table S1. Crystallographic Data and Refinement Statistics, Related to Experimental Procedures hsddb1-drddb2 CPD

Chapter 1 Structure of Nucleic Acids DNA The structure of part of a DNA double helix

Your Name: MID TERM ANSWER SHEET SIN: ( )

Algorithms in Bioinformatics

Fig Ch 17: From Gene to Protein

Structural bioinformatics

RNA synthesis/transcription I Biochemistry 302. February 6, 2004 Bob Kelm

Nucleic Acids and the RNA World. Pages Chapter 4

has only one nucleotide, U20, between G19 and A21, while trna Glu CUC has two

C. Incorrect! Threonine is an amino acid, not a nucleotide base.

5. Structure and Replication of DNA

Study Guide for Chapter 12 Exam DNA, RNA, & Protein Synthesis

TERTIARY MOTIF INTERACTIONS ON RNA STRUCTURE

MBMB451A Section1 Fall 2008 KEY These questions may have more than one correct answer

Prot-SSP: A Tool for Amino Acid Pairing Pattern Analysis in Secondary Structures

Chapter 25: Regulating Eukaryotic Transcription The Ligand Responsive Activators

Lecture 2: Central Dogma of Molecular Biology & Intro to Programming

Chapter 13 Section 2: DNA Replication

Problem Set 8. Answer Key

Genes - DNA - Chromosome. Chutima Talabnin Ph.D. School of Biochemistry,Institute of Science, Suranaree University of Technology

DNA Transcription. Visualizing Transcription. The Transcription Process

Introduction to Proteins

Regulation of gene expression. (Lehninger pg )

Protein Structure Prediction. christian studer , EPFL

RNA is a single strand molecule composed of subunits called nucleotides joined by phosphodiester bonds.

Protein-Protein Interactions I

KEY CONCEPT DNA was identified as the genetic material through a series of experiments. Found live S with R bacteria and injected

Unit Description: The unit on DNA replication will include the following activities:

Molecular Biology I. The Chemical Nature of DNA. Dr. Obaidur rahman

6-Foot Mini Toober Activity

RNA Structure Prediction and Comparison. RNA Biology Background

Solution Structure of the DNA-binding Domain of GAL4 from Saccharomyces cerevisiae

Spring 2006 Biochemistry 302 Exam 1

Lectures 9 and 10: Random Walks and the Structure of Macromolecules (contd.)

Transcription in Eukaryotes

CHEM 4420 Exam I Spring 2013 Page 1 of 6

Supplementary Table 1: List of CH3 domain interface residues in the first chain (A) and

Transcription:

SUPPLEMENTARY DATA DNAproDB: an interactive tool for structural analysis of DNA-protein complexes Jared M. Sagendorf 1, Helen M. Berman 2, *, and Remo Rohs 1, * 1 Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA 2 RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA *To whom correspondence should be addressed. Tel: +1 213 740 0552; Fax: +1 821 4257; Email: rohs@usc.edu (R.R.) or Tel: +1 848 445 4667; Email: berman@rcsb.rutgers.edu (H.M.B.) SUPPLEMENTARY METHODS FEATURE EXTRACTION Protein secondary structure DNAproDB assigns a three-state secondary structure to each residue in the DNA-protein interface using the program DSSP (1). DSSP assigns an eight-state secondary structure; H α helix, G 3 10 helix, I π helix, E extended strand, B isolated β-bridge, S bend, T hydrogen bonded turn, and blank - loop/irregular. To map from eight-state to three-state secondary structure, H, G and I are assigned as helix (H); E and B are assigned as strand (S); and S, T, and blank are assigned as loop (L). Individual strands are treated independently without regard to the formation of β-sheets. Buried solvent-accessible surface area The buried solvent-accessible surface area (BASA) is the difference between the solventaccessible surface area (SASA) (2) of some unit in the structure when it is in the free state and when it is in the bound state. For example, for every residue in the protein, SASA is calculated with all DNA removed from the structure, which is defined as the free state of the protein. All residues with a free SASA (SASA F ) > 0 constitute the protein surface, and may potentially 1

interact with the DNA. SASA values are re-calculated with the DNA present to determine the complex SASA (SASA C ). The BASA of each residue is defined as BASA = SASA F SASA C, which will always be greater than or equal to zero. Residues with BASA > 0 are considered to be in contact with the DNA, and the BASA value describes the extent of the contact. The same calculation is performed for each nucleotide, with the free state of the DNA corresponding to the structure with all protein residues removed. Different contributions of the BASA values are determined. For each residue, it is possible to specify how much of the total BASA is due to contact with the DNA major groove, minor groove, or backbone. To calculate these contributions, instead of removing the entire DNA structure for the free-state calculation, only part of the DNA is removed. To calculate the contribution of the DNA major groove to the BASA value of each residue, only the DNA major groove atoms are removed from the structure in the free state. Therefore, BASA wg = SASA F (wg) SASA C, where BASA wg is the surface area in Å 2 of lost SASA for each residue due to contact with the DNA major groove (wg for wide groove ); and SASA F (wg) is the SASA of each residue with only DNA major groove atoms removed. In this way, BASA wg (major groove; wg), BASA sg (minor groove; sg for small groove ) and BASA bb (backbone; bb for backbone ) contributions are determined for each residue. Similarly, for each DNA nucleotide, the BASA contributions due to helices, strands, and loops are determined. This analysis gives a more detailed description of the interface in terms of how much different parts of the structure are contacting each other. However, the sum of each contribution of the BASA may be less than the total BASA (but never greater) because calculating the BASA contributions in this way excludes overlapping surface areas. BASA values are also determined for each residue-nucleotide pair. Here, the entire structure is removed except for the pair, and the free and complex states are calculated as described. Individual contributions of the BASA are not determined for residue-nucleotide pairs. 2

SUPPLEMENTARY TABLE PDB ID 5CM DI Modified Nucleotides CHEMICAL NAME PDB ID MSE SEP Modified Residues CHEMICAL NAME SELENOMETHIONINE PHOSPHOSERINE 5-METHYL-2 -DEOXY-CYTIDINE-5-2 -DEOXYINOSINE-5-6OG 6-O-METHYL GUANOSINE-5 - TPO PHOSPHOTHREONINE DOC 2,3 -DIDEOXYCYTIDINE-5 - PTR O-PHOSPHOTYROSINE 8OG 8-OXO-2 -DEOXY-GUANOSINE-5 - S,S-(2- CME HYDROXYETHYL)THIOCYSTEINE DDG 2,3 -DIDEOXY-GUANOSINE-5 - OCS CYSTEINESULFONIC ACID 2DA 2,3 -DIDEOXYADENOSINE-5 - ALY N(6)-ACETYLLYSINE 2DT 3 -DEOXYTHYMIDINE-5-2PR 2-AMINO-9-[2-DEOXYRIBOFURANOSYL]-9H- PURINE-5 - MRG N2-(3-MERCAPTOPROPYL)-2 - DEOXYGUANOSINE-5 - MERCAPTOPROPYL)-2 - DEOXYGUANOSINE-5 - CTG (5R,6S)-5,6-DIHYDRO-5,6- DIHYDROXYTHYMIDINE-5-1CC 5-CARBOXY-2 -DEOXYCYTIDINE Supplementary Table 1. Currently supported chemically modified nucleotides and residues. Any non-standard residues or nucleotide not in this table will be removed from the structure during processing. 3

SUPPLEMENTARY FIGURES Supplementary Figure S1. Illustration of the axial coordinate system used to specify the position of each SSE in the DNA-protein interface. Let P be a point in space that indicates the position of an SSE, C be the curve defining the DNA helical axis, and P be a point on C where a perpendicular line can be drawn from P to C. The axial coordinates that define P are (φ, ρ, s). ρ is the distance from P to P ; s is the distance along the curve from a specified point on the curve to P (here, chosen as the end of the helix axis, corresponding to the 5 end of the first DNA strand in the structure); and φ is the angle made in a plane defined by the tangent vector of the curve at P between the line joining P and P which lies in this plane, and the projection of a line joining P and a fixed point in space to the plane. This fixed point is arbitrary; changing the fixed point simply amounts to adding some constant offset to φ relative to the original choice. In practice, the center of mass of the protein is chosen, which allows similar structures to be compared consistently. If this center of mass lies too near the DNA helix axis (which is the case for structures with perfect two-fold symmetry), the point is chosen to lie along a principal axis of the protein. (A) The curve in the figure represents the DNA helix axis (C), which, in this case, is not linear. The red circle is the position of an SSE (P). The length of the line joining P and P is the coordinate ρ. The distance along C from the 5 end to P is the coordinate s. The green plane contains the line joining P and P, and is defined by the tangent vector of C at P. (B) The angle between the projection of the line that joins P and the fixed point to the plane, and the line that joins P and P is the coordinate φ. 4

Supplementary Figure S2. Polar contact maps of selected structures meeting the search criteria described in Table 1. The complex with PDB ID 1JFI (3) contains a ternary complex of a TATA-binding protein (TBP) with Negative Cofactor 2 (NC2) bound to a TATA-box. The polar contact map shows a characteristic TBP interaction, with a series of strands in the minor groove (mg) in addition to two helices that contact the minor groove belonging to an NC2 domain. The complex with PDB ID 2GKD (4) shows the bacterial resistance protein Calicheamicin gene Cluster (CalC), which binds with a single helix in the minor groove and few other contacts. The complexes with PDB IDs 1J46 (5) and 3U2B (6) contain proteins that predominantly bind with two helices and several loop contacts in the minor groove. 5

Supplementary Figure S3. Different visualizations for a TATA-box binding DNA-protein complex (PDB ID 1VTO) (7). Green triangles represent strand SSEs; and blue squares represent loops. (A) Polar contact map for this structure. The DNA helix axis is curved, but the visualization shows the strands as contacting on a single side of the DNA (i.e. confined to one half of the figure). This visualization is intuitive when compared with the 3D structure in (B). (B) 3D representation of the structure, with strand and loop residues that contact the minor groove colored according to SSE type in green and blue, respectively. (C) Linear contact map of this structure, showing minor groove (mg) and major groove (MG) contacts. (D) Nucleotide-residue contact map, showing only DNA backbone (BB) interactions with strand residues. Loop interactions have been turned off. 6

SUPPLEMENTARY REFERENCES 1. Kabsch, W. and Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577-2637. 2. Lee, B. and Richards, F.M. (1971) The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55, 379-400. 3. Kamada, K., Shu, F., Chen, H., Malik, S., Stelzer, G., Roeder, R.G., Meisterernst, M. and Burley, S.K. (2001) Crystal structure of negative cofactor 2 recognizing the TBP- DNA transcription complex. Cell 106, 71-81. 4. Singh, S., Hager, M.H., Zhang, C., Griffith, B.R., Lee, M.S., Hallenga, K., Markley, J.L. and Thorson, J.S. (2006) Structural insight into the self-sacrifice mechanism of enediyne resistance. ACS Chem. Biol. 1, 451-460. 5. Murphy, E.C., Zhurkin, V.B., Louis, J.M., Cornilescu, G. and Clore, G.M. (2001) Structural basis for SRY-dependent 46-X,Y sex reversal: modulation of DNA bending by a naturally occurring point mutation. J. Mol. Biol. 312, 481-499. 6. Jauch, R., Ng, C.K., Narasimhan, K. and Kolatkar, P.R. (2012) The crystal structure of the Sox4 HMG domain-dna complex suggests a mechanism for positional interdependence in DNA recognition. Biochem. J. 443, 39-47. 7. Kim, J.L. and Burley, S.K. (1994) 1.9 A resolution refined structure of TBP recognizing the minor groove of TATAAAAG. Nat. Struct. Biol. 1, 638-653. 7