Retrieving and Viewing Protein Structures from the Protein Data Base 7.88J Protein Folding Prof. David Gossard September 15, 2003 1
PDB Acknowledgements The Protein Data Bank (PDB - http://www.pdb.org/) is the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data. Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. The Protein Data Bank. Nucleic Acids Research 28 (2000): 235-242. (PDB Advisory Notice on using materials available in the archive: http://www.pdb.org/pdb/static.do?p=general_information/about_ pdb/pdb_advisory.html) Data used in the Retrieving, Viewing Protein Structures from the Protein Data Base Lecture Notes for 7.88J - Protein Folding PDB site data on page 4 ( PDB Growth ) and 7 ( Not all Structures are Different ). PDB screenshots: pages 9 14. Pages 20-22 contain screen shots from ExPASy (SwissPDB): Appel, R. D., A. Bairoch, and D. F. Hochstrasser. A new generation of information retrieval tools for biologists: the example of the ExPASy WWW server. Trends. Biochem. Sci. 19 (1994): 258-260. (ExPASy (Expert Protein Analysis System) proteomics server disclaimer: http://us.expasy.org/disclaimer.html) 2
Protein Data Base Established in 1971 Funded by NSF, DOE, NIH Operated by Rutgers, SDSC, NIST Purpose: Make protein structure data available to the entire scientific community In the beginning: less than a dozen protein structures Currently has 22,333 protein structures Growing at 20% per year New structures 50 times larger than those in 1971 are commonplace 3
PDB Growth 4
Why the Knee in the Curve? Engineered bacteria as a source of proteins Improved crystal-growing conditions More intense sources of X-rays Cryogenic treatment of crystals Improved detectors & data collection New method - NMR: Accounts for 15% of new structures in PDB Enables determination of structure of proteins in solution Protein Structures: From Famine to Feast, Berman, et.al. American Scientist v.90, p.350-359, July-August 2002 5
Why is the PDB Important? Collective Leverage for Understanding molecular machinery Rational drug design Engineering new molecules Structural genomics 6
Not all Structures are Different PDB Growth in New Folds 7
Structure vs Sequence New protein structures are being solved more slowly than new protein sequences are being solved Currently, known protein sequences outnumber known protein structures The sequence-structure gap continues to widen number Known Sequences Known Structures time 8
PDB Website http://www.rcsb.org/pdb/ Enter what you know 9
Query Result Browser Which one do I want? Let s look at this one 10
Yep, that s the right one Structure Explorer View it Download it 11
View Structure Static Images 12
Download/Display Display the file header Download the file (Select this file format) 13
Header Information 14
Visualizing Proteins High complexity Multiple levels of structure Important properties are distributed throughout the 3D structure No single/simple point at which to look 15
Visualization Objectives Structure Backbone; secondary, tertiary & quaternary Side chain groups Hydrophobic, charged, polar, acidic/base, etc. Cross-links Hydrogen bonds, disulfide bonds Surfaces VanderWaals, solvent-accessible Charge distributions, distances & angles, etc. 16
Display Conventions Wireframe Spacefill Ribbon Molecular Surface 17
Visualization Tools Viewers (free) 1960 s : MAGE, RasMol, Chime 2003 : SwissPDB, Protein Explorer, Cn3D, etc. Operating systems Unix, Windows, Mac Our choice (arbitrary) : Chime (plug-in to NETSCAPE) SwissPDB (stand-alone) 18
Important URL s Protein Data Base http://www.rcsb.org/pdb/ Chime https://en.wikipedia.org/wiki/mdl_chime SwissPDB http://www.expasy.ch/spdbv/ History of Visualization of Macromolecules http://www.umass.edu/microbio/rasmol/history.htm 19
SwissPDB 20
SwissPDB Toolbar Center Translate Zoom Rotate Distance between two atoms Angle between three atoms Measure omega, phi and psi angles Provenance of an atom Display groups a certain distance from an atom 21
Control Panel Chain Helix/sheet Residue Main chain Color target Color Side chain Label Surface Ribbon 22
Point of Information Today s lecture material is: a subset of the information available to you in online tutorials presented to get you started quickly and to shorten the learning curve not exhaustive or even sufficient => should be augmented by actually working through the online tutorials 23
MIT OpenCourseWare http://ocw.mit.edu 7.88J / 5.48J / 10.543J Protein Folding and Human Disease Spring 2015 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.