Lecture 9 (10/2/17)


Reading: Ch4; , , (α-helix)
         Ch4; , , 133, (β-sheets)
Problems: Ch4 (text); 2, 3, 4, 8, 13, 14

NEXT
Reading: Ch4; 125, ,
Problems: Ch4 (text); 7, 11
          Ch4 (study guide); 1, 2

OUTLINE
I. Protein Structure
   A. Primary
      1. Determination
         a. Sequence determination; PHYSICAL
            i. Mass spectrometry for proteins
            ii. Use of tandem MS/MS for sequence determination
            iii. Isolation of proteins by 2D PAGE; isoelectric focusing x SDS-PAGE
         b. Sequence determination; BIOLOGICAL
            i. Genome sequenced
            ii. Bioinformatics to predict protein sequences in predicted genes
            iii. Use of CHEMICAL and/or PHYSICAL methods to get partial sequence
   B. Secondary
      1. Conformational structure
      2. Pauling & Corey's predictions
         a. α-helix
         b. β-sheets/strands
         c. Connections between β-strands
         d. Connections between α-helices; angle not important
      3. Super-secondary structure

Determination of primary structure

There are THREE basic ways to know the primary structure. Only the CHEMICAL method gives the entire covalent structure, including any disulfide bonds, but the other methods are more sensitive. The methods can be classified as:
- CHEMICAL
- PHYSICAL
- BIOINFORMATICAL

We just went through the CHEMICAL method. The PHYSICAL method still requires the same strategy, including purification, fragmentation, chromatography, and alignment, but tandem mass spectrometry (MS) is used instead of Edman degradation.

Let's look at the use of MS in biochemistry:
- Ions fly in a vacuum toward a target with a velocity that depends on z/m (the charge-to-mass ratio).
- Molecules with a higher charge and lower mass are detected first.
- Molecules with a lower charge and higher mass are detected last.
- Spectra are plotted as m/z so that peaks are read from left to right.
- Instruments can distinguish molecules with the same charge that differ by < 1 Da.

Determine the Sequence: Tandem MS

The major problem in using MS for macromolecules is getting them to fly in a vacuum with a charge. TWO major methods:
1) Electrospray Ionization (ESI)
2) Matrix-Assisted Laser-Desorption Ionization (MALDI)

ESI (Figure 5-17)
Two adjacent peaks in the ESI spectrum come from the same protein carrying z and z+1 charges, so the charge can be solved from the two m/z values:
Mass = (m/z) x z
893.3 z = 848.7 (z + 1)
893.3 z - 848.7 z = 848.7
z (44.6) = 848.7
z = 19
Mass = 893 x 19 = 16,967
Mass = 848.7 x 20 = 16,974
Mass = (m/z of the z = 10 peak) x 10 = 16,963
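The ESI arithmetic above can be checked with a short script. This is only a sketch of the slide's simplified relation, Mass = (m/z) x z, which ignores the mass of the protons that carry the charge; the function names are made up for illustration.

```python
def charge_from_adjacent_peaks(mz_lower_charge, mz_higher_charge):
    """Solve mz_lower_charge * z = mz_higher_charge * (z + 1) for z.

    mz_lower_charge is the peak at larger m/z (charge z);
    mz_higher_charge is the neighboring peak at smaller m/z (charge z + 1).
    """
    z = mz_higher_charge / (mz_lower_charge - mz_higher_charge)
    return round(z)


def mass_from_peak(mz, z):
    """Simplified slide formula: Mass = (m/z) x z (proton mass ignored)."""
    return mz * z


if __name__ == "__main__":
    z = charge_from_adjacent_peaks(893.3, 848.7)
    print("charge of the 893.3 peak:", z)
    print("mass from 893.3 peak: %.0f" % mass_from_peak(893.3, z))
    print("mass from 848.7 peak: %.0f" % mass_from_peak(848.7, z + 1))
```

The two estimates differ by a few Da, just as the three values on the slide do, because each peak position is read off the spectrum with limited precision.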

Determine the Sequence: Tandem MS

Matrix-Assisted Laser-Desorption Ionization (MALDI)

How do you use MS to get protein sequences?

Determine the Sequence: Tandem MS

- Mass spectrometry uses the mass-to-charge ratio of different ions to determine mass.
- Tandem MS-MS: the first stage selects a peptide, the peptide is then fragmented, and the second stage determines the masses of the fragments.
- Comparing all of the fragments, those that differ by the mass of one amino acid are used to determine the sequence.

EXAMPLE: AVAW

AA    MW
Ala   90
Val   109
Trp

[Figure: structure of the peptide AVAW with the fragment ions a1-a3 and c1-c3 marked (c1 is also labeled y1, a1 is also labeled b1), and the spectra of AWVA and AVAW showing peaks a1-a3, c1-c3, and m0 in a different order for each peptide.]

There is an issue with K & Q, which have nearly the same MW!
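Reading a sequence from mass differences, as in the AVAW example above, can be sketched in a few lines. The residue masses below are standard monoisotopic values for a small subset of amino acids; the ladder here is simply the running sum of residue masses, which is an assumption made for illustration (real b- and y-ion masses carry additional constant offsets). The function names are hypothetical.

```python
# Monoisotopic residue masses in Da (subset, for illustration only).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "L": 113.08406,
    "Q": 128.05858,
    "K": 128.09496,
    "W": 186.07931,
}


def ladder_for(peptide):
    """Cumulative residue masses of the N-terminal fragments, shortest first."""
    total, ladder = 0.0, []
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(total)
    return ladder


def read_ladder(ladder, tol=0.02):
    """Assign a residue to each step between neighboring ladder masses.

    Ambiguous steps are reported as 'X/Y'; unmatched steps as '?'.
    """
    calls, previous = [], 0.0
    for mass in ladder:
        delta = mass - previous
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        calls.append("/".join(hits) if hits else "?")
        previous = mass
    return calls


if __name__ == "__main__":
    print(read_ladder(ladder_for("AVAW")))
    print(read_ladder(ladder_for("AWVA")))
    # With a looser mass tolerance, K and Q can no longer be told apart.
    print(read_ladder(ladder_for("AK"), tol=0.05))
```

The last call shows the K/Q issue: the two residues differ by only about 0.04 Da, so an instrument or tolerance coarser than that reports both candidates.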

Determination of primary structure

THREE basic ways to know the primary structure:
- CHEMICAL: Edman degradation requires >100 pmole (1-5 µg)
- PHYSICAL: MS/MS requires >1-10 pmole ( ng)
- BIOINFORMATICAL

We just went through the CHEMICAL and PHYSICAL methods. The BIOINFORMATICAL method requires information from the chemical or physical methods, but only a limited amount of sequence.

Example: a sequence of 6 AA is one of 20^6 possible hexapeptide sequences (1 of 64 x 10^6). There are no more than 50,000 protein-coding genes with 400 AA on average. This is ~20 x 10^6 possible unique sequences. So, a hexamer is not likely to appear more than once.

Once you have at least 6 AA of sequence, you can compare it to all possible proteins encoded in the entirety of the gene sequences (genome) of a species whose genome is known, using appropriate bioinformatic tools. This then gives you the entire protein sequence.

There is one remaining issue: where are the disulfides, if any? This requires chemical and/or physical methods.
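The hexamer argument above is a quick counting exercise, sketched below together with a toy lookup. The two-protein "proteome" and its names are invented purely for illustration; a real search would run against a genome-derived protein database with a dedicated tool.

```python
# Counting argument from the slide.
possible_hexamers = 20 ** 6                 # 64 x 10^6 distinct hexapeptides
proteome_positions = 50_000 * 400           # ~20 x 10^6 residues in total
expected_chance_hits = proteome_positions / possible_hexamers


def proteins_containing(tag, proteome):
    """Return the names of all proteins whose sequence contains the tag."""
    return [name for name, seq in proteome.items() if tag in seq]


# Hypothetical toy proteome (made-up names and sequences).
toy_proteome = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQ",
    "protB": "MGLSDGEWQLVLNVWGKVEAD",
}

if __name__ == "__main__":
    print("possible hexamers:", possible_hexamers)
    print("positions in proteome:", proteome_positions)
    print("expected chance matches per hexamer: %.2f" % expected_chance_hits)
    print(proteins_containing("GEWQLV", toy_proteome))
```

An expected chance-match count below one is the quantitative version of "a hexamer is not likely to appear more than once."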

Determine which Cys are in Disulfide bonds

[Figure 5-13: a polypeptide with several Cys residues. Which of these Cys is bonded to this Cys?]

Wiley Guided_explorations ( damentals_of_biochemistry_5e_isbneprof12 533/media/guided_explorations/ge_04_Protein_s equence_determination/protein_sequence.html)

Determine which Cys are in Disulfide bonds

Recall that disruption of -S-S- bonds is needed to separate polypeptides for sequence analysis, and that their re-oxidation can be prevented by alkylation.

[Figure: performic acid, HC(=O)OOH, oxidizes the disulfide, which breaks the -S-S- bond and leaves two sulfonic acids (-SO3).]

Re-oxidation can be prevented by this oxidation as well.

How do we use this to find the disulfides? Separate and sequence the polypeptides.

Determine which Cys are in Disulfide bonds

We change the protection step for sulfhydryl groups:
- Cleave/protect (reduction/alkylation or oxidation) AFTER fragmentation.
- Separate the fragments as before; any fragments linked by -S-S- will not separate and remain together (e.g., the orange peptide).
- THEN break the -S-S- bonds, and re-separate.
- Determine the sequence of those peptides that fall off the diagonal by either Edman degradation or tandem MS/MS.

The technique is called 2D-diagonal electrophoresis.

Wiley Guided_explorations

A more complicated, actual example is ribonuclease, which combines this approach with a change in the fragmentation by employing CNBr after separation.

Protein Structure

Conformational Structure

How does the polypeptide chain fold?

4 levels of protein structure

In order to understand these levels of structure, you need to understand the nature of the polymer first. In other words, the linkage, or PEPTIDE BOND.

The 4 S's for secondary structure: Size, Solubility, Stability, Shape
- dependent on number of amino acids
- dependent on AA composition and shape
- complex and not well understood

Why is there Secondary Structure?

The Levinthal Paradox (1969)

Theoretical calculation: consider just the α-carbon backbone. If there are 4 clearly different angles allowed for each of the angles at the α-carbon (φ and ψ), then each residue has 2 x 4 = 8 degrees of freedom. For a protein of 100 residues, there are 8^100 possible conformations to test for optimal energetics = 2 x 10^90 different conformations. At 1000 billion tests per second (1/psec), this is 2 x 10^78 seconds to find the best, or 7 x 10^70 years.

Well... the age of the universe is 14 x 10^9 years.

The shortcut proteins use to fold is the use of 2° structure, where most of these degrees of freedom are prescribed by a regular structure.

What are these regular structures?

In the early 1950s, Linus Pauling and Robert Corey predicted some rules that proteins should follow to find the lowest energy conformation:
1) The peptide bond must be planar, without free rotation.
2) The degree of H-bonding should be maximized to achieve the lowest energy state.
3) The best H-bonds are linear.
4) There should be repeating units of conformation (the same) as you go from one residue to the next.

Using these rules they predicted two basic structures:
- α-helix
- β-sheet

[Consider the energetic consequences in the (unfolded) water → (folded) water transition.]
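The Levinthal estimate on this slide is plain arithmetic and can be reproduced as follows. The inputs (8 states per residue, 100 residues, one test per picosecond, a 14 x 10^9 year old universe) are the slide's own; the variable names are just for illustration.

```python
states_per_residue = 2 * 4            # slide's count: 2 angles x 4 values
residues = 100
tests_per_second = 1e12               # 1000 billion tests per second (1/psec)
seconds_per_year = 365.25 * 24 * 3600
age_of_universe_years = 14e9

conformations = states_per_residue ** residues      # exact integer, 8^100
search_seconds = conformations / tests_per_second
search_years = search_seconds / seconds_per_year

if __name__ == "__main__":
    print("conformations: %.1e" % conformations)
    print("search time:   %.1e s = %.1e years" % (search_seconds, search_years))
    print("ratio to age of universe: %.1e" % (search_years / age_of_universe_years))
```

Whatever the exact rounding, the exhaustive search time exceeds the age of the universe by roughly 60 orders of magnitude, which is the point of the paradox.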

There were no known protein structures until 1957, when Kendrew solved the structure of myoglobin. Imagine the excitement when indeed there were the very helices Pauling predicted!

The α Helix
1. Right handed

The α Helix
1. Right handed
2. Dimensions
   - 5 Å diameter
   - 100° rotation/residue
   - 1.5 Å rise/residue (rise)
   - length variable (ave ~45 Å (8 residues))
   - 3.6 residues/turn
   - 5.4 Å/turn (pitch)
3. Conformational angles
   - φ = -57°
   - ψ = -47°
4. Electrostatics
   - H-bonds all satisfied (as predicted)
   - Overall polarity of helix
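The helix dimensions above are related to each other: the rotation per residue fixes the number of residues per turn, and that number times the rise gives the pitch. A minimal sketch, using only the slide's values (the constant and function names are illustrative):

```python
ROTATION_PER_RESIDUE_DEG = 100.0   # from the slide
RISE_PER_RESIDUE_A = 1.5           # Å, from the slide

residues_per_turn = 360.0 / ROTATION_PER_RESIDUE_DEG
pitch_a = residues_per_turn * RISE_PER_RESIDUE_A


def helix_length(n_residues):
    """Length along the helix axis (Å) spanned by n_residues."""
    return n_residues * RISE_PER_RESIDUE_A


if __name__ == "__main__":
    print("residues per turn: %.1f" % residues_per_turn)
    print("pitch: %.1f Å per turn" % pitch_a)
    print("a 12-residue helix spans %.1f Å" % helix_length(12))
```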

The α Helix
- 3.6 residues per turn (repeat)
- 5.4 Å per turn (repeat)
- R-groups all point out, away from the helix

The α Helix: Space Filling Model
- Helices can be hydrophobic
- Helices can be hydrophilic
- Helices can be mixed: amphipathic
Oxy-Myoglobin, PDBid 1A6M

The α Helix

Helices can be mixed: amphipathic
LLQSLLSLLQSLLSLLLQW (color key: hydrophilic, hydrophobic)

Like:
- prefer small, medium, hydrophobic/charged
- no steric hindrance at Cβ
- Glu, Met, Ala, Leu, Lys, Phe, Gln

Don't like:
- Pro
- Gly

Pro has a fixed φ = -65°. Gly is the opposite: it has too many degrees of freedom.
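Why the sequence LLQSLLSLLQSLLSLLLQW is amphipathic can be seen by placing each residue on a helical wheel, advancing 100° per residue as in an α-helix. The sketch below assumes that Q and S count as hydrophilic and L and W as hydrophobic; that classification and the function names are choices made for illustration.

```python
SEQUENCE = "LLQSLLSLLQSLLSLLLQW"
HYDROPHILIC = set("QS")        # assumed classification for this example
ROTATION_PER_RESIDUE_DEG = 100


def wheel_angles(sequence):
    """Return (residue, angle in degrees) around the helix axis for each position."""
    return [(aa, (i * ROTATION_PER_RESIDUE_DEG) % 360) for i, aa in enumerate(sequence)]


def angles_by_class(sequence):
    """Split wheel angles into hydrophilic and hydrophobic lists."""
    polar, nonpolar = [], []
    for aa, angle in wheel_angles(sequence):
        (polar if aa in HYDROPHILIC else nonpolar).append(angle)
    return sorted(polar), sorted(nonpolar)


if __name__ == "__main__":
    polar, nonpolar = angles_by_class(SEQUENCE)
    print("hydrophilic angles:", polar)
    print("hydrophobic angles:", nonpolar)
```

All of the Q and S residues fall within one 120° arc of the wheel (180° to 300°), and none of the L or W residues do, so the helix has one polar face and one hydrophobic face.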

The α Helix

Increasing pitch (distance per turn) & rise (distance per residue) → Alpha helices twist

[Figure: helices compared by repeat units/turn and # of atoms in one turn]

ded_explorations/ge_06_stable_helices_in_proteins_the_a_helix/alpha_helix.html

The α Helix: examples
Globular Proteins: Myoglobin (Mb)

The α Helix: examples
Fibrous Proteins: the α-keratin helix
- 5.1 Å per turn (repeat)
- 3.5 residues per turn (repeat)
- A coiled coil

The α Helix: examples
α-keratin packing
- Two α-helices are coiled together to make a coiled coil, which is straight.
- Residues at the interface are hydrophobic to stabilize the coiled coil.
- Residues at the interface (pink) must be small to accommodate the packing.

The α Helix: examples
Fibrous Proteins: the α-keratin helix
Heptad repeat: a-b-c-d-e-f-g
- a and d: hydrophobic (L, V, A, G)
- e and g: charged, +/- (E, D, R)
- b, c, and f: tend to be Ser, Gln, Cys
A coiled coil
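The heptad labeling a-b-c-d-e-f-g can be applied mechanically to a sequence to pull out the residues that sit at the a and d positions, which form the interface of the coiled coil. The 14-residue sequence below is invented for illustration and is not a real keratin sequence; the starting register is also an assumption.

```python
HEPTAD = "abcdefg"


def heptad_positions(sequence, start="a"):
    """Pair each residue with its heptad position, assuming the given start register."""
    offset = HEPTAD.index(start)
    return [(aa, HEPTAD[(offset + i) % 7]) for i, aa in enumerate(sequence)]


def interface_residues(sequence, start="a"):
    """Residues at positions a and d, which pack at the coiled-coil interface."""
    return "".join(aa for aa, pos in heptad_positions(sequence, start) if pos in "ad")


if __name__ == "__main__":
    toy = "LSQVECRLSQAECR"          # hypothetical two-heptad sequence
    for aa, pos in heptad_positions(toy):
        print(pos, aa)
    print("interface (a, d):", interface_residues(toy))
```

In the toy sequence the a and d positions carry L, V, L, A, matching the small hydrophobic residues listed for the interface.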

Notice the >10% Cys

[Figure labels: a & d; ~25-30%; e & g; b, c, & f; ~33%]

Chemistry of Permanent Waving

The α Helix: examples
Fibrous Proteins: the α-keratin helix

One model (in text). Hierarchy (small-to-large):
- α-keratin
- Dimer: coiled coil
- Protofilament: two dimers
- Protofibril: two filaments
- Microfibril: 4 protofibrils (Int. Fil.)

Another model. Hierarchy:
- α-keratin
- Dimer: coiled coil
- Protofibril: two coiled dimers
- No protofilament, same as microfibril
- Microfibril: 8 protofibrils

The α Helix: Summary
- Right-handed helix
- 3.6 aa per turn
- 5.4 Å rise per turn
- Carbonyl of residue n is H-bonded to NH of residue n+4
- Has a tightly packed core of main-chain atoms
- R-groups project outward
- Has an overall macro-dipole (N-term +; C-term -)
- Can be amphipathic

β Sheets - antiparallel

Using his rules, Pauling predicted two basic structures:
- α-helix
- β-sheet, which he called a back-and-forth structure

β Sheets - antiparallel
1. Right handed (2.0 7 )
2. Dimensions
   - it's a sheet
   - almost fully extended: 3.4 Å rise/residue
   - 2 residues/repeat → pitch is 6.8 Å
   - length variable
3. Conformational angles
   - φ = -139°
   - ψ = +135°
4. Electrostatics
   - H-bonds all satisfied (as predicted)
   - no polarity

β Sheets - parallel

Pauling did not predict a β-sheet made of β-strands going in the same direction.
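"Almost fully extended" can be made concrete by comparing how far the same number of residues reaches as a β-strand and as an α-helix, using the rise per residue quoted on the slides (3.4 Å for the antiparallel strand, 1.5 Å for the helix). The dictionary keys and function name are illustrative.

```python
RISE_PER_RESIDUE_A = {
    "alpha_helix": 1.5,               # Å per residue, from the α-helix slide
    "antiparallel_beta_strand": 3.4,  # Å per residue, from this slide
}


def span(structure, n_residues):
    """Distance (Å) covered along the axis by n_residues in the given structure."""
    return RISE_PER_RESIDUE_A[structure] * n_residues


if __name__ == "__main__":
    print("strand pitch (2 residues): %.1f Å" % span("antiparallel_beta_strand", 2))
    for name in RISE_PER_RESIDUE_A:
        print("%-26s 10 residues span %.1f Å" % (name, span(name, 10)))
```

Ten residues reach 34 Å as a strand but only 15 Å as a helix, a bit more than a twofold difference.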

β Sheets

[Figure: neighboring strands running N → C and C → N]

β Sheets

[Table: Structure | φ | ψ | Rise (distance/residue) (Å) | Residues/repeat | Pitch (distance/repeat) (Å) | Diameter (Å); rows: α-helix, antiparallel β-sheet, parallel β-sheet]

β Sheets
5. R-groups
   - alternate and point away from the sheet

β Sheet: Space Filling Model
Concanavalin A, PDBid 2CNA

β Sheets

Like:
- prefer large, bulky groups
- Val, Ile, Leu, Tyr, Trp, Phe

Don't like:
- Pro (same reason)
- Glu/Asp (full charges too close)

The β Sheet: examples
Globular Proteins: Neuraminidase
- 6 x β sheet
- funnel to active site
- right-handed twist

The β Sheet: examples
Fibrous Proteins: Silk fibroin (β-keratin)
- Silk is all antiparallel β-sheet.
- Sequence repeats: (GAGAGSGAAG(SGAGAG)8Y)x (x > 10)
- Gly is every other residue, and Ala as well.
- Recall the alternating R-groups.
- Therefore, Gly is all on one side of the sheet, and Ala on the other side of the sheet.

The β Sheet: examples
Silk fibroin (β-keratin)
Silk's tensile strength is comparable to that of steel and about half that of Kevlar.

β Sheets

[Figure label: 6.4 Å]

[Figure: α-helix and β-sheet; purple = polar, gold = hydrophobic. Oxy-Myoglobin and Concanavalin A, PDBids 1A6M and 2CNA]

Recall that β-sheets require strands from different parts of the same polymer, so... how are they connected?

Reverse Turns in Polypeptide Chains

Also called β-turn, β-bend, or hairpin.

[Figure: a turn through residues i, i+1, i+2, i+3 connecting strands running N → C and C → N; angle labels as transcribed: ψ = 0, φ = 90, φ = 139, ψ = 30, φ = , ψ = +135]

[Table: Structure | φ | ψ | Rise (distance/residue) (Å) | Residues/repeat | Pitch (distance/repeat) (Å) | Diameter (Å); rows: α-helix, antiparallel β-sheet, parallel β-sheet, β-turn type I, β-turn type II]

Start and stop with the same angles.

Reverse Turns in Polypeptide Chains
- Often Asn (satisfy H-bonding)
- Often Gly (φ = 65°)
- Often Pro
- -Pro-Gly-?

Turns in Polypeptide Chains often use cis-Pro

Reverse turns are used for antiparallel sheets, but how do parallel strands find each other to make a sheet? As we saw, these strands are not straight, and this twist helps the conformation of the connecting loops.

Summary

These loops are very often not just random conformations, but form α-helices.

What is happening? Different pieces of 2° structure are mixing together. These are called Motifs or Super-secondary Structures.

What are the structures and names of some of the most common motifs?

Protein Structure - Supersecondary
- βαβ (short turn, long turn)
- β-hairpin
- αα
- Take a β-hairpin, which is an antiparallel β-sheet, but the topology of the strands is not sequential
- βαβαβ = Rossmann fold
- αα with different sized loops

Protein Structure - Supersecondary

Due to the right-handed twist in the β-strands, as you add more strands the structure comes back on itself to form barrels:
- β-meander: (β)8
- Greek key: (β)8
- (αβ)8 barrel: 4 (βαβ) motifs connected by 4 α-helices

Bovine Carboxypeptidase A
Carboxypeptidase A, PDBid 3CPA