Germ-line vs somatic-variation theories

BME 128
Tuesday April 26
(1) Filling in the gaps
- Antibody diversity, how is it achieved? - by specialised (!) mechanisms
- Chp6 (Protein Diversity & Sequence Analysis) - more about the main concepts in this chapter
(2) BME128 Glossary From Homework 1 Part B
Thursday April 28
Selected parts from:
- Membrane Structure/Function (parts of Chp5)
- Protein Folding and Targeting (parts of Chp9+11)

Antibody diversity, how is it achieved?
Germ-line vs somatic-variation theories
- Germ-line theory: each antibody has its own gene. This needs no special mechanism, but would require billions of genes to account for the number of different antibodies.
- Somatic-variation theory: mutation and recombination in somatic cells create the vast number of genes needed for antibody formation.
- This introduced a new concept: targeted mutation or recombination of DNA. Is it possible?
- Paradox: how could stability be maintained in the C region while diversity exists in the V region?

Tonegawa's demonstration
- In 1976, Tonegawa used restriction enzymes and DNA probes to compare germ cell DNA with DNA taken from developed lymphocytes (a blot from a dsDNA gel).
- Germ cell DNA contained several smaller, separate DNA segments, whereas in the lymphocyte DNA these segments had been brought together. The DNA is therefore rearranged during lymphocyte development.

7 means of generating antibody diversity

Multigene organization of Ig genes
- Three loci: two for light-chain production (kappa and lambda) and one for heavy-chain production.
- Each locus features a number of ORFs (open reading frames) that can be combined in controlled ways to make up the mature mRNA of each molecule.
- What more general process for generating diversity at the protein level does this remind you of?

Kappa light chain rearrangement

Heavy chain rearrangement

Mechanism of variable region rearrangements
- Each V, D and J segment of DNA is flanked by special sequences (RSS, recombination signal sequences) of two sizes: single-turn and double-turn (each turn of DNA is about 10 base pairs long).
- A single-turn sequence can only combine with a double-turn sequence.
- This joining rule ensures that a V segment joins only with a J segment, in the proper order (in heavy chains: V with D, and D with J).
- Recombinases join the segments together.
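The joining rule can be written down as a small check. This is a minimal sketch, not a model of the recombinase: the `Segment` class and `can_join` function are hypothetical names, and the spacer lengths (12 bp for single-turn, 23 bp for double-turn) and the per-segment layouts are the commonly described ones, assumed here rather than taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional

ONE_TURN = 12  # spacer length in bp, roughly one turn of the helix (assumed value)
TWO_TURN = 23  # spacer length in bp, roughly two turns (assumed value)


@dataclass(frozen=True)
class Segment:
    name: str
    upstream_rss: Optional[int]    # spacer of the RSS on the 5' side, if any
    downstream_rss: Optional[int]  # spacer of the RSS on the 3' side, if any


def can_join(left: Segment, right: Segment) -> bool:
    """True if the two RSS facing each other are one single-turn and one double-turn."""
    facing = {left.downstream_rss, right.upstream_rss}
    return facing == {ONE_TURN, TWO_TURN}


# Assumed layouts, for illustration only
V_KAPPA = Segment("V-kappa", None, ONE_TURN)
J_KAPPA = Segment("J-kappa", TWO_TURN, None)
V_HEAVY = Segment("VH", None, TWO_TURN)
D_HEAVY = Segment("DH", ONE_TURN, ONE_TURN)
J_HEAVY = Segment("JH", TWO_TURN, None)

if __name__ == "__main__":
    pairs = [(V_KAPPA, J_KAPPA), (V_HEAVY, D_HEAVY),
             (D_HEAVY, J_HEAVY), (V_HEAVY, J_HEAVY)]
    for left, right in pairs:
        print(left.name, "+", right.name, "->", can_join(left, right))
```

Under these assumed layouts, a heavy-chain V cannot join a J directly, because both carry double-turn signals; the D segment has to sit in between.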

P and N region nucleotide alteration adds to diversity of V region
- During recombination, some nucleotide bases are cut from or added to the ends of the coding regions (P nucleotides).
- Up to 15 or so randomly inserted nucleotide bases are added at the cut sites of the V, D and J regions (N nucleotides).
- These are added by TdT (terminal deoxynucleotidyl transferase), a unique enzyme found only in lymphocytes.
- Since these bases are random, the amino acid sequence generated by these bases will also be random.

Randomness in the joining process helps generate diversity by creating the hypervariable regions of the antigen binding site
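One consequence of this random trimming and addition is that a junction may or may not keep the downstream segment in the original reading frame. The sketch below illustrates that with a toy simulation. The function names and the uniform distributions for the number of removed and added bases are assumptions made for illustration; stop codons created at the junction are ignored.

```python
import random


def junction_in_frame(n_removed: int, n_added: int) -> bool:
    """The reading frame survives only if the net length change is a multiple of 3."""
    return (n_added - n_removed) % 3 == 0


def fraction_in_frame(trials: int = 30_000, max_removed: int = 8,
                      max_added: int = 15, seed: int = 0) -> float:
    """Estimate the share of junctions that stay in frame (toy model)."""
    rng = random.Random(seed)
    in_frame = 0
    for _ in range(trials):
        n_removed = rng.randint(0, max_removed)  # bases trimmed from the coding ends
        n_added = rng.randint(0, max_added)      # random N nucleotides added
        if junction_in_frame(n_removed, n_added):
            in_frame += 1
    return in_frame / trials


if __name__ == "__main__":
    print(f"in-frame fraction: {fraction_in_frame():.3f}")
```

In this toy model roughly one junction in three stays in frame, which gives a feel for why many rearrangements cannot yield a functional chain.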

Some rearrangements are productive, others are non-productive
- Frame-shift alterations are non-productive.

Diversity calculations

Somatic hypermutation adds even more variability
- B cell multiplication introduces additional opportunities for alterations to rearranged VJ or VDJ segments.
- These regions are extremely susceptible to mutation compared to regular DNA: about one base in 600 is altered per two generations of the dividing (expanding) lymphocyte population.

Combination of heavy and light chains adds final diversity of variable region - putting it all together
- 8262 possible heavy chain combinations
- 320 light chain combinations
- Over 2 million combinations of heavy and light chains
- P and N nucleotide additions and subtractions multiply this by 10^4
- Possible combinations: over 10^10
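The arithmetic behind these figures can be checked directly. The numbers are the ones quoted on the slide; the variable names are just labels chosen for this sketch.

```python
heavy_chain_combinations = 8262
light_chain_combinations = 320
junctional_factor = 10 ** 4  # multiplier attributed to P and N nucleotide changes

# Any heavy chain can pair with any light chain
paired = heavy_chain_combinations * light_chain_combinations
total = paired * junctional_factor

print(f"{paired:,}")   # 2,643,840  (over 2 million)
print(f"{total:.2e}")  # 2.64e+10   (over 10^10)
```

Somatic hypermutation is not included in this product, so the number of distinct antibodies that can arise is larger still.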

Class switching among constant regions: generation of IgG, IgA and IgE with the same antigenic determinants (idiotypes)

Edman Degradation = Protein Sequencing
Characterizing Proteins
- Amino Acid Analysis: used to identify a protein or peptide based on amino acid composition; can be used to ID unusual amino acids. Btw: to calculate amino acid composition from sequence use ProtParam http://www.expasy.org/tools/protparam.html
- Edman Microsequencing: used to determine the N-terminal sequence (5-10 residues) of a new protein
- Mass Spectrometry: state of the art for protein ID
Slide credit: D.Wishart Micro343 (modified)
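Amino acid composition from a known sequence is simple enough to compute by hand or in a few lines. The sketch below is not ProtParam and reproduces none of its other outputs; the function name and the example peptide are made up for illustration.

```python
from collections import Counter
from typing import Dict


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the percentage of each residue in a one-letter-code sequence."""
    residues = [r for r in sequence.upper() if r.isalpha()]
    counts = Counter(residues)
    total = len(residues)
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


if __name__ == "__main__":
    # Arbitrary example peptide, not a real protein
    for aa, percent in amino_acid_composition("AAGKAALK").items():
        print(f"{aa}: {percent:.1f}%")
```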

Edman Microsequencing
- Edman degradation, or Edman microsequencing, refers to determining the amino acid sequence by chemical degradation at the N-terminus of the protein.
- It is named after its developer, Pehr Edman, who worked the chemistry out in the 1950s.
- Protein sequencing itself was first achieved by Fred Sanger, on insulin (using his own N-terminal labelling chemistry rather than Edman's).

Edman Degradation
- Can be used to determine AA sequences for up to 20 residues.
- Can be used to determine the sequence of entire proteins, as long as the protein is first broken up into small (<20 AA) peptides (now replaced by DNA sequencing methods, which are faster).
- Now primarily used to ID proteins from their N-termini using database searches.
Slide credit: D.Wishart Micro343 (modified)

Edman Microsequencing
Involves 3 basic steps:
- Isolation or immobilization of pure protein (or peptide)
- Repeated PITC labelling and cleavage of the N-terminal amino acid (released as its PTH derivative)
- Sequential separation (by HPLC) and identification of the PTH amino acids
Process takes ~2 hours
Slide credit: D.Wishart Micro343 (modified)

Edman Reaction
Slide credit: xxx (modified)
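The repeated label, cleave and identify cycle amounts to reading a peptide one residue at a time from the N-terminus. The sketch below mimics only that bookkeeping, not the chemistry; the function name and the example peptide are hypothetical.

```python
from typing import List


def edman_cycles(peptide: str, n_cycles: int) -> List[str]:
    """Simulate Edman cycles: each cycle releases the current N-terminal residue."""
    released = []
    remaining = peptide
    for _ in range(n_cycles):
        if not remaining:
            break  # nothing left to degrade
        released.append(remaining[0])  # the residue that would be identified by HPLC
        remaining = remaining[1:]      # shortened peptide enters the next cycle
    return released


if __name__ == "__main__":
    # Arbitrary example peptide in one-letter code
    print(edman_cycles("MKTAYIAK", 5))
```

The residues come out in sequence order, so the list of identified PTH amino acids is the N-terminal sequence itself.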

Microsequencing
Slide credit: D.Wishart Micro343 (modified); animation: http://www.protein.iastate.edu/nsequence494.html

Protein Substitution/Scoring Matrices
This is a substitution probability matrix for DNA mutations

PAM: Point Accepted Mutation (per 100 residues)

Protein Substitution/Scoring Matrices in BLAST searches

PAM1: simulates 1 mutation / 100 residues
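In the Dayhoff model, a matrix for a larger evolutionary distance is obtained by multiplying the "1 mutation per 100 residues" matrix by itself. The sketch below shows that idea on a toy 4-letter alphabet. The matrix values are made up so that each letter changes with probability 0.01 per step; they are not the published PAM values, and real PAM matrices are 20 x 20 and are converted to log-odds scores before being used for alignment.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = m
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Hypothetical one-step matrix: 99% chance of no change, 1% spread over the other letters
TOY_STEP = [[0.99 if i == j else 0.01 / 3 for j in range(4)] for i in range(4)]

toy_250_steps = mat_pow(TOY_STEP, 250)

if __name__ == "__main__":
    for row in toy_250_steps:
        print(" ".join(f"{p:.3f}" for p in row))
```

After many steps the diagonal shrinks well below 0.99 while each row still sums to 1, which is why matrices for distant relationships tolerate many more substitutions.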

Original publication: Dayhoff, M. O.; Schwartz, R. M.; Orcutt, B. C. (1978). "A model of evolutionary change in proteins". Atlas of Protein Sequence and Structure 5 (3): 345-352.

Homology & Relatedness

Q: are frog α and chick β an orthologous or a paralogous pair?

Whitford Chp6 Fig 6.22 shows a nice example

If a sequence-derived gene tree looks very unusual (w.r.t. species represented), this could be the reason

BME 128 Chapter 6: what you do not have to know
- Chou-Fasman secondary structure prediction method. BUT you want to look at the propensity table & know which amino acids are particularly frequently, or rarely, found in α-helices / β-strands / turns.
- The idea of separating sequence homology and structure homology is rubbish! BUT don't miss their actual point: that evolutionary relatedness can sometimes still be detected via structural resemblance, even if the sequences seem completely dissimilar. Q: how do superfolds fit into this picture?
- Btw: I also disagree with Whitford's definition of proteomics. More and more computational/bioinformatics people describe this as studying *all* possible proteins; however, that is better defined as the theoretical proteome. The proteome is all protein molecules present in the cell under the given conditions (# functional genomics).

BME 128 Chapter 6: most important concepts
- Protein sequencing (including sequence-specific proteases)
- Evolution of proteins: proteins diverge by duplication + mutation + recombination, whereas selection imposes constraints and is responsible for conservation over long evolutionary distances
- Substitution/Scoring matrices are empirical & reflect both
- The similarity between related proteins remains detectable longer in their 3-D structures than in their sequences (sometimes you will hear: "Sequence evolves faster than structure")
- Different proteins evolve at different rates
- Divergent evolution vs convergent evolution
- Computational prediction of protein structure (already touched on in earlier lecture)

Strawberry Challenge II: There is a specific mistake (typo) in a figure legend in Chp6; can you find it and correct it (2 chars)?