Enhanced protein fold recognition using secondary structure information from NMR

NMR can yield accurate secondary structure assignments for proteins that are too large for full structure determination. In the absence of an X‐ray crystal structure, this information should be a useful adjunct to protein fold recognition methods based on low-resolution force fields. We tested its value by adding varying amounts of artificial secondary structure data while threading a sequence through a library of candidate folds. On a literature test set, the threading method alone has only a one‐third chance of placing a correct answer among the top ten guesses. With realistic secondary structure information, one can expect a 60–80% chance of finding a homologous structure. The method was then applied to examples with published estimates of secondary structure. The implementation is completely independent of sequence homology, and sequences are optimally aligned to candidate structures with gaps and insertions allowed. Unlike work using predicted secondary structure, we test the effect of differing amounts of relatively reliable data.
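The idea of combining a threading score with partial secondary structure data can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function `fitness`, the weight `ss_weight`, the linear gap penalty `gap`, and the three-state labels (`H`, `E`, `C`) are all assumptions chosen for illustration. The sketch aligns a sequence to one candidate fold by global dynamic programming and adds a bonus or penalty only at residues where an experimental secondary structure state is available.

```python
from typing import Callable, Optional, Sequence


def thread_score(
    seq: str,
    nmr_ss: Sequence[Optional[str]],
    template_ss: str,
    fitness: Callable[[str, int], float],
    ss_weight: float = 1.0,
    gap: float = -2.0,
) -> float:
    """Best global alignment score of a sequence against one candidate fold.

    nmr_ss holds 'H', 'E' or 'C' per residue, or None where no experimental
    assignment exists. fitness(residue, template_position) stands in for a
    low-resolution force field term (hypothetical). Gaps use a simple linear
    penalty, which is an assumption of this sketch.
    """
    n, m = len(seq), len(template_ss)
    # dp[i][j] is the best score for aligning seq[:i] with template[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = fitness(seq[i - 1], j - 1)
            known = nmr_ss[i - 1]
            if known is not None:
                # Reward agreement with the template, penalise disagreement.
                pair += ss_weight if known == template_ss[j - 1] else -ss_weight
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair,  # residue placed on template site
                dp[i - 1][j] + gap,       # residue left unaligned
                dp[i][j - 1] + gap,       # template site skipped
            )
    return dp[n][m]


def flat_fitness(residue: str, position: int) -> float:
    """Placeholder force field that contributes nothing (illustration only)."""
    return 0.0


if __name__ == "__main__":
    observed = ["H", "H", None, "E"]  # partial secondary structure data
    for name, fold_ss in {"foldA": "HHCE", "foldB": "EECH"}.items():
        print(name, thread_score("AAAA", observed, fold_ss, flat_fitness))
```

Ranking every fold in a library by such a score, and varying how many entries of `nmr_ss` are set to `None`, mirrors the kind of experiment described above, in which differing amounts of secondary structure data are supplied.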
