Role of evolutionary information in predicting the disulfide‐bonding state of cysteine in proteins

A neural network‐based predictor is trained to distinguish the bonding states of cysteine in proteins starting from the residue chain. Training is performed by using 2,452 cysteine‐containing segments extracted from 641 nonhomologous proteins of well‐resolved three‐dimensional structure. After a cross‐validation procedure, the prediction efficiency is as high as 72% when the predictor is trained on single protein sequences. Adding evolutionary information in the form of multiple sequence alignments, together with a jury of neural networks, increases the prediction efficiency to 81%. Assessing the quality of the predictions with a reliability index indicates that more than 60% of the predictions have an accuracy level greater than 90%. A comparison with a previously described statistical method, tested on the same database, shows that the neural network‐based predictor performs with the higher efficiency. Proteins 1999;36:340–346. © 1999 Wiley‐Liss, Inc.
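The three ingredients named in the abstract (an alignment-derived input, a jury of networks, and a reliability index) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the window layout, the plain averaging of jury outputs, and the 0-to-9 reliability scale are all assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def profile_window(alignment, center, half_width):
    """Residue frequencies per alignment column in a window around `center`.

    `alignment` is a list of equal-length aligned sequences (hypothetical
    input format). Columns outside the chain are padded with zeros.
    """
    length = len(alignment[0])
    window = []
    for pos in range(center - half_width, center + half_width + 1):
        if pos < 0 or pos >= length:
            window.append([0.0] * len(AMINO_ACIDS))  # padding past chain ends
            continue
        # Count only standard residues; gaps and unknowns are ignored.
        counts = Counter(seq[pos] for seq in alignment if seq[pos] in AMINO_ACIDS)
        total = sum(counts.values())
        window.append([counts[a] / total if total else 0.0 for a in AMINO_ACIDS])
    return window


def jury_average(bonded_scores):
    """Combine the 'bonded' outputs of several networks by simple averaging
    (an assumed jury rule)."""
    return sum(bonded_scores) / len(bonded_scores)


def reliability_index(bonded_score):
    """Map a score in [0, 1] to an integer 0..9 (assumed definition):
    the further the score is from 0.5, the higher the index."""
    return min(9, int(10 * abs(2 * bonded_score - 1)))


if __name__ == "__main__":
    toy_alignment = ["ACDC", "ACEC", "GCDC"]  # toy data, not from the paper
    features = profile_window(toy_alignment, center=1, half_width=1)
    score = jury_average([0.9, 0.8, 0.7])  # made-up network outputs
    print(len(features), score > 0.5, reliability_index(score))
```

In this sketch a cysteine would be called bonded when the averaged score exceeds 0.5, and predictions could then be filtered by keeping only those whose reliability index is above a chosen cutoff.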
