CSSP2: An improved method for predicting contact-dependent secondary structure propensity

The calculation of contact-dependent secondary structure propensity (CSSP) has been reported to sensitively detect non-native beta-strand propensities in the core sequences of amyloidogenic proteins. Here we describe a novel energy-based CSSP method, implemented on dual artificial neural networks, that rapidly and accurately estimates the potential for non-native secondary structure formation in local regions of protein sequences. In this method, we attempted to quantify long-range interaction patterns in diverse secondary structures by potential energy calculations and decomposition on a pairwise per-residue basis. The calculated energy parameters and seven-residue sequence information were used as inputs for artificial neural networks (ANNs) to predict sequence potential for secondary structure conversion. The single ANN trained with the >(i, i±4) interaction energy parameter exhibited 74% accuracy in predicting the secondary structure of test sequences in their native energy state, while the dual ANN-based predictor using both the (i, i±4) and >(i, i±4) interaction energies showed 83% prediction accuracy. The present method provides a simple and accurate tool for predicting sequence potential for secondary structure conversions without using 3D structural information.
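As an illustration of how seven-residue sequence information and per-residue energy parameters might be combined into ANN inputs, the following is a minimal sketch only. The one-hot encoding, the zero padding at sequence ends, the single energy value per residue, and the names `one_hot` and `window_features` are all assumptions made for this example; the abstract does not specify the actual input encoding.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 7  # seven-residue window, as stated in the abstract


def one_hot(residue: str) -> List[float]:
    """Return a 20-value one-hot vector; all zeros for padding or unknown residues."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def window_features(sequence: str, energies: List[float]) -> List[List[float]]:
    """Build one input vector per residue.

    Each vector holds 7 x 20 one-hot values followed by 7 energy values.
    `energies` is a hypothetical per-residue interaction energy parameter.
    """
    if len(sequence) != len(energies):
        raise ValueError("sequence and energies must have the same length")
    half = WINDOW // 2
    # Assumed padding: '-' residues and zero energies beyond the sequence ends.
    padded_seq = "-" * half + sequence + "-" * half
    padded_en = [0.0] * half + list(energies) + [0.0] * half
    features = []
    for i in range(len(sequence)):
        vec: List[float] = []
        for residue in padded_seq[i:i + WINDOW]:
            vec.extend(one_hot(residue))
        vec.extend(padded_en[i:i + WINDOW])
        features.append(vec)
    return features
```

Under these assumptions, each residue yields a 147-value vector (140 sequence values plus 7 energy values) centered on that residue, which could then be passed to any feed-forward network.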
