SPI - Structure predictability index for protein sequences

Estimating the structure predictability of a particular protein is difficult. Many methods estimate it a posteriori, by evaluating the final, native protein structure. The SPI scale is instead intended to estimate the structure predictability of a particular amino acid sequence a priori. A sequence-to-structure library was created from the complete Protein Data Bank. The tetrapeptide was selected as the unit representing a well-defined structural motif. The early-stage folding structure (a model of which was presented elsewhere) was taken as the object of protein structure classification, and seven structural forms were distinguished. The degree of determinability was estimated for both the sequence-to-structure and the structure-to-sequence relations, which are of particular interest for threading methods. A comparative analysis of the SPI and Q7 scales against the commonly used SOV and Q3 scales is presented. The complete contingency table, supplementary materials and all the programs used are available on request.
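
As a rough illustration of the kind of bookkeeping described above, the following Python sketch builds a tetrapeptide sequence-to-structure contingency table and computes a per-residue agreement score in the style of Q3 or Q7. It is not the authors' SPI definition or their code: the structure letters, the function names (`build_contingency`, `dominant_fraction`, `q_score`) and the toy data are all assumptions made for this example, and `dominant_fraction` is only one simple, hypothetical way to express how strongly a tetrapeptide determines its structure.

```python
from collections import Counter, defaultdict


def windows(chain, k=4):
    """Return all overlapping length-k windows of a string."""
    return [chain[i:i + k] for i in range(len(chain) - k + 1)]


def build_contingency(pairs, k=4):
    """Count how often each sequence k-mer co-occurs with each structure k-mer.

    `pairs` is an iterable of (sequence, structure_codes) strings of equal
    length; the structure alphabet here is hypothetical (one letter per form).
    """
    table = defaultdict(Counter)
    for seq, struct in pairs:
        if len(seq) != len(struct):
            raise ValueError("sequence and structure strings must align")
        for seq_kmer, struct_kmer in zip(windows(seq, k), windows(struct, k)):
            table[seq_kmer][struct_kmer] += 1
    return table


def dominant_fraction(table, key):
    """Share of observations of `key` that fall in its most frequent class.

    An illustrative determinability measure (an assumption, not SPI itself):
    1.0 means the k-mer always maps to one structure, lower values mean ambiguity.
    """
    counts = table.get(key)
    if not counts:
        return 0.0
    return max(counts.values()) / sum(counts.values())


def q_score(predicted, observed):
    """Fraction of positions where predicted and observed codes agree.

    With three states this is the usual Q3 idea; with seven states it is
    the analogous Q7-style score.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy data: two short chains with made-up structure codes.
toy_pairs = [("ACDEFG", "AAABBC"), ("ACDEKL", "AAACCC")]
toy_table = build_contingency(toy_pairs)
ambiguous = dominant_fraction(toy_table, "ACDE")   # seen with two structures
unique = dominant_fraction(toy_table, "CDEF")      # seen with one structure
agreement = q_score("AABBC", "AABCC")              # 4 of 5 positions agree
```

Swapping the two strings in each pair before calling `build_contingency` gives the structure-to-sequence direction with the same code.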
