SOPM: a self-optimized method for protein secondary structure prediction.

A new method, the self-optimized prediction method (SOPM), has been developed to improve the success rate of protein secondary structure prediction. The method has been checked against an updated release of the Kabsch and Sander database, 'DATABASE.DSSP', comprising 239 protein chains. The first step of SOPM is to build sub-databases of protein sequences and their known secondary structures drawn from 'DATABASE.DSSP' by (i) making binary comparisons of all protein sequences and (ii) taking into account the prediction of structural classes of proteins. The second step is to submit each protein of the sub-database to a secondary structure prediction using a predictive algorithm based on sequence similarity. The third step is to iteratively determine the predictive parameters that optimize the prediction quality on the whole sub-database. The last step is to apply the final parameters to the query sequence. The method correctly predicts 69% of amino acids for a three-state description of the secondary structure (alpha helix, beta sheet and coil) in the whole database (46,011 amino acids). The correlation coefficients are C alpha = 0.54, C beta = 0.50 and C coil = 0.48. Root mean square deviations of 10% in the secondary structure content are obtained. Implications for users are drawn, so as to derive an accuracy at the amino acid level and to provide a guide for secondary structure prediction. SOPM is available by anonymous ftp to ibcp.fr.
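
The two quality measures quoted above, the percentage of correctly predicted residues in three states and the per-state correlation coefficients, can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the conventional per-residue accuracy (often called Q3) and the Matthews correlation coefficient. The one-letter state codes H, E and C, the function names and the example strings are all hypothetical and are not taken from SOPM.

```python
from math import sqrt


def q3(observed: str, predicted: str) -> float:
    """Fraction of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


def state_correlation(observed: str, predicted: str, state: str) -> float:
    """Matthews correlation coefficient for one state (assumed definition)."""
    tp = fp = fn = tn = 0
    for o, p in zip(observed, predicted):
        if p == state and o == state:
            tp += 1  # predicted in the state, observed in the state
        elif p == state:
            fp += 1  # predicted in the state, observed elsewhere
        elif o == state:
            fn += 1  # observed in the state, predicted elsewhere
        else:
            tn += 1  # neither observed nor predicted in the state
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when the coefficient is undefined (a row or column is empty).
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy strings: H = alpha helix, E = beta sheet, C = coil.
observed = "HHHHEEEECC"
predicted = "HHHCEEECCC"
accuracy = q3(observed, predicted)
c_alpha = state_correlation(observed, predicted, "H")
```

Under these assumptions, a figure such as the 69% reported above corresponds to q3 computed over all residues of the database, and each coefficient corresponds to state_correlation evaluated for one of the three states.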