Review: protein secondary structure prediction continues to rise.

Methods for predicting protein secondary structure improved substantially in the 1990s through the use of evolutionary information taken from the divergence of proteins in the same structural family. Recently, the evolutionary information resulting from improved searches and larger databases has again boosted prediction accuracy by more than four percentage points, to its current level of around 76% of all residues predicted correctly in one of the three states: helix, strand, and other. The past year also brought successful new concepts to the field. These new methods may be particularly interesting in light of the improvements achieved by simply combining existing methods. Divergent evolutionary profiles contain enough information not only to substantially improve prediction accuracy, but also to correctly predict long stretches of identical residues observed in alternative secondary structure states depending on nonlocal conditions. An example is a method that automatically identifies structural switches and thereby finds a remarkable connection between predicted secondary structure and aspects of function. Secondary structure predictions are increasingly becoming the workhorse for numerous methods aimed at predicting protein structure and function. Is the recent increase in accuracy significant enough to make predictions even more useful? Because the recent improvement yields a better prediction of segments, and in particular of beta strands, I believe the answer is affirmative. What is the limit of prediction accuracy? We shall see.
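
The per-residue three-state accuracy quoted above (the percentage of residues predicted correctly as helix, strand, or other) can be illustrated with a minimal sketch. The one-letter labels `H`, `E`, `L`, the function names `q3` and `majority_consensus`, and the toy sequences are assumptions made for illustration only. The majority vote stands in for "combining existing methods" in the simplest possible way and is not the scheme of any particular published method.

```python
from collections import Counter
from typing import Sequence

# Assumed one-letter labels: H = helix, E = strand, L = other.
STATES = frozenset("HEL")


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    if not (set(predicted) <= STATES and set(observed) <= STATES):
        raise ValueError("states must be drawn from H, E, L")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def majority_consensus(predictions: Sequence[str]) -> str:
    """Per-residue majority vote over several predictions of equal length.

    Ties are resolved in favour of the state proposed by the earliest
    method in the list (Counter keeps first-seen order for equal counts).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*predictions)
    )


# Toy data, invented for illustration (not taken from any real protein).
observed = "HHHHLLEEEELL"
method_a = "HHHLLLEEEELL"
method_b = "HHHHLLLEEELL"
method_c = "LHHHLLEEEEEL"

consensus = majority_consensus([method_a, method_b, method_c])
for name, pred in [("a", method_a), ("b", method_b), ("c", method_c),
                   ("consensus", consensus)]:
    print(name, round(q3(pred, observed), 1))
```

A per-residue score of this kind says nothing about whether whole segments are placed correctly, which is why the abstract separately stresses the better prediction of segments.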
