Protein secondary structure prediction.

While predicting the native structure of a protein from its sequence remains a challenging problem, computational methods have over the past decades become quite successful at exploiting the mechanisms behind secondary structure formation. The great effort expended in this area has produced a vast number of secondary structure prediction methods. In particular, combining well-optimized, sensitive machine-learning algorithms with homologous sequence information has raised prediction accuracies to as much as 80%. In this chapter, we first introduce some basic notions and give a brief history of advances in secondary structure prediction. We then present a comprehensive overview of state-of-the-art prediction methods. Finally, we discuss open questions and challenges in the field and offer some practical recommendations for the user.
