Amino acid propensities for secondary structures are influenced by the protein structural class.

Amino acid propensities for secondary structures have been used since the 1970s, when Chou and Fasman evaluated them on datasets of a few tens of proteins and developed a method to predict protein secondary structure. That method is still in use, even though prediction methods have since evolved toward very different approaches and higher reliability. Propensity for secondary structure is an intrinsic property of an amino acid and is used in developing new algorithms and prediction methods. Our work therefore aimed to determine which kind of protein dataset is best for evaluating amino acid propensities: a larger but heterogeneous set, or smaller but homogeneous sets, i.e., all-alpha, all-beta, or alpha-beta proteins. In a first analysis, we evaluated amino acid propensities for helix, beta-strand, and coil in more than 2000 proteins from the PDBselect dataset. With these propensities, secondary structure predictions performed with a method very similar to that of Chou and Fasman gave better results than the original method, which was based on propensities derived from the few tens of X-ray protein structures available in the 1970s. In a refined analysis, we subdivided the PDBselect proteins into three secondary structural classes, i.e., all-alpha, all-beta, and alpha-beta proteins. For each class, the amino acid propensities for helix, beta-strand, and coil were calculated and used to predict secondary structure elements for proteins belonging to the same class, using resubstitution and jackknife tests. This second round of predictions further improved on the results of the first round. Amino acid propensities for secondary structures therefore become more reliable as the homogeneity of the protein dataset used to evaluate them increases. Our results also indicate that all algorithms that use propensities for secondary structure can still be improved to obtain better predictions.
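The abstract does not state the formula used, so the following is only a minimal sketch of how class-specific propensities could be computed. It assumes the usual Chou-Fasman-style definition, in which the propensity of amino acid a for state s is the fraction of a found in s divided by the overall fraction of residues in s. The function names, the three-letter state alphabet `HEC` (helix, strand, coil), and the input layout are illustrative assumptions, not the authors' code.

```python
from collections import Counter


def propensities(sequences, structures, states="HEC"):
    """Chou-Fasman-style propensity: P(a, s) = (n_as / n_a) / (n_s / n).

    `sequences` and `structures` are parallel lists of equal-length strings
    (one-letter amino acids and one-letter secondary structure states).
    The state alphabet "HEC" is an assumption made for this sketch.
    """
    aa_counts = Counter()
    state_counts = Counter()
    pair_counts = Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure strings differ in length")
        for aa, s in zip(seq, ss):
            aa_counts[aa] += 1
            state_counts[s] += 1
            pair_counts[aa, s] += 1

    total = sum(aa_counts.values())
    table = {}
    for aa, n_a in aa_counts.items():
        for s in states:
            n_s = state_counts[s]
            if n_s == 0:
                continue  # state absent from this dataset: propensity undefined
            table[aa, s] = (pair_counts[aa, s] / n_a) / (n_s / total)
    return table


def propensities_by_class(entries):
    """Compute one propensity table per structural class.

    `entries` is an iterable of (class_label, sequence, structure) tuples,
    e.g. class labels "all-alpha", "all-beta", "alpha-beta" (hypothetical layout).
    """
    grouped = {}
    for label, seq, ss in entries:
        seqs, sss = grouped.setdefault(label, ([], []))
        seqs.append(seq)
        sss.append(ss)
    return {label: propensities(seqs, sss) for label, (seqs, sss) in grouped.items()}
```

A value above 1 means the residue occurs in that state more often than expected from the overall composition of the dataset. Computing the table separately per class, as in `propensities_by_class`, mirrors the refined analysis described above. A jackknife test would additionally leave the protein being predicted out of the table before predicting it.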
