Protein Secondary Structure Prediction with Support Vector Machines

This paper presents a method for protein secondary structure prediction with support vector machines. The system uses two layers of support vector machines, with a weighted cost function to compensate for the uneven class memberships. With this method, prediction accuracy reaches 71.5%, comparable to the best techniques available.
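
The abstract does not say how the cost weights are chosen or how residues are encoded. As a rough illustration only, the sketch below assumes inverse-frequency weights, so that rarer classes get a higher misclassification cost, and a fixed-width sliding window over the sequence as classifier input. The function names, the window width, the padding character, and the toy label string are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def balanced_class_costs(labels):
    """Per-class cost inversely proportional to class frequency.

    Assumption: cost = n_samples / (n_classes * class_count). This is one
    common balancing heuristic, not necessarily the paper's weighting.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def sliding_windows(sequence, half_width, pad="-"):
    """Return one window of 2 * half_width + 1 symbols centred on each residue.

    The ends of the sequence are padded so every residue gets a full window.
    """
    padded = pad * half_width + sequence + pad * half_width
    width = 2 * half_width + 1
    return [padded[i:i + width] for i in range(len(sequence))]


# Toy three-state labelling (H = helix, E = strand, C = coil), invented here.
toy_labels = "HHHHEECCCCCC"
costs = balanced_class_costs(toy_labels)

# Toy amino acid sequence, one window per residue.
windows = sliding_windows("ACD", half_width=1)
```

In a two-layer arrangement of the kind the abstract describes, the same windowing idea could plausibly be applied a second time to the first layer's per-residue outputs, although the abstract gives no details of how the layers are connected.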
