Paper Information - Exemplar-Based Learning to Predict Protein Folding

Exemplar-Based Learning to Predict Protein Folding

Abstract: A new machine learning technique is presented for predicting protein secondary structure from primary sequence data, a task that is a valuable step towards understanding protein folding. The technique involves storing large numbers of points in a multi-dimensional space, and using an extensively modified nearest neighbor method to make predictions. The learning program was trained on a set of 101 proteins of known structure, and tested on a separate set of 28 additional proteins. The maximum overall predictive accuracy was 71.0%, which surpasses recent tests using neural nets, as well as other, more traditional methods. We further observed that some sequences of residues were considerably easier to classify than others.

Steven L. Salzberg | Scott Cost
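
The abstract describes storing many exemplars as points and classifying new residues by a nearest neighbor rule. The sketch below illustrates only that general idea, using a plain k-nearest-neighbor vote with a simple mismatch count over fixed-width residue windows. It is not the authors' method: their modifications to nearest neighbor are not described in this record. The window width, the padding character, the three-letter labels (H, E, C), and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter


def windows(sequence, width=5, pad="-"):
    """Return one fixed-width window per residue, centered on that residue.

    The window width and the pad character are illustrative assumptions.
    """
    half = width // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + width] for i in range(len(sequence))]


def mismatches(a, b):
    """Count positions where two equal-length windows differ.

    This simple overlap distance is a stand-in, not the paper's metric.
    """
    return sum(x != y for x, y in zip(a, b))


def train(examples, width=5):
    """Store every (window, label) pair as an exemplar; nothing is fitted."""
    memory = []
    for sequence, labels in examples:
        if len(sequence) != len(labels):
            raise ValueError("each residue needs exactly one label")
        memory.extend(zip(windows(sequence, width), labels))
    return memory


def predict(memory, sequence, width=5, k=1):
    """Label each residue by majority vote among its k nearest exemplars."""
    predicted = []
    for window in windows(sequence, width):
        nearest = sorted(memory, key=lambda ex: mismatches(window, ex[0]))[:k]
        votes = Counter(label for _, label in nearest)
        predicted.append(votes.most_common(1)[0][0])
    return "".join(predicted)


# Hypothetical toy data, not taken from the paper:
# H = helix, C = coil (an assumed labeling convention).
toy_examples = [("AAAAGGGG", "HHHHCCCC")]
memory = train(toy_examples)
prediction = predict(memory, "AAAAGGGG")
```

With a single stored sequence, predicting on that same sequence simply recovers its labels, since every window matches one stored exemplar exactly. A realistic evaluation would follow the abstract's setup and keep training and test proteins separate.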