Human-Readable Rule Generator for Integrating Amino Acid Sequence Information and Stability of Mutant Proteins

Most bioinformatics tools developed for predicting mutant protein stability operate as black boxes, so the relationship between amino acid sequence/structure and stability is hidden from users. We have addressed this problem by developing a human-readable rule generator that integrates knowledge of amino acid sequence with experimental stability changes upon single mutations. Using information about the original residue, the substituted residue, and three neighboring residues, classification rules are generated to discriminate between stabilizing and destabilizing mutants and to explore the basis of the experimental data. Because these rules are human readable, the method enhances the synergy between expert knowledge and computational systems. The performance of the rules has been assessed on a nonredundant data set of 1,859 mutants, and we obtained an accuracy of 80 percent using cross validation. The results show that the method can be used effectively as a tool for both knowledge discovery and prediction of mutant protein stability. We have developed a Web-based classification rule generator, which is freely available at http://bioinformatics.myweb.hinet.net/irobot.htm.
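To make the input representation concrete, the following is a minimal Python sketch of how the features named above (original residue, substituted residue, and neighboring residues) could be extracted and matched against human-readable rules. It is not our implementation: the names `MutationFeatures`, `extract_features`, `EXAMPLE_RULES`, and `classify` are hypothetical, the choice of three neighbors on each side of the mutation site is an assumption (the text says only "three neighboring residues"), and the two example rules are placeholders rather than rules generated from the data set.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class MutationFeatures:
    """Sequence-only description of a single-point mutation."""
    wild_type: str  # original residue (one-letter code)
    mutant: str     # substituted residue (one-letter code)
    left: str       # neighbors on the N-terminal side, padded with '-'
    right: str      # neighbors on the C-terminal side, padded with '-'


def extract_features(sequence: str, position: int, mutant: str,
                     window: int = 3, pad: str = "-") -> MutationFeatures:
    """Collect the features for a mutation at a 1-based position.

    ASSUMPTION: `window` residues are taken on each side of the site.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    i = position - 1
    left = sequence[max(0, i - window):i].rjust(window, pad)
    right = sequence[i + 1:i + 1 + window].ljust(window, pad)
    return MutationFeatures(sequence[i], mutant, left, right)


# Each rule: (readable description, predicate, predicted class).
Rule = Tuple[str, Callable[[MutationFeatures], bool], str]

HYDROPHOBIC = set("AVILMFWC")

# ASSUMPTION: placeholder rules for illustration only, not from the paper.
EXAMPLE_RULES: List[Rule] = [
    ("hydrophobic residue replaced by glycine",
     lambda f: f.wild_type in HYDROPHOBIC and f.mutant == "G",
     "destabilizing"),
    ("glycine replaced by alanine next to a hydrophobic residue",
     lambda f: f.wild_type == "G" and f.mutant == "A"
     and (f.left[-1] in HYDROPHOBIC or f.right[0] in HYDROPHOBIC),
     "stabilizing"),
]


def classify(features: MutationFeatures, rules: List[Rule],
             default: str = "destabilizing") -> Tuple[str, str]:
    """Return (class, description) of the first matching rule."""
    for description, predicate, label in rules:
        if predicate(features):
            return label, description
    return default, "no rule matched (default class)"


if __name__ == "__main__":
    feats = extract_features("MKTAYIAKQR", position=6, mutant="G")
    print(feats)
    print(classify(feats, EXAMPLE_RULES))
```

Returning the description of the matching rule alongside the predicted class is what keeps such a classifier inspectable: a user can see which sequence pattern produced the prediction, which is the property the abstract emphasizes.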
