Biological Features for Sequence-Based Prediction of Protein Stability Changes upon Amino Acid Substitutions

Protein destabilization is a common mechanism by which amino acid substitutions cause human disease. In this study, a new machine learning method was developed for sequence-based prediction of protein stability changes upon single amino acid substitutions. Support vector machines were trained with data from experimental studies of the free energy change of protein stability upon mutation. To construct accurate classifiers, twenty biological features were examined for input vector encoding. Classifier performance varied significantly depending on which features were used. The most accurate classifier, built from a combination of several biological features, achieved an overall accuracy of 82.24% with 75.24% sensitivity and 85.36% specificity. Predictions at this level of accuracy may be useful in human genetic studies for distinguishing deleterious from tolerated alterations in disease candidate genes.
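For readers less familiar with the three reported measures, the sketch below shows how accuracy, sensitivity, and specificity are conventionally computed from binary predictions. It is an illustration only, not the authors' evaluation code: the function name `classification_metrics`, the choice of label `1` as the positive class, and the toy labels are all assumptions, since the abstract does not state which class (for example, destabilizing substitutions) was treated as positive.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels.

    `positive` marks the class counted as a "hit"; which class the study
    treated as positive is an assumption here, not taken from the abstract.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1          # positive correctly predicted
        elif truth == positive:
            fn += 1          # positive missed
        elif pred == positive:
            fp += 1          # negative wrongly flagged as positive
        else:
            tn += 1          # negative correctly predicted

    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: fraction of true positives that were recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of true negatives that were recovered.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels (hypothetical, not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because accuracy pools both classes, it can sit between sensitivity and specificity, as in the reported figures, when the two classes are of unequal size.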
