Robust Prediction of Single and Multiple Point Protein Mutations Stability Changes

Accurate prediction of protein stability changes resulting from amino acid substitutions is of utmost importance in medicine for understanding which mutations are deleterious, leading to disease, and which are neutral. Because wet-lab experiments to characterize protein mutations are costly and time-consuming, and because the number of possible mutations is huge, there is a strong need for computational methods that can accurately predict the effects of amino acid mutations. In this research, we present a robust methodology to predict the energy changes of a protein upon mutation. The proposed prediction scheme is a two-step algorithm: a Holdout Random Sampler followed by a neural network model for regression. The Holdout Random Sampler is used to analyze the energy change and its corresponding uncertainty, and to obtain a set of admissible energy changes, expressed as a cumulative distribution function. These values are then used to train a simple neural network model that predicts the energy changes. Results were blindly validated against experimental energy changes, giving Pearson correlation coefficients of 0.66 for Single Point Mutations and 0.77 for Multiple Point Mutations. These results support the effectiveness of our method, since it outperforms the majority of previous studies in this field.
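
The sampling and validation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain linear least-squares model as a stand-in for the regressor, uses synthetic data in place of experimental energy changes, and omits the neural network stage. The names `holdout_sampler` and `pearson`, the feature matrix `X`, and all parameter values are hypothetical. The sketch fits the model on many random training subsets, collects the resulting ensemble of predicted energy changes for each mutation, summarizes that ensemble by percentiles of its empirical distribution, and scores the median prediction with the Pearson correlation coefficient.

```python
import numpy as np


def holdout_sampler(X, y, n_holdouts=200, train_frac=0.75, seed=0):
    """Repeated random holdouts (illustrative assumption, not the paper's code).

    Each holdout fits a linear least-squares model on a random training
    subset and predicts every sample. Returns an array of shape
    (n_holdouts, n_samples) holding the ensemble of predictions.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_train = int(train_frac * n)
    A = np.column_stack([X, np.ones(n)])  # append an intercept column
    preds = np.empty((n_holdouts, n))
    for k in range(n_holdouts):
        train = rng.choice(n, size=n_train, replace=False)
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        preds[k] = A @ coef
    return preds


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-in data: 120 "mutations" described by 5 hypothetical features.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 5))
true_w = np.array([1.0, -0.5, 0.3, 0.0, 0.8])
y = X @ true_w + rng.normal(scale=0.5, size=120)  # noisy "energy changes"

preds = holdout_sampler(X, y)

# Per-mutation summary of the sampled predictions: median and a 5-95% band
# read off the empirical cumulative distribution of the ensemble.
median_pred = np.median(preds, axis=0)
lower, upper = np.percentile(preds, [5, 95], axis=0)

r = pearson(median_pred, y)
print(f"Pearson r between median prediction and target: {r:.2f}")
```

The width of the band between `lower` and `upper` gives a rough per-mutation uncertainty estimate. In a two-step scheme of the kind described, such sampled values would then serve as training data for the regression network.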
