Smooth orientation-dependent scoring function for coarse-grained protein quality assessment

Motivation: Protein quality assessment (QA) is a crucial element of protein structure prediction, a fundamental and still open problem in structural bioinformatics. QA aims to rank predicted protein models so that the best candidates can be selected. The assessment can be based either on a single model or on a consensus derived from an ensemble of models. The consensus strategy can yield very high performance, but it depends substantially on the pool of available candidate models, which limits its applicability. Single-model QA methods therefore remain an important research target, also because they can assist the sampling of candidate models.

Results: We present a novel single-model QA method called SBROD (Smooth Backbone-Reliant Orientation-Dependent). SBROD uses only the backbone conformation of a protein, so it can be applied to scoring coarse-grained protein models. The method deduces its scoring function from a training set of protein models. The scoring function is composed of four terms related to different structural features: residue-residue orientations, contacts between backbone atoms, hydrogen bonding, and solvent-solute interactions. It is smooth with respect to atomic coordinates and is thus potentially applicable to continuous gradient-based optimization of protein conformations. It can also be used for coarse-grained protein modeling and computational protein design. On diverse datasets (CASP11, CASP12, and MOULDER), SBROD achieves performance similar to that of state-of-the-art single-model QA methods.

Availability: The standalone application, implemented in C++ and Python, is freely available at https://gitlab.inria.fr/grudinin/sbrod and is supported on Linux, macOS, and Windows.

Supplementary information: Supplementary data are available at Bioinformatics online.
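The smoothness property mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the SBROD implementation: the sigmoid switching function, the 8 Å cutoff, the 0.5 Å width, the sequence separation of 3, and the names `smooth_contact`, `hard_contact`, `contact_feature`, and `linear_score` are all assumptions chosen for illustration. The sketch contrasts a hard distance cutoff, which jumps when an atom crosses the threshold, with a smooth contact function that varies continuously with the coordinates, and then combines four hypothetical feature values into one score with a weighted sum.

```python
import math


def smooth_contact(distance, cutoff=8.0, width=0.5):
    """Sigmoid switch: near 1 well inside the cutoff, near 0 well outside.

    Written with tanh, which is equivalent to 1 / (1 + exp((d - cutoff) / width))
    but cannot overflow for large distances. Cutoff and width are assumed values.
    """
    return 0.5 * (1.0 - math.tanh((distance - cutoff) / (2.0 * width)))


def hard_contact(distance, cutoff=8.0):
    """Step function: discontinuous at the cutoff, hence not differentiable there."""
    return 1.0 if distance < cutoff else 0.0


def contact_feature(coords, contact_fn, min_separation=3):
    """Sum a pairwise contact function over points at least
    `min_separation` positions apart along the chain (assumed convention)."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            total += contact_fn(math.dist(coords[i], coords[j]))
    return total


def linear_score(features, weights):
    """Weighted sum of named feature values (hypothetical combination rule)."""
    return sum(weights[name] * value for name, value in features.items())


# Moving a pair across the cutoff by 0.002 A flips the hard count entirely,
# while the smooth count changes only slightly.
hard_jump = hard_contact(7.999) - hard_contact(8.001)
smooth_change = smooth_contact(7.999) - smooth_contact(8.001)

# Four placeholder feature values, one per term named in the abstract.
features = {"orientation": 1.0, "contacts": 2.0, "hbonds": 3.0, "solvation": 4.0}
weights = {name: 0.5 for name in features}  # hypothetical weights, not trained
score = linear_score(features, weights)
```

Because every term in such a score is a smooth function of the coordinates, the total is differentiable as well, which is the property that makes gradient-based refinement of a conformation possible in principle.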
