Protein single-model quality assessment by feature-based probability density functions

Protein quality assessment (QA) plays an important role in protein structure prediction. We developed a novel single-model quality assessment method, Qprob. Qprob calculates the absolute error of each protein feature value against the true quality scores (i.e., GDT-TS scores) of protein structural models, and uses these errors to estimate a probability density distribution for each feature, which is then used for quality assessment. Qprob was blindly tested in the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11) as the MULTICOM-NOVEL server. The official CASP results show that Qprob ranks as one of the top single-model QA methods. In addition, Qprob contributes to our protein tertiary structure predictor MULTICOM, which was officially ranked 3rd out of 143 predictors. This performance indicates that Qprob is effective at assessing the quality of models of hard targets. These results demonstrate that the new probability density distribution based method is effective for protein single-model quality assessment and useful for protein structure prediction. The Qprob web server, through which the software is freely available, is at: http://calla.rnet.missouri.edu/qprob/.
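The idea described above (model the absolute error of each feature against GDT-TS, then use the resulting densities to score a new model) can be illustrated with a minimal Python sketch. This is not the Qprob implementation: the abstract does not state the density family or how features are combined, so the Gaussian error model, the independent product of per-feature densities, the grid search over candidate scores, and all function and feature names below are assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def fit_error_model(feature_values, true_scores):
    """Return (mean, std) of the absolute errors |feature - true GDT-TS|.

    Assumes feature values are already scaled to the same [0, 1] range
    as the GDT-TS scores.
    """
    errors = [abs(f - t) for f, t in zip(feature_values, true_scores)]
    mu = mean(errors)
    sigma = max(pstdev(errors), 1e-6)  # guard against a zero spread
    return mu, sigma


def normal_pdf(x, mu, sigma):
    """Gaussian density (an assumed choice of distribution)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def predict_quality(features, error_models, grid_size=101):
    """Return the candidate score in [0, 1] with the highest combined density.

    `features` maps a hypothetical feature name to its value for one model;
    `error_models` maps the same names to (mean, std) pairs.
    """
    best_score, best_log_density = 0.0, -math.inf
    for i in range(grid_size):
        candidate = i / (grid_size - 1)
        log_density = 0.0
        for name, value in features.items():
            mu, sigma = error_models[name]
            density = normal_pdf(abs(value - candidate), mu, sigma)
            # Sum logs instead of multiplying to avoid numeric underflow.
            log_density += math.log(density + 1e-300)
        if log_density > best_log_density:
            best_score, best_log_density = candidate, log_density
    return best_score


# Toy training data: feature values and true GDT-TS scores of known models.
error_models = {
    "feature_a": fit_error_model([0.50, 0.70, 0.30, 0.90], [0.48, 0.74, 0.31, 0.85]),
    "feature_b": fit_error_model([0.40, 0.60, 0.80], [0.45, 0.50, 0.90]),
}
predicted = predict_quality({"feature_a": 0.6, "feature_b": 0.6}, error_models)
```

Because only the magnitude of the error is modeled in this sketch, it cannot tell whether a feature tends to overestimate or underestimate the true score; a signed-error model would be one way to address that.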
