Two New Heuristic Methods for Protein Model Quality Assessment

Protein tertiary structure prediction is an important open challenge in bioinformatics and requires effective methods to accurately evaluate the quality of computationally generated protein 3-D models. Many quality assessment (QA) methods have been proposed over the past three decades, but their accuracy and robustness remain unsatisfactory for practical applications. In this paper, two new heuristic QA methods are proposed: MUfoldQA_S and MUfoldQA_C. MUfoldQA_S is a quasi-single-model QA method that assesses model quality based on known protein structures with similar sequences. This algorithm can be applied directly to protein fragments without the need to build a full structural model. A BLOSUM-based heuristic is also introduced to help differentiate accurate templates from poor ones. MUfoldQA_C combines the ideas from MUfoldQA_S with the consensus approach to create a multi-model QA method that can also utilize information from existing reference models, and it demonstrates improved performance. Extensive experiments show that both methods improve significantly on existing methods. In addition, both methods were blindly tested in CASP12, the world-wide competition in the protein structure prediction field, and ranked as top performers in their respective categories.
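As a rough illustration of the two ideas named above, the following Python sketch shows a generic consensus score (each model is scored by its mean similarity to the other models in a pool) and a weighted score against reference structures. It is a minimal sketch under stated assumptions, not the MUfoldQA_S or MUfoldQA_C implementation: the function names, the precomputed similarity values (for example GDT-TS or TM-score in [0, 1]), and the per-reference weights (which could, hypothetically, come from a BLOSUM-based alignment score) are all illustrative.

```python
from typing import List, Sequence


def consensus_scores(similarity: Sequence[Sequence[float]]) -> List[float]:
    """Score each model by its mean similarity to every other model.

    `similarity[i][j]` is an assumed, precomputed structural similarity
    between model i and model j; the diagonal is ignored.
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("need at least two models for a consensus score")
    return [
        sum(similarity[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def weighted_reference_score(
    sims_to_refs: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted mean similarity of one model to a set of references.

    `weights` are hypothetical per-reference confidences; larger values
    mean the reference is trusted more.
    """
    if len(sims_to_refs) != len(weights):
        raise ValueError("one weight is required per reference")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(sims_to_refs, weights)) / total


# Toy data: models 0 and 1 resemble each other, model 2 is an outlier.
toy_similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
toy_scores = consensus_scores(toy_similarity)
toy_weighted = weighted_reference_score([0.9, 0.5], [3.0, 1.0])
```

In this toy pool the outlier (model 2) receives the lowest consensus score, which is the behaviour a consensus approach relies on. The weighted variant lets a more trusted reference dominate the estimate.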
