Estimation of model accuracy in CASP13

Methods to reliably estimate the accuracy of 3D models of proteins are both a fundamental part of most protein folding pipelines and important for reliable identification of the best models when multiple pipelines are used. Here, we describe the progress made from CASP12 to CASP13 in the field of estimation of model accuracy (EMA) as seen from the progress of the most successful methods in CASP13. We show small but clear progress, that is, several methods perform better than the best methods from CASP12 when tested on CASP13 EMA targets. Some progress is driven by applying deep learning and residue‐residue contacts to model accuracy prediction. We show that the best EMA methods select better models than the best servers in CASP13, but that there exists a great potential to improve this further. Also, according to the evaluation criteria based on local similarities, such as lDDT and CAD, it is now clear that single model accuracy methods perform relatively better than consensus‐based methods.

[1]  W. Kabsch,et al.  Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.

[2]  M. Sippl Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins. , 1990, Journal of molecular biology.

[3]  M. Sippl Calculation of conformational ensembles from potentials of mena force , 1990 .

[4]  Roland L. Dunbrack,et al.  Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: a new homology modeling tool. , 1997, Journal of molecular biology.

[5]  M. Levitt,et al.  A unified statistical framework for sequence comparison and structure comparison. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[6]  J. Richardson,et al.  Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. , 1999, Journal of molecular biology.

[7]  D T Jones,et al.  Protein secondary structure prediction based on position-specific scoring matrices. , 1999, Journal of molecular biology.

[8]  Arne Elofsson,et al.  A study of quality measures for protein threading models , 2001, BMC Bioinformatics.

[9]  J Lundström,et al.  Pcons: A neural‐network–based consensus predictor that improves fold recognition , 2001, Protein science : a publication of the Protein Society.

[10]  Adam Zemla,et al.  LGA: a method for finding 3D similarities in protein structures , 2003, Nucleic Acids Res..

[11]  Adrian A Canutescu,et al.  A graph‐theory algorithm for rapid protein side‐chain prediction , 2003, Protein science : a publication of the Protein Society.

[12]  A. Elofsson,et al.  Can correct protein models be identified? , 2003, Protein science : a publication of the Protein Society.

[13]  Arne Elofsson,et al.  3D-Jury: A Simple Approach to Improve Protein Structure Predictions , 2003, Bioinform..

[14]  Yang Zhang,et al.  Scoring function for automated assessment of protein structure template quality , 2004, Proteins.

[15]  J. Skolnick,et al.  TM-align: a protein structure alignment algorithm based on the TM-score , 2005, Nucleic acids research.

[16]  Pierre Baldi,et al.  SCRATCH: a protein structure and structural feature prediction server , 2005, Nucleic Acids Res..

[17]  Arne Elofsson,et al.  Pcons5: combining consensus, structural evaluation and fold recognition scores , 2005, Bioinform..

[18]  A. Sali,et al.  Statistical potential for assessment and prediction of protein structures , 2006, Protein science : a publication of the Protein Society.

[19]  A T Zemla Protein Classification Based on Analysis of Local Sequence-Structure Correspondence , 2006 .

[20]  Arne Elofsson,et al.  Identification of correct regions in protein models using structural, alignment, and consensus information , 2006, Protein science : a publication of the Protein Society.

[21]  András Fiser,et al.  Effects of amino acid composition, finite size of proteins, and sparse statistics on distance‐dependent statistical pair potentials , 2007, Proteins.

[22]  Arne Elofsson,et al.  Prediction of global and local model quality in CASP7 using Pcons and ProQ , 2007, Proteins.

[23]  Arne Elofsson,et al.  Quality Assessment of Protein Models , 2008 .

[24]  Jianpeng Ma,et al.  OPUS-PSP: an orientation-dependent statistical all-atom potential derived from side-chain packing. , 2008, Journal of molecular biology.

[25]  Liam J. McGuffin,et al.  The ModFOLD server for the quality assessment of protein structural models , 2008, Bioinform..

[26]  Jianlin Cheng,et al.  Evaluating the absolute quality of a single protein model using structural features and support vector machines , 2009, Proteins.

[27]  Yang Zhang,et al.  A Novel Side-Chain Orientation Dependent Potential Derived from Random-Walk Reference State for Protein Fold Selection and Structure Prediction , 2010, PloS one.

[28]  Liam J. McGuffin,et al.  Rapid model quality assessment for protein structure predictions using the comparison of multiple models without structural alignments , 2010, Bioinform..

[29]  Jianlin Cheng,et al.  APOLLO: a quality assessment service for single and multiple protein models , 2011, Bioinform..

[30]  J. Skolnick,et al.  GOAP: a generalized orientation-dependent, all-atom statistical potential for protein structure prediction. , 2011, Biophysical journal.

[31]  A. Biegert,et al.  HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment , 2011, Nature Methods.

[32]  Björn Wallner,et al.  Improved model quality assessment using ProQ2 , 2012, BMC Bioinformatics.

[33]  Marco Biasini,et al.  lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests , 2013, Bioinform..

[34]  Liam J. McGuffin,et al.  The ModFOLD4 server for the quality assessment of 3D protein models , 2013, Nucleic Acids Res..

[35]  Daniel B. Roche,et al.  Assessing the quality of modelled 3D protein structures using the ModFOLD server. , 2014, Methods in molecular biology.

[36]  Marco Biasini,et al.  Assessing the local structural quality of transmembrane protein models using statistical potentials (QMEANBrane) , 2014, Bioinform..

[37]  Pierre Baldi,et al.  SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity , 2014, Bioinform..

[38]  Kliment Olechnovic,et al.  Voronota: A fast and reliable tool for computing the vertices of the Voronoi diagram of atomic balls , 2014, J. Comput. Chem..

[39]  Jilong Li,et al.  Massive integration of diverse protein quality assessment methods to improve template based modeling in CASP11 , 2016, Proteins.

[40]  Arne Elofsson,et al.  ProQ3: Improved model quality assessments using Rosetta energy terms , 2016, Scientific Reports.

[41]  Karolis Uziela,et al.  ProQ2: estimation of model accuracy implemented in Rosetta , 2016, Bioinform..

[42]  Renzhi Cao,et al.  Protein single-model quality assessment by feature-based probability density functions , 2016, Scientific Reports.

[43]  Jie Hou,et al.  DeepQA: improving the estimation of single protein model quality with deep belief networks , 2016, BMC Bioinformatics.

[44]  Yan Wang,et al.  ResQ: An Approach to Unified Estimation of B-Factor and Residue-Specific Error in Protein Structure Prediction. , 2016, Journal of molecular biology.

[45]  Liam J. McGuffin,et al.  ModFOLD6: an accurate web server for the global and local quality estimation of 3D protein models , 2017, Nucleic Acids Res..

[46]  Arne Elofsson,et al.  ProQ3D: improved model quality assessments using deep learning , 2016, Bioinform..

[47]  Kliment Olechnovič,et al.  VoroMQA: Assessment of protein structure quality using interatomic contact areas , 2017, Proteins.

[48]  Gyu-Il Cha,et al.  Concurrent Execution of Multiple Deep Learning Applications on GPU , 2017 .

[49]  Alessandro Barbato,et al.  Continuous Automated Model EvaluatiOn (CAMEO) complementing the critical assessment of structure prediction in CASP12 , 2018, Proteins.

[50]  David Menéndez Hurtado,et al.  Improved protein model quality assessments by changing the target function , 2018, Proteins.

[51]  Jie Hou,et al.  DNCON2: improved protein contact prediction using two-level deep convolutional neural networks , 2017, bioRxiv.

[52]  Arne Elofsson,et al.  Methods for estimation of model accuracy in CASP12 , 2017, bioRxiv.

[53]  Torsten Schwede,et al.  SWISS-MODEL: homology modelling of protein structures and complexes , 2018, Nucleic Acids Res..

[54]  Arne Elofsson,et al.  Deep transfer learning in the assessment of the quality of protein models , 2018, 1804.06281.

[55]  Renzhi Cao,et al.  Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13 , 2019, Proteins.

[56]  Liam J McGuffin,et al.  IntFOLD: an integrated web resource for high performance protein structure and function prediction , 2019, Nucleic Acids Res..

[57]  Renzhi Cao,et al.  Deep convolutional neural networks for predicting the quality of single protein structural models , 2019, bioRxiv.

[58]  Kliment Olechnovic,et al.  Comparative analysis of methods for evaluation of protein models against native structures , 2018, Bioinform..

[59]  Sergei Grudinin,et al.  Smooth orientation-dependent scoring function for coarse-grained protein quality assessment , 2018, Bioinform..