Quality Assessment of Protein Models

When building a protein model, with or without the aid of experimental information, it is often necessary to use an independent measure to evaluate the correctness of the model. This is the role of Model Quality Assessment programs (MQAPs). Different types of MQAPs have been developed during the last decades. The goal of all these methods is to assess the quality of protein models. However, the definition of quality differs depending on the problem, so it is always important to consider the specific problem to be solved when using an MQAP.

Traditionally, MQAPs are methods that evaluate the quality of a protein model. Until recently, most work focused on developing methods that detect native structures and separate them from incorrect models. In recent years, however, other types of MQAPs, including consensus-based MQAPs, have increased in importance, and today one of the most important uses of MQAPs is to select the best of a set of models built by homology or by other methods. Although these two problems are clearly related, there is no guarantee that a method that works well on the first problem works well on the second, in particular when all of the plausible models are of low quality.

In this chapter we first discuss how MQAPs have been used in the past and how they are used today, and finally we present an analysis of how MQAPs performed in CASP7 [1].

The first use of MQAPs was to detect erroneous models from X-ray crystallography. X-ray-based models might be wrong when the resolution of the diffraction data is low, but other types of errors, including tracing a chain backwards through the electron density, also occur. In this case, the number of residues in disallowed regions of the Ramachandran plot can often be used as an indicator of the quality of the X-ray model. This so-called “stereochemical correctness” could, for instance, have detected the wrongly built small subunit of Rubisco by Eisenberg and coworkers [2], which was built in the reverse direction. In the early 1990s a number of MQAPs were developed to identify wrongly built models using Ramachandran plots and other measures of “stereochemical correctness” as the main source of information. The best known of these methods are PROCHECK [3] and WHATCHECK [4]. Today, with improved refinement methods, the “stereochemical correctness” of X-ray models is almost always very good. Also, with the introduction of the R-free method [5], where a fraction of the data is used only for testing, the need for MQAPs to validate protein models built from crystallographic
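
The R-free idea mentioned above can be illustrated with a minimal Python sketch. It assumes the standard crystallographic R factor, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated separately on the reflections used in refinement (the work set) and on the held-out test reflections. The function names, the 5% test fraction, and the omission of any scale factor between observed and calculated amplitudes are assumptions made for illustration; they are not taken from this chapter or from any particular refinement program.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections.

    Assumes both amplitude lists are already on the same scale.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


def split_work_test(n_reflections, test_fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the test set.

    The 5% default is an illustrative assumption.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_test = max(1, round(n_reflections * test_fraction))
    test = sorted(indices[:n_test])
    work = sorted(indices[n_test:])
    return work, test


def r_work_and_r_free(f_obs, f_calc, work, test):
    """Return (R-work, R-free) for the given index split."""
    def pick(values, idx):
        return [values[i] for i in idx]

    r_work = r_factor(pick(f_obs, work), pick(f_calc, work))
    r_free = r_factor(pick(f_obs, test), pick(f_calc, test))
    return r_work, r_free
```

In this sketch, only the work reflections would be used to fit the model; because the test reflections never influence refinement, an R-free value much higher than R-work suggests that the model has been overfitted to the data.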

[1] Liam J. McGuffin, et al. Improving sequence-based fold recognition by using 3D model quality assessment, 2005, Bioinform.

[2] Kuang Lin, et al. Threading Using Neural nEtwork (TUNE): the measure of protein sequence-structure compatibility, 2002, Bioinform.

[3] Andrzej Kolinski, et al. Protein fragment reconstruction using various modeling techniques, 2003, J. Comput. Aided Mol. Des.

[4] Arne Elofsson, et al. Pcons.net: protein structure prediction meta server, 2007, Nucleic Acids Res.

[5] D. Eisenberg, et al. Assessment of protein models with three-dimensional profiles, 1992, Nature.

[6] A. Elofsson, et al. Can correct protein models be identified?, 2003, Protein science : a publication of the Protein Society.

[7] Daniel Fischer, et al. 3D‐SHOTGUN: A novel, cooperative, fold‐recognition meta‐predictor, 2003, Proteins.

[8] Jakub Pas, et al. Application of 3D‐Jury, GRDB, and Verify3D in fold recognition, 2003, Proteins.

[9] Silvio C. E. Tosatto, et al. The Victor/FRST Function for Model Quality Estimation, 2005, J. Comput. Biol.

[10] Arne Elofsson, et al. Automatic consensus‐based fold recognition using Pcons, ProQ, and Pmodeller, 2003, Proteins.

[11] Arne Elofsson, et al. Identification of correct regions in protein models using structural, alignment, and consensus information, 2006, Protein science : a publication of the Protein Society.

[12] J Lundström, et al. Pcons: A neural‐network–based consensus predictor that improves fold recognition, 2001, Protein science : a publication of the Protein Society.

[13] Arne Elofsson, et al. 3D-Jury: A Simple Approach to Improve Protein Structure Predictions, 2003, Bioinform.

[14] Alfonso Valencia, et al. Predicting reliable regions in protein alignments from sequence profiles, 2003, Journal of molecular biology.

[15] J. Skolnick, et al. A distance‐dependent atomic knowledge‐based potential for improved protein structure selection, 2001, Proteins.

[16] T. Yeates, et al. Verification of protein structures: Patterns of nonbonded atomic interactions, 1993, Protein science : a publication of the Protein Society.

[17] Alexander Tropsha, et al. Development of a four-body statistical pseudo-potential to discriminate native from non-native protein conformations, 2003, Bioinform.

[18] Liam J. McGuffin, et al. Benchmarking consensus model quality assessment for protein fold recognition, 2007, BMC Bioinformatics.

[19] M. Sippl. Recognition of errors in three‐dimensional structures of proteins, 1993, Proteins.

[20] Arne Elofsson, et al. A study of quality measures for protein threading models, 2001, BMC Bioinformatics.

[21] M S Chapman, et al. Tertiary structure of plant RuBisCO: domains and their contacts, 1988, Science.

[22] D. Eisenberg, et al. VERIFY3D: assessment of protein models with three-dimensional profiles, 1997, Methods in enzymology.

[23] Janusz M. Bujnicki, et al. GeneSilico protein structure prediction meta-server, 2003, Nucleic Acids Res.

[24] J. Thornton, et al. PROCHECK: a program to check the stereochemical quality of protein structures, 1993.

[25] C Venclovas, et al. Processing and analysis of CASP3 protein structure predictions, 1999, Proteins.

[26] S. Wodak, et al. Deviations from standard atomic volumes as a quality measure for protein crystal structures, 1996, Journal of molecular biology.

[27] F. Melo, et al. Assessing protein structures with a non-local atomic interaction energy, 1998, Journal of molecular biology.

[28] Leszek Rychlewski, et al. Fold-recognition detects an error in the Protein Data Bank, 2002, Bioinform.

[29] M J Sippl, et al. Helmholtz free energy of peptide hydrogen bonds in proteins, 1996, Journal of molecular biology.

[30] R. Samudrala, et al. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction, 1998, Journal of molecular biology.

[31] R. Jernigan, et al. Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading, 1996, Journal of molecular biology.

[32] Anna Tramontano, et al. Assessment of predictions in the model quality assessment category, 2007, Proteins.

[33] Marcin Feder, et al. A “FRankenstein's monster” approach to comparative modeling: Merging the finest fragments of Fold‐Recognition models and iterative model refinement aided by 3D structure evaluation, 2003, Proteins.

[34] M. Karplus, et al. Crystallographic R Factor Refinement by Molecular Dynamics, 1987, Science.