Toward the estimation of the absolute quality of individual protein structure models

Motivation: Quality assessment of protein structures is an important part of experimental structure validation and plays a crucial role in protein structure prediction, where the predicted models may contain substantial errors. Most current scoring functions are designed primarily to rank alternative models of the same sequence in support of model selection, whereas predicting the absolute quality of an individual protein model has received little attention in the field. However, reliable absolute quality estimates are crucial for assessing the suitability of a model for specific biomedical applications.

Results: In this work, we present a new absolute measure for the quality of protein models, which provides an estimate of the ‘degree of nativeness’ of the structural features observed in a model and describes the likelihood that a given model is of comparable quality to experimental structures. Model quality estimates based on the QMEAN scoring function were normalized with respect to the number of interactions. The resulting scoring function is independent of the size of the protein and may therefore be used to assess both monomers and entire oligomeric assemblies. Model quality scores for individual models are then expressed as ‘Z-scores’ in comparison to scores obtained for high-resolution crystal structures. We demonstrate the ability of the newly introduced QMEAN Z-score to detect experimentally solved protein structures containing significant errors, as well as to evaluate theoretical protein models. In a comprehensive QMEAN Z-score analysis of all experimental structures in the PDB, membrane proteins accumulate on one side of the score spectrum and thermostable proteins on the other. Proteins from the thermophilic organism Thermotoga maritima received significantly higher QMEAN Z-scores in a pairwise comparison with their homologous mesophilic counterparts, underlining the significance of the QMEAN Z-score as an estimate of protein stability.
Availability: The Z-score calculation has been integrated in the QMEAN server available at: http://swissmodel.expasy.org/qmean.

Contact: torsten.schwede@unibas.ch

Supplementary information: Supplementary data are available at Bioinformatics online.
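
The two steps named in the Results, normalizing a score by the number of interactions and expressing it as a Z-score against high-resolution reference structures, can be illustrated with a minimal Python sketch. This is not the QMEAN implementation: the function names, the use of the sample standard deviation, and the convention that higher scores are more native-like are all assumptions made for illustration, and the reference values in any call are placeholders rather than real data.

```python
from statistics import mean, stdev


def normalize_by_interactions(raw_score: float, n_interactions: int) -> float:
    """Divide a raw score by the number of interactions it was summed over.

    Hypothetical helper: the abstract only states that scores are normalized
    with respect to the number of interactions, not the exact formula.
    """
    if n_interactions <= 0:
        raise ValueError("n_interactions must be positive")
    return raw_score / n_interactions


def z_score(model_score: float, reference_scores: list[float]) -> float:
    """Express a model score in standard deviations from the reference mean.

    reference_scores stands in for normalized scores of high-resolution
    crystal structures. Using the sample standard deviation is an assumption.
    """
    if len(reference_scores) < 2:
        raise ValueError("need at least two reference scores")
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference scores have zero variance")
    return (model_score - mu) / sigma
```

Under the assumed convention, a strongly negative Z-score would indicate a model whose normalized score lies well below what is typical for the reference set of experimental structures.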
