MetaMQAP: A meta-server for the quality assessment of protein models

Background

Computational models of protein structure are usually inaccurate and exhibit significant deviations from the true structure. The utility of models depends on the degree of these deviations. A number of predictive methods have been developed to discriminate between globally incorrect and approximately correct models. However, only a few methods predict the correctness of different parts of computational models. Several Model Quality Assessment Programs (MQAPs) have been developed to detect local inaccuracies in unrefined crystallographic models, but it is not known whether they are useful for computational models, which usually exhibit different and much more severe errors.

Results

The ability to identify local errors in models was tested for eight MQAPs (VERIFY3D, PROSA, BALA, ANOLEA, PROVE, TUNE, REFINER, PROQRES) on 8251 models from the CASP-5 and CASP-6 experiments, by calculating Spearman's rank correlation coefficients between the per-residue scores of these methods and the local deviations between C-alpha atoms in the models vs. the experimental structures. As a reference, we calculated the correlation between the local deviations and trivial features that can be calculated for each residue directly from the models, i.e. solvent accessibility, depth in the structure, and the number of local and non-local neighbours. We found that the absolute correlations between the scores returned by the MQAPs and the local deviations were poor for all methods. In addition, the scores of PROQRES and several other MQAPs correlate strongly with the 'trivial' features. Therefore, we developed MetaMQAP, a meta-predictor based on a multivariate regression model, which uses the scores of the above-mentioned methods, but in which the trivial parameters are controlled. MetaMQAP predicts the absolute deviation (in Ångströms) of individual C-alpha atoms between the model and the unknown true structure, as well as global deviations (expressed as root mean square deviation and GDT_TS scores). Local model accuracy predicted by MetaMQAP shows an impressive correlation coefficient of 0.7 with true deviations from native structures, a significant improvement over all constituent primary MQAP scores. The global MetaMQAP score correlates with model GDT_TS at the level of 0.89.

Conclusion

Finally, we compared our method with the MQAPs that scored best in the 7th edition of CASP, using CASP7 server models (not included in the MetaMQAP training set) as the test data. In our benchmark, MetaMQAP is outperformed only by PCONS6 and method QA_556, both of which require comparison of multiple alternative models and score each of them depending on its similarity to the other models. MetaMQAP is, however, the best among the methods capable of evaluating single models. We implemented MetaMQAP as a web server available for free use by all academic users at the URL https://genesilico.pl/toolkit/
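The per-residue evaluation described in the Results can be illustrated with a minimal sketch of Spearman's rank correlation, written in plain Python. This is not the authors' implementation: the function names, the tie-handling choice (average ranks), and the six-residue example values are all assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with this one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman(scores: Sequence[float], deviations: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(scores) != len(deviations) or len(scores) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(scores), average_ranks(deviations))


if __name__ == "__main__":
    # Hypothetical per-residue quality scores (higher = predicted better)
    # and hypothetical C-alpha deviations in Angstroms for six residues.
    mqap_scores = [0.9, 0.8, 0.4, 0.7, 0.1, 0.3]
    ca_deviations = [0.5, 1.1, 4.2, 1.0, 9.8, 6.0]
    print(round(spearman(mqap_scores, ca_deviations), 3))
```

For a score in which higher values indicate better predicted quality, a useful method should give a strongly negative coefficient against the deviations, which is why the abstract compares absolute correlations. Where SciPy is available, `scipy.stats.spearmanr` computes the same statistic.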
