Accuracy issues involved in modeling in vivo protein structures using PM7

The semiempirical method PM7 was used in an attempt to quantify the error in predicting the in vivo structure of proteins relative to X‐ray structures. Three important contributing factors are the experimental limitations of X‐ray structures, the difference between the crystal and solution environments, and errors in PM7 itself. The geometries of 19 proteins from the Protein Data Bank with small R values, that is, high‐accuracy structures, were optimized, and the resulting drop in heat of formation was calculated. Analysis of the changes showed that about 10% of this decrease was caused by faults in PM7, with the balance attributable to the X‐ray structures and the difference between the crystal and solution environments. Tests to validate the PM7 geometries revealed a previously unknown fault in the method. Clashscores generated by the MolProbity molecular mechanics structure validation program showed that PM7 predicted unrealistically close contacts between nonbonded atoms in regions where the local geometry is dominated by very weak noncovalent interactions. The fault was traced to an underestimation of the core‐core repulsion between atoms at distances shorter than the equilibrium distance.

Proteins 2015; 83:1427–1435. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
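To make the clashscore idea concrete, the following is a minimal sketch and not the MolProbity implementation. MolProbity reports the number of serious steric overlaps (0.4 Å or more) per 1000 atoms, computed with all-atom contact analysis after hydrogens are added. The sketch assumes a simplified stand-in: a pair of atoms counts as a clash when the sum of their van der Waals radii exceeds their separation by at least 0.4 Å. The function name `simple_clashscore`, the radii table, and the `bonded_pairs` argument are hypothetical and chosen only for illustration.

```python
import math

# Approximate Bondi van der Waals radii in angstroms (illustrative subset).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def simple_clashscore(atoms, bonded_pairs=frozenset(), overlap_cutoff=0.4):
    """Return clashes per 1000 atoms under a simplified overlap criterion.

    atoms          -- list of (element, x, y, z) tuples, coordinates in angstroms
    bonded_pairs   -- set of frozenset({i, j}) index pairs to skip (assumption:
                      the caller supplies covalently linked pairs)
    overlap_cutoff -- minimum van der Waals overlap, in angstroms, counted as a clash
    """
    if not atoms:
        return 0.0
    clashes = 0
    for i in range(len(atoms)):
        elem_i, *pos_i = atoms[i]
        for j in range(i + 1, len(atoms)):
            if frozenset((i, j)) in bonded_pairs:
                continue  # bonded atoms are expected to be close
            elem_j, *pos_j = atoms[j]
            distance = math.dist(pos_i, pos_j)
            overlap = VDW_RADII[elem_i] + VDW_RADII[elem_j] - distance
            if overlap >= overlap_cutoff:
                clashes += 1
    return 1000.0 * clashes / len(atoms)


# Two carbons 2.9 angstroms apart overlap by 0.5 angstroms; the oxygen is far away.
example = [("C", 0.0, 0.0, 0.0), ("C", 2.9, 0.0, 0.0), ("O", 10.0, 0.0, 0.0)]
score = simple_clashscore(example)
```

An underestimated core‐core repulsion at short range would let nonbonded atoms approach too closely during optimization, which a contact count of this kind would register as a higher score.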
