Benchmarking and validating algorithms that estimate pKa values of drugs based on their molecular structures

Abstract: This report reviews the current status of computational tools for predicting the pKa values of organic drug-like compounds. The REGDIA regression diagnostics algorithm in S-Plus is introduced to examine the accuracy of pKa predictions made with four updated programs: PALLAS, MARVIN, ACD/pKa and SPARC. Outlying predicted pKa values correspond to molecules that are poorly characterized by the pKa prediction program concerned. Statistical detection of outliers can fail because of masking and swamping effects, so the Williams graph was selected as the most reliable outlier detection tool. Six statistical characteristics (Fexp, $R^2$, $R^2_P$, MEP, AIC, and s(e) in pKa units) were examined for the results obtained when the four selected pKa prediction algorithms were applied to three datasets. The ACD/pKa algorithm gave the highest values of Fexp, $R^2$ and $R^2_P$, the lowest values of MEP and s(e), and the most negative AIC, so this algorithm achieves the best predictive power and the most accurate results. The proposed accuracy test performed by the REGDIA program can also be applied to other predicted values, such as log P, log D, aqueous solubility or certain physicochemical properties of drug molecules.
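The six characteristics and the Williams-graph quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the REGDIA implementation. It assumes a straight-line regression of predicted on experimental pKa (m = 2 parameters) and the textbook definitions: MEP as the mean squared leave-one-out prediction error, $R^2_P = 1 - n \cdot \text{MEP} / \sum (y_i - \bar{y})^2$, and AIC = n ln(RSS/n) + 2m. The function name `regression_diagnostics`, the leverage cutoff 2m/n and the jackknife-residual cutoff `t_cut` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def regression_diagnostics(pka_exp, pka_pred, t_cut=2.0):
    """Fit pka_pred = b0 + b1 * pka_exp and return accuracy statistics.

    t_cut is a hypothetical cutoff for jackknife residuals; a Student t
    quantile with n - m - 1 degrees of freedom could be used instead.
    """
    x = np.asarray(pka_exp, dtype=float)
    y = np.asarray(pka_pred, dtype=float)
    n, m = x.size, 2                      # m = intercept + slope
    X = np.column_stack([np.ones(n), x])  # design matrix
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Leverages h_ii: diagonal of the hat matrix X (X'X)^-1 X'
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)

    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    s_e = np.sqrt(rss / (n - m))                    # residual standard deviation
    mep = float(np.mean((resid / (1.0 - h)) ** 2))  # mean error of prediction
    r2 = 1.0 - rss / tss
    r2_p = 1.0 - n * mep / tss                      # predicted R^2
    aic = n * np.log(rss / n) + 2 * m
    f_exp = (r2 / (m - 1)) / ((1.0 - r2) / (n - m))

    # Standardized, then jackknife (externally studentized) residuals
    e_s = resid / (s_e * np.sqrt(1.0 - h))
    e_j = e_s * np.sqrt((n - m - 1) / (n - m - e_s ** 2))

    return {
        "F_exp": f_exp, "R2": r2, "R2_P": r2_p,
        "MEP": mep, "AIC": aic, "s_e": s_e,
        "leverage": h, "jackknife": e_j,
        # Williams graph: |jackknife residual| plotted against leverage
        "outliers": np.flatnonzero(np.abs(e_j) > t_cut),
        "high_leverage": np.flatnonzero(h > 2 * m / n),
    }

# Synthetic illustration (not data from the paper): 20 compounds,
# one of which is predicted 3 pKa units too high.
pka_exp = np.linspace(2.0, 11.0, 20)
pka_pred = pka_exp + 0.1 * np.sin(np.arange(20))
pka_pred[7] += 3.0
stats = regression_diagnostics(pka_exp, pka_pred)
print(stats["outliers"], round(stats["s_e"], 3))
```

Running the same function on the output of each prediction program against one set of experimental values would give the side-by-side comparison the abstract describes: higher Fexp, $R^2$ and $R^2_P$, together with lower MEP, s(e) and AIC, indicate the more accurate predictor.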
