Estimation of pKa for Druglike Compounds Using Semiempirical and Information-Based Descriptors

A pragmatic approach has been developed for the estimation of aqueous ionization constants (pKa) for druglike compounds. The method involves an algorithm that assigns ionization constants in a stepwise manner to the acidic and basic groups present in a compound. Predictions are made for each ionizable group using models derived from semiempirical quantum chemical properties and information-based descriptors. Semiempirical properties include the partial charge and electrophilic superdelocalizability of the atom(s) undergoing protonation or deprotonation. Importantly, the latter property has been extended to allow predictions to be made for multiprotic compounds, overcoming limitations of a previous approach described by Tehan et al. The information-based descriptors include molecular-tree structured fingerprints, based on the methodology outlined by Xing et al., with the addition of 2D substructure flags indicating the presence of other important structural features. These two classes of descriptor were found to complement one another particularly well, resulting in predictive models for a range of functional groups (including alcohols, amidines, amines, anilines, carboxylic acids, guanidines, imidazoles, imines, phenols, pyridines, and pyrimidines). Combined RMSE values of 0.48 and 0.81 were obtained for the training set and an external test set of compounds, respectively. The predictive models were based on compounds selected from the commercially available BioLoom database. The resultant speed and accuracy of the approach have also enabled the development of a Web application on the Novartis intranet for pKa prediction.
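The RMSE figures quoted in the abstract are root-mean-square errors between experimental and predicted pKa values, that is, the square root of the mean squared difference. As a minimal sketch of how such a figure is computed, the Python below implements the standard formula. The function name `rmse` and the sample pKa values are illustrative assumptions and are not taken from the paper or its data set.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences of pKa values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared residuals, then the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Made-up values for illustration only (not from the BioLoom-derived sets).
    observed_pka = [4.0, 9.0, 10.0]
    predicted_pka = [4.5, 8.5, 10.0]
    print(f"RMSE = {rmse(observed_pka, predicted_pka):.3f}")
```

Computing the statistic separately for the training set and for a held-out external set, as the abstract reports, shows how much accuracy is lost on compounds the models have not seen.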

[1] Robert C. Glen, et al. Predicting pKa by Molecular Tree Structured Fingerprints and PLS. J. Chem. Inf. Comput. Sci., 2003.

[2] Paul K. Smith, et al. Thermodynamic Properties of Solutions of Amino Acids and Related Substances VII. The Ionization of Some Hydroxyamino Acids and Proline in Aqueous Solution from One to Fifty Degrees. 1942.

[3] Richard O. Roblin, et al. Studies in Chemotherapy. VII. A Theory of the Relation of Structure to Activity of Sulfanilamide Type Compounds. 1942.

[4] W. Bremser. HOSE - a novel substructure code. 1978.

[5] Wolf-Dietrich Ihlenfeldt, et al. Computation and management of chemical properties in CACTVS: An extensible networked approach toward modularity and compatibility. J. Chem. Inf. Comput. Sci., 1994.

[6] Kenichi Fukui, et al. Theory of Substitution in Conjugated Molecules. 1954.

[7] Gerd Folkers, et al. Pharmacokinetic Profiling in Drug Research. 2006.

[8] John Comer, et al. High-throughput measurement of pKa values in a mixed-buffer linear pH gradient system. Analytical Chemistry, 2003.

[9] Emanuela Gancia, et al. Estimation of pKa Using Semiempirical Molecular Orbital Methods. Part 1: Application to Phenols and Carboxylic Acids. 2002.

[10] Robert C. Glen, et al. Novel Methods for the Prediction of logP, pKa, and logD. J. Chem. Inf. Comput. Sci., 2002.

[11] Lionel A. Carreira, et al. Hydration Equilibrium Constants of Aldehydes, Ketones and Quinazolines. 2005.

[12] Fred Basolo, et al. Steric Effects and the Stability of Complex Compounds. III. The Chelating Tendencies of N-Alkylglycines and N-Dialkylglycines with Copper(II) and Nickel(II) Ions. 1954.

[13] John Comer, et al. Lipophilicity Profiles: Theory and Measurement. 2007.

[14] O. E. Schultz, et al. [Relationships between structure and laxative action of triarylmethane derivates]. Arzneimittel-Forschung, 1974.

[15] Eamonn F. Healy, et al. Development and use of quantum mechanical molecular models. 76. AM1: a new general purpose quantum mechanical molecular model. 1985.

[16] Emanuela Gancia, et al. Estimation of pKa Using Semiempirical Molecular Orbital Methods. Part 2: Application to Amines, Anilines and Various Nitrogen Containing Heterocyclic Compounds. 2002.

[17] Bernard Testa, et al. Structure-Lipophilicity Relationships of Zwitterionic Amino Acids. 1992.

[18] J. T. Edward, et al. 385. Hydrolysis of amides and related compounds. Part III. Methyl benzimidate in aqueous acids. 1957.

[19] J. Murray, et al. Comparison of quantum chemical parameters and Hammett constants in correlating pK(a) values of substituted anilines. The Journal of Organic Chemistry, 2001.

[20] R. B. Barlow, et al. Effects of some isomers and analogues of nicotine on junctional transmission. British Journal of Pharmacology and Chemotherapy, 1962.

[22] Adrien Albert, et al. 605. Ionization constants of heterocyclic substances. Part V. Mercapto-derivatives of diazines and benzodiazines. 1962.

[23] W. L. F. Armarego, et al. 813. Ionization and ultraviolet spectra of indolizines. 1964.

[24] R. Cramer, et al. Validation of the general purpose Tripos 5.2 force field. 1989.

[25] J. Gasteiger, et al. From Atoms and Bonds to Three-Dimensional Atomic Coordinates: Automatic Model Builders. 1993.

[26] David Weininger, et al. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci., 1988.

[27] Paul M. Selzer, et al. Web-based cheminformatics tools deployed via corporate Intranets. 2004.

[28] C. O. D. Silva, et al. Ab Initio Calculations of Absolute pKa Values in Aqueous Solution I. Carboxylic Acids. 1999.

[29] Alexander Gero, et al. Regularities in the Basicity of Some Tertiary Ethylenediamines, Trimethylenediamines and 2-Hydroxytrimethylenediamines. 1954.