Predicting the pKa of Small Molecules

The biopharmaceutical profile of a compound depends directly on the dissociation constants of its acidic and basic groups, commonly expressed as pKa, the negative decadic logarithm of the acid dissociation constant (Ka). We survey the literature on computational methods for predicting the pKa of small molecules. We address data availability (data sets used, data quality, proprietary versus public data), molecular representations (quantum mechanics, descriptors, structured representations), prediction methods (approaches, implementations), and pKa-specific issues such as mono- and multiprotic compounds. We discuss advantages, problems, recent progress, and challenges in the field.
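
To make the definition concrete, the following minimal sketch converts between Ka and pKa and estimates the deprotonated fraction of a monoprotic acid at a given pH. It is an illustration only, not a method from the surveyed literature: the function names are hypothetical, the Ka value in the usage lines is an arbitrary example rather than a measured constant, and the fraction formula assumes an ideal dilute solution in which activities equal concentrations.

```python
import math


def pka_from_ka(ka: float) -> float:
    """Return pKa = -log10(Ka) for a positive dissociation constant."""
    if ka <= 0:
        raise ValueError("Ka must be positive")
    return -math.log10(ka)


def ka_from_pka(pka: float) -> float:
    """Invert the definition: Ka = 10 ** (-pKa)."""
    return 10.0 ** (-pka)


def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid HA present as A- at the given pH.

    Assumption: ideal behaviour (activity coefficients of 1), so that
    [A-] / ([HA] + [A-]) = 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Arbitrary illustrative value, not an experimental constant.
example_ka = 1e-5
example_pka = pka_from_ka(example_ka)
half_point = fraction_deprotonated(example_pka, ph=example_pka)
```

At a pH equal to the pKa, half of the compound is deprotonated, which is why small errors in a predicted pKa near physiological pH can noticeably shift the estimated charge state. Multiprotic compounds need a treatment of coupled equilibria that this sketch does not cover.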
