In Silico Prediction of Aqueous Solubility Using Simple QSPR Models: The Importance of Phenol and Phenol-like Moieties

The authors recently published a robust QSPR model of aqueous solubility that combines the computationally derived molecular descriptor topographical polar surface area (TPSA) with experimentally determined melting point and logP. This model (the "TPSA model") predicts the aqueous solubility of 87% of the compounds in a chemically diverse data set of 1265 molecules to within ±1 log unit. This is comparable to the results achieved by established models of aqueous solubility, e.g., ESOL (79%) and the General Solubility Equation (81%). Hierarchical clustering of the data set by chemical similarity shows that a significant number of molecules with phenolic and/or phenol-like moieties are poorly predicted by these equations. Modifying the TPSA model to include one additional descriptor, a simple count of phenol and phenol-like moieties, improves the predictive ability within ±1 log unit to 89% for the full data set (1265 compounds, -8.48 < logS < 1.58) and 82% for a reduced data set (1160 compounds, -6.00 < logS < 0.00) that excludes compounds at the sparsely populated extremities of the data range. This improvement can be rationalized by viewing the additional descriptor as a correction factor that accounts for the effect of phenolic substituents on the electronic characteristics of aromatic molecules, i.e., the generally positive contribution that phenolic moieties make to aqueous solubility.
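As a rough illustration of the quantities discussed above, the following Python sketch shows the General Solubility Equation in its commonly quoted form (logS = 0.5 - 0.01(MP - 25) - logP, with the melting point term set to zero for compounds melting below 25 °C), a generic linear model over the four descriptors, and the "within ±1 log unit" success metric. The function names, the linear functional form, and every numeric value in the usage note are assumptions made for illustration only; the fitted coefficients of the TPSA model are not given in this abstract and are not reproduced here.

```python
def gse_log_s(melting_point_c, log_p):
    """General Solubility Equation in its commonly quoted form.

    logS = 0.5 - 0.01 * (MP - 25) - logP, with MP in degrees Celsius.
    The melting point term is taken as zero for compounds melting below 25 C.
    """
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


def linear_log_s(descriptors, coefficients, intercept):
    """Generic linear QSPR model: intercept + sum(coefficient * descriptor).

    ASSUMPTION: a plain linear form over named descriptors (for example
    "mp", "logp", "tpsa", "n_phenol"). The coefficients must be supplied by
    the caller; none of the published values are hard-coded here.
    """
    return intercept + sum(coefficients[name] * value
                           for name, value in descriptors.items())


def fraction_within(observed, predicted, tolerance=1.0):
    """Fraction of compounds whose predicted logS lies within +/- tolerance
    log units of the observed logS."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    hits = sum(1 for obs, pred in zip(observed, predicted)
               if abs(obs - pred) <= tolerance)
    return hits / len(observed)
```

For example, `gse_log_s(125.0, 2.0)` evaluates the equation for a hypothetical compound melting at 125 °C with logP of 2.0, and `fraction_within(observed, predicted)` returns the proportion of a data set predicted to within one log unit, which is the figure of merit quoted for each model above. Passing a phenol count as an extra entry in `descriptors` mirrors the idea of the additional descriptor acting as a correction term.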
