Development of Reliable Aqueous Solubility Models and Their Application in Druglike Analysis

In this work, two reliable aqueous solubility models, ASMS (aqueous solubility based on molecular surface) and ASMS-LOGP (aqueous solubility based on molecular surface using ClogP as a descriptor), were constructed from atom-type-classified solvent-accessible surface areas and several molecular descriptors, using a diverse data set of 1708 molecules. For ASMS (which does not use ClogP as a descriptor), the leave-one-out q² and root-mean-square error (RMSE) were 0.872 and 0.748 log units, respectively. ASMS-LOGP performed slightly better (q² = 0.886, RMSE = 0.705 log units). Both models were extensively validated by three cross-validation tests, and encouraging predictability was achieved. High-throughput aqueous solubility prediction was then conducted for a number of data sets extracted from several widely used databases. We found that real drugs are about 20-fold more soluble than the so-called druglike molecules in the ZINC database, none of which violate Lipinski's "Rule of 5". Specifically, oral drugs are about 16-fold more soluble, while injection drugs are 50- to 60-fold more soluble. If the threshold for a molecule to be considered soluble is set to -5 log units, about 85% of real drugs are predicted to be soluble; in contrast, only 50% of druglike molecules in ZINC are. We conclude that the two models could serve as a rule in druglike analysis and as an efficient filter for prioritizing compound libraries prior to high-throughput screening (HTS).
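The statistics quoted above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the regression form, so an ordinary least-squares fit is assumed here, q² is assumed to follow the common definition 1 - PRESS/SS_total, and "soluble" is assumed to mean a predicted log solubility at or above the cutoff. The function names (`loo_q2_rmse`, `fold_difference`, `fraction_soluble`) are hypothetical.

```python
import numpy as np


def loo_q2_rmse(X, y):
    """Leave-one-out q^2 and RMSE for a least-squares model (assumed form).

    X: (n_samples, n_descriptors) descriptor matrix.
    y: (n_samples,) observed log solubilities.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # hold out molecule i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef  # predict the held-out molecule
    press = np.sum((y - preds) ** 2)  # predictive residual sum of squares
    ss_total = np.sum((y - y.mean()) ** 2)
    q2 = 1.0 - press / ss_total  # assumed definition of q^2
    rmse = np.sqrt(press / n)
    return q2, rmse


def fold_difference(delta_log_s):
    """Convert a difference in log10 solubility into a fold change."""
    return 10.0 ** delta_log_s


def fraction_soluble(log_s, cutoff=-5.0):
    """Fraction of molecules with log solubility at or above the cutoff."""
    return float(np.mean(np.asarray(log_s, dtype=float) >= cutoff))
```

On this log scale, the reported 20-fold gap between real drugs and ZINC druglike molecules corresponds to a difference of log10(20), roughly 1.3 log units, which is larger than either model's RMSE.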
