Is experimental data quality the limiting factor in predicting the aqueous solubility of druglike molecules?

We report the results of testing quantitative structure-property relationship (QSPR) models that were trained on the same druglike molecules but on two different sets of solubility data: (i) data extracted from several sources in the published literature, for which the experimental uncertainty is estimated to be 0.6-0.7 log S units (S in mol/L); (ii) data measured by a single accurate experimental method (CheqSol), for which the experimental uncertainty is typically <0.05 log S units. Contrary to what might be expected, the models derived from the CheqSol experimental data are not more accurate than those derived from the "noisy" literature data. The results suggest that, at present, it is the deficiency of QSPR methods (algorithms and/or descriptor sets), and not, as is commonly claimed, the uncertainty in the experimental measurements, that is the limiting factor in accurately predicting the aqueous solubility of pharmaceutical molecules.
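The argument can be illustrated with a toy simulation. This is a minimal sketch on synthetic data, not the models or datasets used in the study. It assumes a one-descriptor linear model and a hypothetical "hidden" contribution to log S that the model cannot capture; only the two label-noise levels (about 0.65 and 0.05 log S units) are taken from the text, and every other number is invented for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def simulate(label_noise_sd, n_train=2000, n_test=2000, seed=0):
    """Train on labels with the given noise; test against noise-free values."""
    rng = random.Random(seed)

    def molecule():
        x = rng.gauss(0, 1)       # descriptor available to the model (assumed)
        hidden = rng.gauss(0, 1)  # effect the model cannot describe (assumed)
        return x, -3.0 - 1.0 * x + 0.9 * hidden  # synthetic "true" log S

    train = [molecule() for _ in range(n_train)]
    test = [molecule() for _ in range(n_test)]

    xs = [x for x, _ in train]
    # Experimental uncertainty enters only through the training labels.
    ys = [s + rng.gauss(0, label_noise_sd) for _, s in train]
    a, b = fit_line(xs, ys)

    return rmse([a + b * x for x, _ in test], [s for _, s in test])


rmse_clean = simulate(0.05)  # CheqSol-like label noise
rmse_noisy = simulate(0.65)  # literature-like label noise
print(f"test RMSE, low-noise labels:  {rmse_clean:.3f}")
print(f"test RMSE, high-noise labels: {rmse_noisy:.3f}")
```

Under these assumptions the two test errors should come out close to each other, because the error from the missing "hidden" term dominates and the random label noise largely averages out during fitting. The sketch only shows that such an outcome is plausible when the model, rather than the data, is the bottleneck; it does not reproduce the reported results.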
