Mechanistically transparent models for predicting aqueous solubility of rigid, slightly flexible, and very flexible drugs (MW<2000): Accuracy near that of random forest regression

Alex Avdeef

Yalkowsky’s General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but it is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies in which we sought a better prediction model by comparing the performance of the GSE with that of Abraham’s Solvation Equation (ABSOLV) and of the Random Forest regression (RFR) machine-learning (ML) method. Large, well-curated databases of aqueous intrinsic solubility are available. However, drugs may be sparsely distributed in chemical space and concentrated in clusters, so even a large database might leave some regions uncovered. Test compounds from under-represented regions may be poorly predicted, as might be the case with the ‘loose’ set of 32 drugs in the Second Solubility Challenge (2020). There still appears to be a need for better coverage of drug space. Solubility predictions increasingly rely on calculated input descriptors, which may be an advantage when exploring the properties of molecules yet to be synthesized. The risk is that such predictions may rest on accumulated uncertainty. ML/AI methods, which are increasingly used, can give accurate predictions, but such predictions may not readily suggest which strategies to pursue in selecting yet-to-be-synthesized compounds. Based on our latest findings, we recommend predictions based on both the ‘grouped’ ABSOLV(GRP) and the ‘Flexible Acceptor’ GSE(Φ,B) models with the provided best-fit parameters, where Φ is the Kier molecular flexibility index and B is the Abraham H-bond acceptor strength. For molecules with Φ < 11, the prudent choice is the Consensus Model, the average of ABSOLV(GRP) and GSE(Φ,B). For more flexible molecules, GSE(Φ,B) is recommended.
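The selection rule stated in the abstract can be illustrated with a minimal Python sketch. It assumes that the ABSOLV(GRP) and GSE(Φ,B) predictions have already been computed elsewhere, since their best-fit parameters are not given here. The function names, argument names, and the handling of liquids in the melting-point term are illustrative assumptions, not part of the review. For reference, the sketch also includes the classic GSE in its commonly cited form, log S0 = 0.5 − 0.01 (mp − 25) − log P, with S0 in mol/L and mp in °C.

```python
from typing import Optional

# Flexibility threshold quoted in the abstract (Kier index, Phi < 11).
PHI_THRESHOLD = 11.0


def gse_log_s0(log_p: float, mp_celsius: float) -> float:
    """Classic GSE with its three fixed constants (0.5, -0.01, -1).

    Assumption: for compounds that melt below 25 C, the melting-point
    term is set to zero, a common convention for liquids.
    """
    mp_term = max(mp_celsius - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


def recommended_log_s0(
    phi: float,
    log_s0_gse_phi_b: float,
    log_s0_absolv_grp: Optional[float] = None,
) -> float:
    """Pick the prediction recommended in the abstract.

    phi               -- Kier molecular flexibility index
    log_s0_gse_phi_b  -- GSE(Phi,B) prediction (hypothetical input,
                         computed elsewhere with the published parameters)
    log_s0_absolv_grp -- ABSOLV(GRP) prediction (hypothetical input)
    """
    if phi < PHI_THRESHOLD:
        if log_s0_absolv_grp is None:
            raise ValueError("ABSOLV(GRP) value is needed when Phi < 11")
        # Consensus Model: plain average of the two predictions.
        return 0.5 * (log_s0_absolv_grp + log_s0_gse_phi_b)
    # More flexible molecules: use GSE(Phi,B) alone.
    return log_s0_gse_phi_b
```

For example, a molecule with Φ = 5 and predictions of −4.0 (GSE(Φ,B)) and −5.0 (ABSOLV(GRP)) would be assigned the consensus value, whereas a molecule with Φ = 15 would keep the GSE(Φ,B) value alone.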
