Prediction of aqueous solubility of drug-like molecules using a novel algorithm for automatic adjustment of relative importance of descriptors implemented in counter-propagation artificial neural networks.

In this work, we present a novel approach to developing models for the prediction of aqueous solubility, based on an algorithm for the automatic adjustment of the descriptors' relative importance (AARI) implemented in counter-propagation artificial neural networks (CPANN). This approach significantly improved the interpretability of models based on artificial neural networks, which are traditionally regarded as "black box" models. The model was developed on a data set of 374 diverse drug-like molecules, divided into training (n=280) and test (n=94) sets using self-organizing maps. The heuristic method was applied to preselect a small number of the most significant descriptors to serve as inputs for CPANN training. The performance of the final model, based on 7 descriptors, was satisfactory for both the training set (RMSEP(train)=0.668) and the test set (RMSEP(test)=0.679). The model was found to be highly interpretable in terms of solubility and to rationalize structural features that could affect the solubility of the compounds investigated. The proposed approach can therefore significantly enhance model usability by guiding structural modifications of compounds aimed at improving solubility in the early phase of drug discovery.
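To make the workflow above more concrete, the following is a minimal sketch of a counter-propagation network in NumPy, with a per-descriptor importance vector that scales each descriptor's contribution to the distance used for winner selection. It is an illustration under stated assumptions, not the authors' implementation: the function names (`train_cpann`, `predict_cpann`, `rmse`), the grid size, the learning-rate schedule, and the triangular neighbourhood are all hypothetical choices, and the importance weights are fixed by hand here. The abstract does not describe how AARI adjusts them, so that step is not reproduced.

```python
import numpy as np

def train_cpann(X, y, grid=(5, 5), epochs=50, lr_start=0.5, lr_end=0.01,
                importance=None, seed=0):
    """Train a small counter-propagation network (Kohonen layer + output layer).

    `importance` is a fixed per-descriptor weight vector (an assumption of this
    sketch; the AARI algorithm adjusts such weights automatically).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    rows, cols = grid
    # Grid coordinates of every neuron, used for the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    imp = np.ones(n_features) if importance is None else np.asarray(importance, dtype=float)
    # Kohonen (input) weights start at random points inside the data range.
    kohonen = rng.uniform(X.min(axis=0), X.max(axis=0), size=(rows * cols, n_features))
    # Output weights start at the mean response.
    output = np.full(rows * cols, float(np.mean(y)))
    max_radius = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / max(epochs - 1, 1)
        lr = lr_start + (lr_end - lr_start) * frac   # linearly decaying learning rate
        radius = max_radius * (1.0 - frac)           # linearly shrinking neighbourhood
        for i in rng.permutation(n_samples):
            # Importance-weighted squared Euclidean distance to every neuron.
            d = ((kohonen - X[i]) ** 2 * imp).sum(axis=1)
            winner = np.argmin(d)
            # Triangular neighbourhood around the winner on the grid.
            grid_dist = np.abs(coords - coords[winner]).max(axis=1)
            h = np.clip(1.0 - grid_dist / (radius + 1.0), 0.0, None)
            # Move both layers toward the current sample and its response.
            kohonen += lr * h[:, None] * (X[i] - kohonen)
            output += lr * h * (y[i] - output)
    return kohonen, output, imp

def predict_cpann(model, X):
    """Predict by reading the output weight of each sample's winning neuron."""
    kohonen, output, imp = model
    d = (((X[:, None, :] - kohonen[None, :, :]) ** 2) * imp).sum(axis=2)
    return output[d.argmin(axis=1)]

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-in data (not the solubility set from the paper).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(0.0, 1.0, size=(200, 2))
y_demo = X_demo[:, 0]                      # the response depends on descriptor 0 only
demo_model = train_cpann(X_demo, y_demo, importance=[1.0, 1.0])
print("training RMSE:", rmse(y_demo, predict_cpann(demo_model, X_demo)))
```

Passing a different `importance` vector changes which descriptors dominate the mapping onto the Kohonen layer, which is the quantity AARI is described as tuning. The same `rmse` function applied to held-out data gives a test-set error in the sense of the RMSEP values reported above.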
