Predicting human intestinal absorption of diverse chemicals using ensemble learning based QSAR modeling approaches

Human intestinal absorption (HIA) is an important criterion when selecting orally administered drug candidates, and computational prediction of HIA can speed up the screening of new drugs. In this study, ensemble learning (EL) based qualitative and quantitative structure-activity relationship (SAR) models (gradient boosted tree, GBT, and bagged decision tree, BDT) were established for the binary classification of chemicals and the prediction of their HIA values, using selected molecular descriptors. The structural diversity of the chemicals and the nonlinear structure in the data were tested using the similarity index and the Brock-Dechert-Scheinkman statistic, respectively. The predictive power of the developed SAR models was evaluated through the internal and external validation procedures recommended in the literature. All statistical criteria derived for the constructed SAR models were above their respective thresholds, suggesting that the models are robust enough for future applications. On the complete data, the qualitative SAR models achieved a classification accuracy of >99%, while the quantitative SAR models yielded a correlation (R²) of >0.91 between the measured and predicted HIA values. The EL-based SAR models were also compared with linear models (linear discriminant analysis, LDA, and multiple linear regression, MLR); the GBT and BDT models performed better than both. A comparison with previously reported QSARs for HIA prediction also suggested better performance for our models. The results suggest that the developed SAR models can reliably predict the HIA of structurally diverse chemicals and can serve as useful tools for the initial screening of molecules in the drug development process.
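To make the bagging idea behind a BDT-style model concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single hypothetical descriptor and synthetic HIA values, uses one-split regression trees (stumps) in place of full decision trees, and scores the fit with R² between measured and predicted values. All names (`fit_stump`, `fit_bagged`, `r_squared`) and all data are illustrative assumptions.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) by minimizing squared error."""
    values = sorted(set(xs))
    best = None  # (sse, threshold, left_mean, right_mean)
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2  # candidate split halfway between neighbours
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # sample contained a single distinct x value
        m = mean(ys)
        return lambda x: m
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def fit_bagged(xs, ys, n_models=50, seed=0):
    """Bagging: fit each stump on a bootstrap resample, average predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)


def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    m = mean(measured)
    ss_res = sum((a - b) ** 2 for a, b in zip(measured, predicted))
    ss_tot = sum((a - m) ** 2 for a in measured)
    return 1 - ss_res / ss_tot


# Synthetic data (assumption): one hypothetical descriptor, HIA in percent.
descriptor = [float(i) for i in range(20)]
hia = [90.0 if x < 10 else 10.0 for x in descriptor]

model = fit_bagged(descriptor, hia)
predicted = [model(x) for x in descriptor]
r2 = r_squared(hia, predicted)
print(f"R^2 on the toy training data: {r2:.3f}")
```

The score here is computed on the training data only, so it says nothing about external predictivity; a study such as this one would additionally evaluate held-out compounds.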
