Simultaneous Prediction of four ATP‐binding Cassette Transporters’ Substrates Using Multi‐label QSAR

Efflux by ATP‐binding cassette (ABC) transporters affects the pharmacokinetic profile of drugs; it has been implicated in drug‐drug interactions and plays a major role in multi‐drug resistance in cancer. It is therefore important for the pharmaceutical industry to understand what governs ABC substrate recognition. Given the high degree of substrate overlap among members of the ABC transporter family, it is advantageous to employ a multi‐label classification approach, in which predictions made for one transporter can inform the modelling of the others. Here, we present decision tree‐based QSAR classification models that simultaneously predict substrates and non‐substrates for BCRP1, P‐gp/MDR1, MRP1 and MRP2, using a dataset of 1493 compounds. To this end, two multi‐label classification QSAR modelling approaches were adopted: Binary Relevance (BR) and Classifier Chain (CC). Although both multi‐label models yielded similar overall accuracies (close to 70 %), the CC model overcame the skew in performance towards identifying substrates rather than non‐substrates, which is a common problem in the literature. The models were thoroughly validated using external testing, applicability domain analysis and activity cliff characterization. In conclusion, a multi‐label classification approach is an appropriate alternative for predicting ABC efflux.
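The difference between the two multi‐label strategies named above can be illustrated with a minimal sketch. This is not the authors' implementation: the depth‐1 decision stump is a stand‐in for a full decision tree, the single descriptor and the four‐compound dataset are invented toy values, and the label order (BCRP1, P‐gp, MRP1, MRP2) is an assumption chosen for illustration. Binary Relevance fits one independent classifier per transporter, whereas a Classifier Chain appends the labels of the preceding transporters to the descriptors before fitting the next classifier.

```python
from typing import List, Sequence


class Stump:
    """Depth-1 decision tree (illustrative stand-in for a full decision tree)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "Stump":
        best = None
        for j in range(len(X[0])):                    # candidate feature
            for t in sorted({row[j] for row in X}):   # candidate threshold
                for hi in (0, 1):                     # class predicted above t
                    errors = sum(
                        (hi if row[j] > t else 1 - hi) != label
                        for row, label in zip(X, y)
                    )
                    if best is None or errors < best[0]:
                        best = (errors, j, t, hi)
        _, self.j, self.t, self.hi = best
        return self

    def predict_one(self, row: Sequence[float]) -> int:
        return self.hi if row[self.j] > self.t else 1 - self.hi


def fit_binary_relevance(X, Y) -> List[Stump]:
    # One independent classifier per label; label correlations are ignored.
    return [Stump().fit(X, [y[k] for y in Y]) for k in range(len(Y[0]))]


def predict_binary_relevance(models: List[Stump], row) -> List[int]:
    return [m.predict_one(row) for m in models]


def fit_classifier_chain(X, Y) -> List[Stump]:
    # Classifier k is trained on the descriptors plus the true labels 0..k-1.
    models = []
    for k in range(len(Y[0])):
        X_aug = [list(row) + list(y[:k]) for row, y in zip(X, Y)]
        models.append(Stump().fit(X_aug, [y[k] for y in Y]))
    return models


def predict_classifier_chain(models: List[Stump], row) -> List[int]:
    # At prediction time, earlier predictions are fed forward along the chain.
    preds: List[int] = []
    for m in models:
        preds.append(m.predict_one(list(row) + preds))
    return preds


# Toy data (invented): one descriptor per compound, four substrate labels
# in the assumed order BCRP1, P-gp, MRP1, MRP2.
X = [[0.1], [0.2], [0.8], [0.9]]
Y = [[0, 0, 0, 1], [0, 0, 0, 1], [1, 1, 1, 0], [1, 1, 1, 0]]

br_models = fit_binary_relevance(X, Y)
cc_models = fit_classifier_chain(X, Y)

print(predict_binary_relevance(br_models, [0.85]))
print(predict_classifier_chain(cc_models, [0.85]))
```

On this trivially separable toy set both strategies agree; they can diverge when a later label is better explained by earlier labels than by the descriptors. In a chain, the label order is a modelling choice, and errors made early can propagate to later classifiers.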
