Development of QSAR machine learning-based models to forecast the effect of substances on malignant melanoma cells

SK-MEL-5 is a human melanoma cell line that has been used in various in vitro studies to explore new therapies against melanoma. In this study we report on the development of quantitative structure-activity relationship (QSAR) models able to predict the cytotoxic effect of diverse chemical compounds on this cancer cell line. The dataset of cytotoxic and inactive compounds was downloaded from the PubChem database. It contains the data for all chemical compounds for which cytotoxicity results expressed as GI50 were recorded. In total, 13 blocks of molecular descriptors were computed and used, after appropriate pre-processing, in building QSAR models with four machine learning classifiers: random forest (RF), gradient boosting, support vector machine, and random k-nearest neighbors. Among the 186 models reported, none had a positive predictive value (PPV) higher than 0.90 in both nested cross-validation and testing on an external dataset, but 7 models had a PPV higher than 0.85 in both evaluations. All seven used the RF algorithm as the classifier, with topological descriptors, information indices, 2D-autocorrelation descriptors, P-VSA-like descriptors, and edge-adjacency descriptors as the feature sets. The y-scrambling test was associated with considerably worse performance, confirming the non-random character of the models, and the applicability domain was assessed through three different methods.
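The two evaluation ideas named above, PPV and y-scrambling, can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic one-descriptor dataset, the toy nearest-centroid classifier (a stand-in for RF or the other learners), and the function names `ppv`, `centroid_predict`, and `y_scrambling_demo` are all assumptions made for illustration. The sketch also makes one specific choice that the abstract does not spell out: the model is refit on permuted training labels and scored against the untouched held-out labels.

```python
import math
import random
from statistics import mean


def ppv(y_true, y_pred, positive=1):
    """Positive predictive value (precision): TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    if tp + fp == 0:
        return float("nan")  # undefined when nothing is predicted positive
    return tp / (tp + fp)


def centroid_predict(x_train, y_train, x_test):
    """Toy 1-D nearest-centroid classifier (hypothetical stand-in for RF etc.)."""
    c1 = mean(x for x, y in zip(x_train, y_train) if y == 1)
    c0 = mean(x for x, y in zip(x_train, y_train) if y == 0)
    return [1 if abs(x - c1) < abs(x - c0) else 0 for x in x_test]


def y_scrambling_demo(seed=0, n_scrambles=100):
    """Return (PPV with real labels, mean PPV over label-permuted refits)."""
    rng = random.Random(seed)
    # Synthetic data (assumption): 1 = "active" with descriptor near +3,
    # 0 = "inactive" with descriptor near -3.
    data = [(rng.gauss(3.0, 1.0), 1) for _ in range(100)]
    data += [(rng.gauss(-3.0, 1.0), 0) for _ in range(100)]
    rng.shuffle(data)
    x = [d[0] for d in data]
    y = [d[1] for d in data]

    n_train = 150  # simple hold-out split, not nested cross-validation
    x_tr, y_tr = x[:n_train], y[:n_train]
    x_te, y_te = x[n_train:], y[n_train:]

    real = ppv(y_te, centroid_predict(x_tr, y_tr, x_te))

    scrambled = []
    for _ in range(n_scrambles):
        y_perm = y_tr[:]       # copy, then break the descriptor-label link
        rng.shuffle(y_perm)
        score = ppv(y_te, centroid_predict(x_tr, y_perm, x_te))
        if not math.isnan(score):
            scrambled.append(score)
    return real, mean(scrambled)


if __name__ == "__main__":
    real, scrambled_mean = y_scrambling_demo()
    print(f"PPV with real labels:      {real:.2f}")
    print(f"mean PPV after scrambling: {scrambled_mean:.2f}")
```

If the model has learned a genuine descriptor-activity relationship, the PPV obtained with real labels should be clearly higher than the average PPV of the scrambled refits. A single hold-out split is used here only for brevity; the study itself evaluated models with nested cross-validation and an external test set.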
