High-Dimensional Descriptor Selection and Computational QSAR Modeling for Antitumor Activity of ARC-111 Analogues Based on Support Vector Regression (SVR)

To design ARC-111 analogues with improved efficiency, we constructed quantitative structure-activity relationship (QSAR) models for 22 ARC-111 analogues tested against RPMI8402 tumor cells. First, a support vector regression (SVR) model built on the literature descriptors and optimized with the worst descriptor elimination multi-roundly (WDEM) method generalized to the test set about as well as the artificial neural network (ANN) model. Second, sets of seven and 11 more effective descriptors were selected from 2,923 features by the high-dimensional descriptor selection nonlinearly (HDSN) and WDEM methods, and the SVR models built on these descriptors (SVR3 and SVR4) gave better evaluation measures and more precise predictions for the test set. An interpretability system for the better-performing SVR models was then established. Our analysis offers some useful parameters for designing ARC-111 analogues with enhanced antitumor activity.
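The abstract does not spell out how WDEM works, so the following is only a minimal sketch of one plausible reading: greedy backward elimination around an SVR, where each round drops the descriptor whose removal most reduces the leave-one-out error. The function names (`loo_mse`, `eliminate_worst_descriptors`), the SVR hyperparameters, and the synthetic data are all assumptions made for illustration; none of them come from the paper, and scikit-learn stands in for whatever SVR implementation the authors used.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def loo_mse(X, y, columns):
    """Leave-one-out mean squared error of an RBF SVR on the given columns."""
    # Hyperparameters are illustrative assumptions, not values from the paper.
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
    scores = cross_val_score(
        model, X[:, columns], y, cv=LeaveOneOut(), scoring="neg_mean_squared_error"
    )
    return -scores.mean()


def eliminate_worst_descriptors(X, y):
    """Greedy backward elimination (hypothetical stand-in for WDEM).

    Each round removes the descriptor whose removal lowers the
    leave-one-out MSE the most; stops when no removal helps.
    """
    kept = list(range(X.shape[1]))
    best = loo_mse(X, y, kept)
    while len(kept) > 1:
        trials = [(loo_mse(X, y, [c for c in kept if c != d]), d) for d in kept]
        trial_mse, worst = min(trials)
        if trial_mse >= best:
            break  # no single removal improves the error any further
        kept.remove(worst)
        best = trial_mse
    return kept, best


# Synthetic stand-in data: 22 "compounds", 6 "descriptors", 2 of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(22, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.normal(size=22)

kept, mse = eliminate_worst_descriptors(X, y)
print("kept descriptor indices:", kept)
print("leave-one-out MSE:", mse)
```

Leave-one-out scoring is used here because the data set is small (22 compounds); with 2,923 candidate features, a pre-filter such as the HDSN step the abstract mentions would be needed before a wrapper loop like this becomes practical.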
