Adaptive variable-weighted support vector machine as optimized by particle swarm optimization algorithm with application of QSAR studies.

Representing a compound by numerous structural descriptors has become common in quantitative structure-activity relationship (QSAR) studies. Because every descriptor carries some molecular structure information, it seems more advisable to consider all the descriptors than to perform traditional variable selection when building a QSAR model. Based on the particle swarm optimization (PSO) algorithm, a more flexible descriptor selection and model construction method, the variable-weighted support vector machine (VW-SVM), is proposed. The strategy adopted in this paper is to weight all structural descriptors with continuous non-negative values rather than arbitrarily removing or retaining any of them. Invoking PSO to seek non-negative variable weights can be regarded as searching for an optimized rescaling of every molecular structural descriptor. Moreover, PSO is employed to search for the optimal parameters of the VW-SVM model in addition to the variable weights, which enables the construction of a rational, adaptive, parameter-free QSAR model according to the performance of the overall model. Results obtained for glycogen synthase kinase-3α inhibitors and carbonic anhydrase II inhibitors indicate that, by optimally weighting all the descriptors, VW-SVM can retain more useful structural information about the compounds than other methods, consequently leading to more precise QSAR models with improved performance in both training and prediction.
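
The idea of treating variable weights as a rescaling searched by PSO can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: a kernel-weighted average stands in for the SVM so that only the Python standard library is needed, the fitness is a leave-one-out RMSE chosen for illustration, and the toy data, the PSO coefficients (0.7, 1.5, 1.5), and all function names are assumptions. Each particle encodes one non-negative weight per descriptor plus one kernel parameter `gamma`, so weights and model parameter are searched together.

```python
import math
import random

def weighted_rbf(x, z, w, gamma):
    """RBF kernel computed on descriptors rescaled by non-negative weights w."""
    d2 = 0.0
    for wi, xi, zi in zip(w, x, z):
        t = wi * (xi - zi)
        d2 += t * t
    return math.exp(-gamma * d2)

def predict(x, X, y, w, gamma):
    """Kernel-weighted average of y (a simple stand-in for the SVM)."""
    k = [weighted_rbf(x, xi, w, gamma) for xi in X]
    s = sum(k)
    if s < 1e-300:                      # all kernels underflowed: fall back to the mean
        return sum(y) / len(y)
    return sum(ki * yi for ki, yi in zip(k, y)) / s

def loo_rmse(position, X, y):
    """Leave-one-out RMSE; position = descriptor weights followed by gamma."""
    w, gamma = position[:-1], position[-1]
    err = 0.0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        err += (predict(X[i], rest_X, rest_y, w, gamma) - y[i]) ** 2
    return math.sqrt(err / len(X))

def pso(fitness, dim, n_particles=15, n_iter=40, seed=0):
    """Basic global-best PSO that keeps every coordinate non-negative."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0.0, 2.0) for _ in range(dim)] for _ in range(n_particles)]
    pos[0] = [1.0] * dim                # one particle starts at the unweighted model
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = max(0.0, pos[i][d] + vel[i][d])  # clip to non-negative
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

# Toy data (hypothetical): the response depends on descriptor 0 only;
# descriptor 1 is pure noise that a good weighting should shrink.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(30)]
y = [math.sin(3 * x[0]) for x in X]

baseline = loo_rmse([1.0, 1.0, 1.0], X, y)            # unweighted, gamma = 1
best, best_rmse = pso(lambda p: loo_rmse(p, X, y), dim=3)
print("baseline RMSE:", baseline)
print("weights:", best[:-1], "gamma:", best[-1], "RMSE:", best_rmse)
```

Because one particle is initialised at the unweighted setting, the returned error can never exceed the unweighted baseline; a weight of zero corresponds to dropping a descriptor, so conventional variable selection appears as a special case of the continuous weighting.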
