Support Vector Machines-Based Quantitative Structure-Property Relationship for the Prediction of Heat Capacity

The support vector machine (SVM), a novel type of learning machine, was used for the first time to develop a quantitative structure-property relationship (QSPR) model for the heat capacity of a diverse set of 182 compounds, based on molecular descriptors calculated from structure alone. Multiple linear regression (MLR) and radial basis function neural networks (RBFNNs) were also used to build linear and nonlinear models, respectively, for comparison with the SVM results. The root-mean-square (rms) errors in the heat capacity predictions for the whole data set were 4.648 (MLR), 4.337 (RBFNNs), and 2.931 (SVM) heat capacity units. The predictions are in good agreement with the experimental heat capacity values, and the results indicate that the SVM model outperforms the MLR and RBFNN models.
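
The rms error used to compare the three models can be sketched as follows. This is a minimal illustration that assumes the conventional definition (the square root of the mean squared difference between predicted and experimental values), which the abstract does not spell out. The function name `rms_error` and the toy values are hypothetical and are not taken from the study's data.

```python
import math


def rms_error(predicted, experimental):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Squared residual for each compound.
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    # Mean of the squared residuals, then the square root.
    return math.sqrt(sum(squared) / len(squared))


# Toy values for illustration only (not data from the paper).
toy_experimental = [10.0, 20.0, 30.0]
toy_predicted = [13.0, 16.0, 30.0]
print(rms_error(toy_predicted, toy_experimental))
```

Under this definition a lower value means the predictions lie closer to experiment on average, which is the sense in which the SVM's 2.931 compares favorably with the MLR and RBFNN figures.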
