Empirical analysis of support vector machine ensemble classifiers

Ensemble classification, which combines the results of a set of base learners, has received much attention in the machine learning community and has shown promise in improving classification accuracy. In contrast to neural network and decision tree ensembles, however, support vector machine (SVM) ensembles have not been the subject of comprehensive empirical research. To fill this void, this paper analyses and compares SVM ensembles built with four ensemble construction techniques: bagging, AdaBoost, Arc-X4 and a modified AdaBoost. Twenty real-world data sets from the UCI repository serve as benchmarks for evaluating and comparing the classification accuracy of these SVM ensemble classifiers. Different kernel functions and different numbers of base SVM learners are tested in the ensembles. The experimental results show that, although SVM ensembles are not always better than a single SVM, the bagged SVM ensemble performs as well as or better than the other methods and does so with relatively higher generality, particularly when the base SVMs use a polynomial kernel function. Finally, an industrial case study of gear defect detection is conducted to validate the empirical analysis results.
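For readers unfamiliar with the resampling-based techniques named above, the following is a minimal sketch of two of their building blocks: bootstrap resampling with majority voting (bagging) and the Arc-X4 resampling probabilities, which are proportional to 1 + m_i^4, where m_i is the number of times example i has been misclassified so far. It is an illustration under stated assumptions, not the paper's implementation. All function names are hypothetical, and the base learner `fit_nearest_mean` is a deliberately simple stand-in for an SVM so that the block runs with the standard library only.

```python
import random
from collections import Counter


def bootstrap_sample(n, rng):
    """Draw n indices with replacement (the resampling step of bagging)."""
    return [rng.randrange(n) for _ in range(n)]


def majority_vote(predictions):
    """Combine one predicted label per base learner into a single label."""
    return Counter(predictions).most_common(1)[0][0]


def arc_x4_weights(miss_counts):
    """Arc-X4 resampling probabilities, proportional to 1 + m_i**4,
    where m_i counts how often example i was misclassified so far."""
    raw = [1 + m ** 4 for m in miss_counts]
    total = sum(raw)
    return [r / total for r in raw]


def fit_nearest_mean(samples):
    """Stand-in base learner (NOT an SVM): nearest class mean on 1-D inputs.

    `samples` is a list of (x, label) pairs; returns a predict function.
    """
    sums = {}
    for x, y in samples:
        s, c = sums.get(y, (0.0, 0))
        sums[y] = (s + x, c + 1)
    means = {y: s / c for y, (s, c) in sums.items()}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))


def bagged_predict(train, x, fit, n_learners=25, seed=0):
    """Train n_learners base models on bootstrap samples and vote on x."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_learners):
        idx = bootstrap_sample(len(train), rng)
        model = fit([train[i] for i in idx])
        votes.append(model(x))
    return majority_vote(votes)


if __name__ == "__main__":
    # Toy 1-D data, invented purely for illustration.
    train = [(0.0, -1), (1.0, -1), (4.0, 1), (5.0, 1)]
    print(bagged_predict(train, 4.5, fit_nearest_mean))
    print(arc_x4_weights([0, 1, 2]))
```

In an actual SVM ensemble, `fit` would train an SVM on each bootstrap sample; the voting and resampling logic stays the same.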
