Ensemble of Multiple Kernel SVM Classifiers

Multiple kernel learning (MKL) lets the practitioner optimize over linear combinations of kernels and performs well in many applications. However, many MKL algorithms incur very high computational costs in real-world applications. In this study, we present a framework that uses multiple kernel SVM classifiers as the base learners for stacked generalization, a general method in which a high-level model combines lower-level models, to achieve greater computational efficiency. The experimental results show that our MKL-based stacked generalization algorithm combines the advantages of both MKL and stacked generalization. Compared with the other general ensemble methods tested in this paper, it achieves higher predictive accuracy.
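To make the idea of stacking kernel-combination classifiers concrete, the following is a minimal sketch and not the algorithm evaluated in the paper. It rests on several assumptions: the kernel weights are fixed by hand instead of being learned by an MKL solver, a kernel perceptron stands in for the SVM solver, and all names (`combine`, `fit_kernel_perceptron`, `out_of_fold_scores`, `fit_stack`) and the toy data are hypothetical. Each level-0 learner uses a different linear combination of a linear and an RBF kernel; a linear level-1 model is then trained on their out-of-fold scores, as in stacked generalization.

```python
import math

def linear(x, y):
    return sum(a * b for a, b in zip(x, y))

def rbf(gamma):
    def k(x, y):
        return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
    return k

def combine(weights, kernels):
    """Weighted sum of kernels: the kind of object MKL optimizes over.
    Assumption: weights are fixed here, not learned."""
    def k(x, y):
        return sum(w * kern(x, y) for w, kern in zip(weights, kernels))
    return k

def fit_kernel_perceptron(X, y, kernel, epochs=20):
    """Stand-in for an SVM solver (assumption). Returns a scoring function."""
    alpha = [0] * len(X)
    def score(x):
        return sum(a * yi * kernel(xi, x) for a, xi, yi in zip(alpha, X, y))
    for _ in range(epochs):
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * score(xi) <= 0:  # misclassified: strengthen this example
                alpha[i] += 1
    return score

def out_of_fold_scores(X, y, kernel, n_folds):
    """Score each point with a model that never saw it during training."""
    scores = [0.0] * len(X)
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        f = fit_kernel_perceptron([X[i] for i in train],
                                  [y[i] for i in train], kernel)
        for i in range(fold, len(X), n_folds):
            scores[i] = f(X[i])
    return scores

def fit_stack(X, y, base_kernels, n_folds=4):
    # Level-1 inputs: one out-of-fold score per level-0 learner, plus a bias.
    cols = [out_of_fold_scores(X, y, k, n_folds) for k in base_kernels]
    Z = [list(row) + [1.0] for row in zip(*cols)]
    meta = fit_kernel_perceptron(Z, y, linear)  # linear level-1 model
    # Refit every level-0 learner on all data for use at prediction time.
    base = [fit_kernel_perceptron(X, y, k) for k in base_kernels]
    def predict(x):
        z = [f(x) for f in base] + [1.0]
        return 1 if meta(z) > 0 else -1
    return predict

# Toy XOR-like data (hypothetical), labels in {-1, +1}.
X = [(0, 0), (1, 1), (0, 1), (1, 0),
     (0.1, 0.1), (0.9, 0.9), (0.1, 0.9), (0.9, 0.1)]
y = [1, 1, -1, -1, 1, 1, -1, -1]

kernels = [linear, rbf(1.0)]
base_kernels = [combine(w, kernels) for w in ([1.0, 0.0], [0.5, 0.5], [0.0, 1.0])]
predict = fit_stack(X, y, base_kernels)
predictions = [predict(x) for x in X]
```

In this sketch each level-0 model is trained independently of the others, so the level-0 stage could be run in parallel; whether that is the source of the efficiency gain reported above is not stated in the abstract.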
