Minimum class variance multiple kernel learning

Abstract The purpose of multiple kernel learning (MKL) is to learn an appropriate kernel from a set of predefined base kernels. Most MKL methods follow the basic idea of the support vector machine (SVM) to learn the optimal weights of the base kernels and build the classifier. However, SVM is a local method that ignores the structural information of the data, because its solution is determined exclusively by the so-called support vectors. In this paper, we propose an improved SVM-based MKL method called minimum class variance multiple kernel learning (MCVMKL). The key characteristic of MCVMKL is that it exploits the ellipsoidal structure of the data while learning the optimal weights and building the classifier. In addition, its formulation is invariant to scalings of the weights of the base kernels. We develop two optimization strategies to solve the optimization model of MCVMKL. Further, we derive a rough upper bound for the objective function of MCVMKL and propose a variant called trace-constrained multiple kernel learning (TCMKL), which uses the trace of the within-class scatter matrix. TCMKL enlarges the margin between different classes and simultaneously shrinks the region covering the data as much as possible. Moreover, it tunes the regularization parameter automatically, which saves training time by avoiding the time-consuming cross-validation otherwise needed to select an appropriate regularization parameter. Finally, comprehensive experiments are conducted, and the results demonstrate that the proposed methods are effective and achieve better performance than the competing methods.
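The two quantities the abstract refers to, a weighted combination of base kernels and the trace of the within-class scatter matrix, can be sketched as follows. This is a minimal illustration and not the MCVMKL or TCMKL optimization itself: the base kernels, the fixed weights `d`, the toy data, and all function names are assumptions made for the example. The trace is computed from the kernel matrix alone, using the identity that the sum over a class of ||phi(x_i) - m_c||^2 equals sum_i K_ii - (1/n_c) sum_{i,j} K_ij.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X (illustrative base kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def combine_kernels(kernels, weights):
    """Weighted sum of base kernel matrices: K = sum_m d_m * K_m."""
    weights = np.asarray(weights, dtype=float)
    return np.tensordot(weights, np.stack(kernels), axes=1)


def within_class_scatter_trace(K, y):
    """Trace of the within-class scatter matrix in the feature space of K.

    For each class c with n_c samples:
        sum_i K_ii - (1 / n_c) * sum_{i,j} K_ij   (indices restricted to c)
    """
    total = 0.0
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        Kc = K[np.ix_(idx, idx)]
        total += np.trace(Kc) - Kc.sum() / len(idx)
    return total


# Toy two-class data (assumption: purely for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(3.0, 1.0, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

# Hypothetical base kernels and fixed weights; an MKL method would learn d.
base_kernels = [X @ X.T, rbf_kernel(X, 0.5), rbf_kernel(X, 2.0)]
d = np.array([0.2, 0.5, 0.3])

K = combine_kernels(base_kernels, d)
trace_sw = within_class_scatter_trace(K, y)
print("trace of within-class scatter:", trace_sw)
```

Because the combined kernel is linear in the weights, this trace is also linear in them: it equals the weighted sum of the per-kernel traces, and scaling all weights by a constant scales it by the same constant.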
