MultiK-MHKS: A Novel Multiple Kernel Learning Algorithm

In this paper, we develop a new, effective multiple kernel learning algorithm. First, we map the input data into m different feature spaces by m empirical kernels, where each generated feature space is taken as one view of the input space. Then, borrowing the motivating argument of Canonical Correlation Analysis (CCA), which maximally correlates the m views in the transformed coordinates, we introduce a special term called the Inter-Function Similarity Loss R_IFSL into the existing regularization framework so as to guarantee agreement among the outputs of the m views. In implementation, we select the Modification of Ho-Kashyap algorithm with Squared approximation of the misclassification errors (MHKS) as the incorporated paradigm. Experimental results on benchmark data sets demonstrate the feasibility and effectiveness of the proposed algorithm, named MultiK-MHKS.
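
The abstract names three ingredients: an empirical kernel map per view, an MHKS-style regularized least-squares classifier per view, and a CCA-inspired agreement term coupling the views. The sketch below illustrates one plausible instantiation of that pipeline, not the authors' exact formulation: the empirical kernel map follows the standard eigendecomposition construction, while the pairwise disagreement penalty gamma * sum ||Y_p w_p - Y_q w_q||^2 and the per-view alternating solve are assumptions standing in for the paper's R_IFSL term; all function names are hypothetical.

```python
import numpy as np

def empirical_kernel_map(K, eps=1e-10):
    """Map training samples into the empirical feature space of Gram matrix K.

    With K = Q Lambda Q^T (positive part only), the map of the i-th sample is
    Lambda^{-1/2} Q^T K[:, i]; stacking rows gives K Q Lambda^{-1/2} (n x r).
    """
    lam, Q = np.linalg.eigh(K)
    keep = lam > eps                       # drop numerically zero directions
    return K @ Q[:, keep] / np.sqrt(lam[keep])

def multik_mhks_sketch(kernels, y, c=1.0, gamma=0.5, rho=0.1, iters=100):
    """One MHKS-style classifier per view, coupled by an agreement penalty.

    Per view l we minimize  ||Y_l w_l - 1 - b_l||^2 + c ||w_l||^2
      + gamma * sum_{q != l} ||Y_l w_l - Y_q w_q||^2   (assumed R_IFSL form),
    alternating a closed-form solve for w_l with the Ho-Kashyap update of the
    nonnegative margin vector b_l.
    """
    n, m = y.shape[0], len(kernels)
    # one augmented, label-signed design matrix per view
    Ys = []
    for K in kernels:
        Phi = np.hstack([empirical_kernel_map(K), np.ones((n, 1))])
        Ys.append(y[:, None] * Phi)
    ws = [np.zeros(Y.shape[1]) for Y in Ys]
    bs = [np.zeros(n) for _ in range(m)]   # Ho-Kashyap margins, kept >= 0
    one = np.ones(n)
    for _ in range(iters):
        outs = [Y @ w for Y, w in zip(Ys, ws)]
        for l in range(m):
            Y = Ys[l]
            coupling = sum(outs[q] for q in range(m) if q != l)
            # normal equations of the coupled objective for view l
            A = (1.0 + gamma * (m - 1)) * Y.T @ Y + c * np.eye(Y.shape[1])
            ws[l] = np.linalg.solve(A, Y.T @ (one + bs[l] + gamma * coupling))
            e = Y @ ws[l] - one - bs[l]
            bs[l] = bs[l] + rho * (e + np.abs(e))  # e + |e| >= 0 keeps b >= 0
            outs[l] = Y @ ws[l]
    return ws
```

To try the sketch, build m Gram matrices on the same training set (e.g., an RBF kernel and a polynomial kernel, each treated as one view) and pass them with labels y in {-1, +1}; each returned w_l is the discriminant vector of one view in its empirical feature space.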
