Multiple Empirical Kernel Learning with dynamic pairwise constraints

Abstract Unlike traditional Multiple Kernel Learning (MKL), which uses implicit kernels, Multiple Empirical Kernel Learning (MEKL) explicitly maps the original data space into multiple feature spaces via different empirical kernels. MEKL has been shown to deliver good classification performance and to make it much easier to process and analyze how well the kernels adapt to the input space. In this paper, we incorporate dynamic pairwise constraints into MEKL and propose a novel method, Multiple Empirical Kernel Learning with dynamic Pairwise Constraints (MEKLPC). A pairwise constraint describes the relationship between two samples, indicating whether they belong to the same class. In the present work, we boost the original pairwise constraints and design dynamic pairwise constraints that pay more attention to the boundary samples and thus make the decision hyperplane more reasonable and accurate. The proposed MEKLPC therefore not only inherits the advantages of MEKL but also exploits several kinds of prior information. First, MEKLPC obtains side-information and significantly boosts the classification performance in each feature space. Here, the side-information is the dynamic pairwise constraints, which are constructed from the samples near the decision boundary, i.e., the boundary samples. Second, in each mapped feature space, MEKLPC still measures the empirical risk and the generalization risk. Last, the different feature spaces mapped by the multiple empirical kernels are made to agree on their outputs for the same input sample as much as possible. To the best of our knowledge, this work is the first to introduce dynamic pairwise constraints into the MEKL framework. Experiments on a number of real-world data sets demonstrate the feasibility and effectiveness of MEKLPC.
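To make the two ingredients named above concrete, the following is a minimal sketch, not the MEKLPC algorithm itself. It assumes the commonly used eigendecomposition form of the empirical kernel map (for a kernel matrix K = QΛQᵀ on the training set, a sample x is mapped to Λ^(-1/2) Qᵀ [k(x, x_1), ..., k(x, x_n)]ᵀ) and an RBF kernel. The function names `rbf_kernel`, `empirical_kernel_map`, and `pairwise_constraints`, and the parameters `gamma` and `tol`, are hypothetical choices for illustration. The dynamic, boundary-focused selection of constraints is not sketched, since the abstract does not specify it.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between rows of A and rows of B (assumed kernel choice)."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def empirical_kernel_map(X_train, X, gamma=1.0, tol=1e-10):
    """Explicitly map rows of X into the empirical feature space of X_train.

    Assumes the eigendecomposition form K = Q diag(lam) Q^T and keeps only
    eigenvalues above `tol` (hypothetical threshold).
    """
    K = rbf_kernel(X_train, X_train, gamma)
    lam, Q = np.linalg.eigh(K)           # symmetric eigendecomposition
    keep = lam > tol
    lam, Q = lam[keep], Q[:, keep]
    K_x = rbf_kernel(X, X_train, gamma)  # shape (m, n)
    return (K_x @ Q) / np.sqrt(lam)      # shape (m, r)

def pairwise_constraints(y):
    """Build static pairwise constraints from labels.

    Returns (same_class_pairs, different_class_pairs) as lists of index pairs.
    """
    same, different = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            (same if y[i] == y[j] else different).append((i, j))
    return same, different

# Usage: one explicit feature space per kernel parameter (multiple empirical kernels).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 2))
y_demo = [0, 0, 1, 1, 0, 1]
feature_spaces = [empirical_kernel_map(X_demo, X_demo, gamma=g) for g in (0.5, 1.0, 2.0)]
same_pairs, diff_pairs = pairwise_constraints(y_demo)
```

Because each mapped representation is an explicit matrix, inner products in the mapped space reproduce the kernel matrix on the training set, which is what makes the feature spaces easy to inspect directly.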
