Optimized Kernel Entropy Components

This brief addresses two main issues of the standard kernel entropy component analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA essentially reduces to ranking the kernel eigenvectors by their entropy contribution rather than by variance, as in kernel principal component analysis. In this brief, we propose an extension of KECA, named optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by compacting the information into very few features (often just one or two), which therefore have higher expressive power. The method builds on the independent component analysis framework and introduces an extra rotation of the eigendecomposition, optimized via gradient-ascent search. This maximum entropy preservation suggests that OKECA features are more efficient than KECA features for density estimation. A further issue shared by both methods is the selection of the kernel parameter, which strongly affects the resulting performance; here we analyze the most common kernel length-scale selection criteria. Both methods are illustrated on different synthetic and real problems. Results show that OKECA returns projections with more expressive power than KECA, that the most successful rule for estimating the kernel parameter is based on maximum likelihood, and that OKECA is more robust to the choice of the length-scale parameter in kernel density estimation.
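A minimal sketch of the entropy ranking at the heart of KECA, the baseline that OKECA further compacts with a learned rotation, is given below. It assumes a Gaussian kernel with length-scale sigma and relies on the standard decomposition of the quadratic Renyi entropy estimate over kernel eigenpairs; the function names, data, and number of components are illustrative placeholders, not taken from the authors' implementation.

```python
import numpy as np

# Illustrative sketch of KECA-style entropy ranking (not the authors' code).
# Assumes a Gaussian (RBF) kernel with length-scale `sigma`; the data and the
# number of retained components `n_components` are placeholders.

def gaussian_kernel(X, sigma):
    """Compute the N x N Gaussian kernel matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def keca_transform(X, sigma, n_components):
    """Project X onto the kernel eigenvectors contributing most Renyi entropy.

    The quadratic Renyi entropy estimate decomposes over eigenpairs as
    lambda_i * (1^T e_i)^2, so components are ranked by this quantity
    instead of by the eigenvalue alone (as kernel PCA would do).
    """
    K = gaussian_kernel(X, sigma)
    eigvals, eigvecs = np.linalg.eigh(K)            # ascending eigenvalues
    ones = np.ones(K.shape[0])
    entropy_contrib = eigvals * (eigvecs.T @ ones) ** 2
    idx = np.argsort(entropy_contrib)[::-1][:n_components]
    # KECA projections in kernel feature space: sqrt(lambda_i) * e_i
    return eigvecs[:, idx] * np.sqrt(np.clip(eigvals[idx], 0.0, None))

# Hypothetical usage on random data:
X = np.random.randn(200, 5)
Z = keca_transform(X, sigma=1.0, n_components=2)
```

OKECA would additionally rotate the retained eigendirections, updating the rotation by gradient ascent on the entropy estimate so that the information concentrates in the first one or two features; that optimization step is omitted here for brevity.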
