Feature selection for noisy variation patterns using kernel principal component analysis

Kernel principal component analysis (KPCA) is widely used to understand and visualize nonlinear variation patterns by mapping the projected data from a high-dimensional feature space back to the original input space. Variation patterns often occur in only a small number of relevant features out of all the features recorded in the data, so it is crucial to identify the set of relevant features that define the pattern. Here we propose a feature selection procedure that augments KPCA to obtain importance estimates of the features from noisy training data. Our strategy projects the data points onto sparse random vectors before calculating the kernel matrix. We then match pairs of such projections and determine the preimages of the data with and without a given feature; the differences between the preimages within each pair are used to estimate the importance of that feature and thus to identify the relevant features. An advantage of our method is that it can be used with any suitable KPCA algorithm. Moreover, the computations can be parallelized easily, leading to significant speedup. We demonstrate our method on several simulated and real data sets, and compare the results to alternative approaches in the literature.
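The procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the Gaussian kernel and its bandwidth `sigma`, the number and density of the sparse random vectors, the least-squares map from KPCA scores back to input space that stands in for a proper preimage algorithm, and the mean Euclidean distance used to summarize the preimage differences. The function names are hypothetical.

```python
import numpy as np

def rbf_kernel(Z, sigma):
    # Gaussian kernel matrix on the projected data (kernel choice is an assumption).
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kpca_scores(K, n_components):
    # Center the kernel matrix and return the training-point scores
    # on the leading kernel principal components.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(H @ K @ H)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 1e-12))

def approx_preimage(X, scores):
    # Stand-in preimage step: least-squares map from KPCA scores back to
    # input space. Any suitable preimage algorithm could replace this.
    mu = X.mean(axis=0)
    W, *_ = np.linalg.lstsq(scores, X - mu, rcond=None)
    return mu + scores @ W

def feature_importance(X, n_pairs=20, n_vectors=5, density=0.5,
                       n_components=2, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    importance = np.zeros(p)
    for j in range(p):
        for _ in range(n_pairs):
            # Sparse random projection vectors, stored as columns of R.
            R = rng.normal(size=(p, n_vectors)) * (rng.random((p, n_vectors)) < density)
            # Matched pair: identical projections except for feature j.
            R_with, R_without = R.copy(), R.copy()
            R_with[j, :] = rng.normal(size=n_vectors)
            R_without[j, :] = 0.0
            preimages = []
            for R_k in (R_with, R_without):
                K = rbf_kernel(X @ R_k, sigma)
                preimages.append(approx_preimage(X, kpca_scores(K, n_components)))
            # Summarize the within-pair preimage difference (assumed metric).
            diff = preimages[0] - preimages[1]
            importance[j] += np.mean(np.linalg.norm(diff, axis=1))
        importance[j] /= n_pairs
    return importance
```

Because each feature and each matched pair is processed independently, the two outer loops are natural candidates for parallel execution. Calling `feature_importance(X)` on an `n × p` data matrix returns one nonnegative score per feature; how well those scores separate relevant features from noise depends on the assumed settings above.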
