Robust kernel PCA using fuzzy membership

Principal component analysis (PCA) is widely used for dimensionality reduction in pattern recognition. Although PCA has been applied successfully in many areas, it is sensitive to noise and limited to linear principal components. The noise sensitivity stems from the least-squares measure used in PCA; the restriction to linear components arises because PCA applies an affine transform defined by the eigenvectors of the covariance matrix and the mean of the data. In this paper, we introduce a robust kernel PCA method that extends Schölkopf et al.'s kernel PCA and uses fuzzy memberships to tackle both problems simultaneously. We first propose Robust Fuzzy PCA (RF-PCA), an iterative method for finding a robust covariance matrix, which reduces the sensitivity to noise with the help of a robust estimation technique. RF-PCA is then extended to a non-linear method, Robust Kernel Fuzzy PCA (RKF-PCA), using kernels. Experimental results suggest that the proposed algorithm works well on artificial and real-world data sets.
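
The idea of iteratively estimating a robust covariance matrix with fuzzy memberships can be illustrated with a minimal sketch. It is not the paper's RF-PCA algorithm, whose update rules are not given in this abstract. The membership formula u = 1 / (1 + (e / sigma2)^(1/(m-1))), the use of the median residual as the scale sigma2, the restriction to a single principal component, and all function names are assumptions made for illustration only.

```python
import math

def weighted_mean(X, u):
    """Membership-weighted mean of the rows of X."""
    total = sum(u)
    return [sum(ui * x[j] for ui, x in zip(u, X)) / total
            for j in range(len(X[0]))]

def weighted_cov(X, u, mu):
    """Membership-weighted covariance matrix around mu."""
    d, total = len(mu), sum(u)
    C = [[0.0] * d for _ in range(d)]
    for ui, x in zip(u, X):
        c = [xj - mj for xj, mj in zip(x, mu)]
        for a in range(d):
            for b in range(d):
                C[a][b] += ui * c[a] * c[b] / total
    return C

def leading_eigvec(C, iters=200):
    """Leading eigenvector of a symmetric PSD matrix by power iteration."""
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            break
        v = [t / norm for t in w]
    return v

def robust_fuzzy_pca(X, m=2.0, n_iter=20):
    """Alternate between a weighted PCA fit and a membership update.

    Assumed membership rule: u_i = 1 / (1 + (e_i / sigma2) ** (1 / (m - 1))),
    where e_i is the squared reconstruction error of x_i and sigma2 is the
    median error (an illustrative choice of scale).
    """
    n = len(X)
    u = [1.0] * n  # first pass is ordinary PCA
    for _ in range(n_iter):
        mu = weighted_mean(X, u)
        w = leading_eigvec(weighted_cov(X, u, mu))
        errs = []
        for x in X:
            c = [xj - mj for xj, mj in zip(x, mu)]
            proj = sum(cj * wj for cj, wj in zip(c, w))
            errs.append(max(sum(cj * cj for cj in c) - proj * proj, 0.0))
        sigma2 = sorted(errs)[n // 2] or 1e-12
        u = [1.0 / (1.0 + (e / sigma2) ** (1.0 / (m - 1.0))) for e in errs]
    return mu, w, u

# Toy data: points near the line y = x plus three outliers.
inliers = [(float(t), t + (0.1 if t % 2 == 0 else -0.1)) for t in range(-10, 11)]
outliers = [(8.0, -15.0), (9.0, -18.0), (10.0, -20.0)]
mu, w, u = robust_fuzzy_pca(inliers + outliers)
print("robust direction:", w)
print("outlier memberships:", u[-3:])
```

With `n_iter=1` the function returns the ordinary least-squares PCA direction, which the three outliers pull away from the line y = x. Running more iterations lowers the outliers' memberships, so the weighted covariance is dominated by the inliers.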
