Non-Linear Feature Extraction by Linear Principal Component Analysis Using Local Kernel

In the last decade, the effectiveness of kernel-based methods for object detection and recognition has been reported (Fukui et al., 2006; Hotta, 2008c; Kim et al., 2002; Pontil & Verri, 1998; Shawe-Taylor & Cristianini, 2004; Yang, 2002). In particular, Kernel Principal Component Analysis (KPCA) has taken the place of traditional linear PCA as the first feature extraction step in many studies and applications. KPCA copes well with non-linear variations. However, KPCA must solve an eigenvalue problem whose size is (number of samples) × (number of samples). In addition, mapping a test sample to the subspace obtained by KPCA requires computing the kernel function with all training samples. The computational cost is therefore the main drawback.

To reduce the computational cost of KPCA, sparse KPCA (Tipping, 2001) and the use of clustering (Ichino et al., 2007, in Japanese) were proposed. Ichino et al. (2007, in Japanese) reported that KPCA of cluster centers is more effective than sparse KPCA. However, the computational cost becomes a serious problem again when the number of classes is large and each class has its own subspace. For example, KPCA of visual words (cluster centers of local features) (Hotta, 2008b) was effective for object categorization, but its computational cost is high. In that method, each of the 101 categories has one subspace constructed from 400 visual words. That is, 40,400 (= 101 categories × 400 visual words) kernel computations are required to map a local feature to all subspaces.

On the other hand, the cost of traditional linear PCA is independent of the number of samples when the feature dimension is smaller than the number of samples. This is because the size of the eigenvalue problem is determined by the smaller of the feature dimension and the number of samples. To map a test sample to a subspace, only inner products between the basis vectors and the test sample are required.
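The cost comparison above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the 128-dimensional feature size, and the 50 basis vectors per subspace are assumptions introduced here, not values from the text. Only the 101 categories and 400 visual words come from the example above.

```python
def eigenproblem_size(n_samples, feature_dim, use_kernel):
    """Side length of the square matrix whose eigenvectors must be computed."""
    if use_kernel:
        # KPCA: the Gram matrix is n_samples x n_samples.
        return n_samples
    # Linear PCA: the smaller of the sample count and the feature dimension.
    return min(n_samples, feature_dim)


def kernel_evaluations_per_test_sample(n_subspaces, samples_per_subspace):
    """KPCA: one kernel evaluation per training sample of every subspace."""
    return n_subspaces * samples_per_subspace


def inner_products_per_test_sample(n_subspaces, basis_per_subspace):
    """Linear PCA: one inner product per basis vector of every subspace."""
    return n_subspaces * basis_per_subspace


# 101 categories and 400 visual words are taken from the text.
kpca_cost = kernel_evaluations_per_test_sample(101, 400)

# 50 basis vectors per subspace is a hypothetical choice for illustration.
pca_cost = inner_products_per_test_sample(101, 50)

# 128 is a hypothetical local-feature dimension.
kpca_matrix = eigenproblem_size(400, 128, use_kernel=True)
pca_matrix = eigenproblem_size(400, 128, use_kernel=False)
```

Under these assumed sizes, the per-sample work for linear PCA scales with the number of retained basis vectors rather than with the number of training samples.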
Therefore, in general, the computational cost of linear PCA is much lower than that of KPCA. In this paper, we propose a way to combine the non-linearity of KPCA with the low computational cost of linear PCA (Hotta, 2008a). Kernel-based methods map training samples to a high-dimensional space as x → φ(x). Non-linearity is realized by applying a linear method in that high-dimensional space. The feature space induced by the Radial Basis Function (RBF) kernel is infinite-dimensional, so the mapped feature cannot be described explicitly. However, the mapped feature φ(x) of the polynomial kernel can be described explicitly. This means that KPCA with the polynomial kernel can be solved directly by linear PCA of the mapped features. Unfortunately, in general, the dimension of the mapped features is too high for linear PCA to be practical, even when the second-degree polynomial kernel K(x, y) = (1 + x^T y)^2 is used. The dimension of mapped features of the polynomial
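The explicit mapping for the second-degree polynomial kernel can be checked numerically. The following is a minimal NumPy sketch, not the implementation used in the paper: the function name `poly2_feature_map`, the toy data, and the ordering of the feature components are assumptions made for illustration. It builds one explicit φ satisfying φ(x)·φ(y) = (1 + x^T y)^2 and confirms that the centered Gram matrix (the KPCA eigenproblem) and the scatter matrix of the centered mapped features (the linear PCA eigenproblem) share the same non-zero eigenvalues.

```python
import numpy as np


def poly2_feature_map(x):
    """Explicit map phi with phi(x) . phi(y) == (1 + x . y) ** 2."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)   # all index pairs with i < j
    return np.concatenate((
        [1.0],                            # constant term
        np.sqrt(2.0) * x,                 # linear terms
        x ** 2,                           # squared terms
        np.sqrt(2.0) * x[i] * x[j],       # cross terms
    ))


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))               # toy data: 6 samples, 4 dimensions

Phi = np.stack([poly2_feature_map(x) for x in X])
K = (1.0 + X @ X.T) ** 2                  # polynomial kernel Gram matrix

# The explicit map reproduces the kernel values.
kernel_matches = np.allclose(K, Phi @ Phi.T)

# Center in feature space: H K H for KPCA, mean subtraction for linear PCA.
n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
K_centered = H @ K @ H
Phi_centered = Phi - Phi.mean(axis=0)

# Eigenvalues sorted in descending order.
eig_kpca = np.sort(np.linalg.eigvalsh(K_centered))[::-1]
eig_pca = np.sort(np.linalg.eigvalsh(Phi_centered.T @ Phi_centered))[::-1]

# Centering leaves at most n - 1 non-zero eigenvalues; those should agree.
spectra_match = np.allclose(eig_kpca[: n - 1], eig_pca[: n - 1])
```

For a d-dimensional input, this particular map has (d + 1)(d + 2) / 2 components, constant term included, so the mapped dimension grows quadratically with d. That growth is why applying the map to a whole high-dimensional feature vector quickly becomes impractical.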

[1]  Kazuhiro Hotta, et al. Non-linear feature extraction by linear PCA using local kernel, 2008, 2008 19th International Conference on Pattern Recognition.

[2]  Kwang In Kim, et al. Face recognition using kernel principal component analysis, 2002, IEEE Signal Processing Letters.

[3]  Pietro Perona, et al. One-shot learning of object categories, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Thomas Serre, et al. Object recognition with features inspired by visual cortex, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[6]  Ming-Hsuan Yang, et al. Face Recognition Using Kernel Methods, 2001, NIPS.

[7]  B. Scholkopf, et al. Fisher discriminant analysis with kernels, 1999, Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No.98TH8468).

[8]  Christopher G. Harris, et al. A Combined Corner and Edge Detector, 1988, Alvey Vision Conference.

[9]  Kazuhiro Hotta. Robust face recognition under partial occlusion based on support vector machine with local Gaussian summation kernel, 2008, Image Vis. Comput.

[10]  Trevor Darrell, et al. The pyramid match kernel: discriminative classification with sets of image features, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[11]  Bernhard Schölkopf, et al. Extracting Support Data for a Given Task, 1995, KDD.

[12]  Massimiliano Pontil, et al. Support Vector Machines for 3D Object Recognition, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[13]  Tomaso A. Poggio, et al. Face recognition: component-based versus global approaches, 2003, Comput. Vis. Image Underst.

[14]  Jitendra Malik, et al. SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[15]  David G. Lowe, et al. Multiclass Object Recognition with Sparse, Localized Features, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[16]  B. Schölkopf, et al. Advances in kernel methods: support vector learning, 1999.

[17]  Rameswar Debnath, et al. Kernel Selection for the Support Vector Machine, 2004, IEICE Trans. Inf. Syst.

[18]  Kazuhiro Hotta, et al. Object Categorization Based on Kernel Principal Component Analysis of Visual Words, 2008, 2008 IEEE Workshop on Applications of Computer Vision.

[19]  Pietro Perona, et al. Combining generative models and Fisher kernels for object recognition, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[20]  Gunnar Rätsch, et al. An introduction to kernel-based learning algorithms, 2001, IEEE Trans. Neural Networks.

[21]  Pietro Perona, et al. Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories, 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[22]  Gang Wang, et al. Using Dependent Regions for Object Categorization in a Generative Framework, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[23]  Björn Stenger, et al. A Framework for 3D Object Recognition Using the Kernel Constrained Mutual Subspace Method, 2006, ACCV.

[24]  Nello Cristianini, et al. Kernel Methods for Pattern Analysis, 2004.

[25]  Michael E. Tipping. Sparse Kernel Principal Component Analysis, 2000, NIPS.