Fast Method of Principal Component Analysis Based on L1-Norm Maximization Algorithm

Abstract—In data-analysis problems with a large number of dimensions, principal component analysis based on the L2-norm (L2-PCA) is one of the most popular methods, but L2-PCA is sensitive to outliers. Unlike L2-PCA, PCA-L1 is robust to outliers because it utilizes the L1-norm, which is less sensitive to outliers. Furthermore, the bases obtained by PCA-L1 are invariant to rotations. However, PCA-L1 takes a long time to calculate the bases, because it employs an iterative algorithm to obtain each basis and requires an eigenvector of the autocorrelation matrix as an initial vector, and the autocorrelation matrix must be recalculated for each basis. In this paper, we propose a fast method to compute the autocorrelation matrices. To verify the proposed method, we apply L2-PCA, PCA-L1, and the proposed method to face recognition. Simulation results show that the proposed method provides the same recognition performance as PCA-L1 and outperforms L2-PCA, while its execution time is shorter than that of PCA-L1.

I. INTRODUCTION

In data-analysis problems with a large number of dimensions, principal component analysis (PCA) is one of the most popular methods. PCA finds orthonormal bases onto which multivariate data are projected to obtain a subspace representation. Various methods for obtaining the bases have been proposed; the most popular is PCA based on the L2-norm (L2-PCA), for which the projections of the data onto the bases have the largest variance. Although L2-PCA has been successful for many problems, the influence of outliers on the principal bases is significant due to the L2-norm criterion. This influence can be reduced by PCA based on the L1-norm (L1-PCA). Unlike L2-PCA, L1-PCA is robust to outliers because it utilizes the L1-norm, which is less sensitive to outliers. However, it is difficult to calculate exact solutions of L1-PCA. To address this problem, Kwak proposed a scheme, designated PCA-L1 [1], that employs a substitute L1-norm formulation to obtain the principal bases easily. Furthermore, the bases obtained by PCA-L1 are invariant to rotations. The details of PCA-L1 are described in Section II: PCA-L1 employs an iterative algorithm to compute each basis and requires an eigenvector of the autocorrelation matrix as an initial vector. The autocorrelation matrix is computed from the data projected onto the orthogonal complement of the already calculated eigenvectors; thus, the autocorrelation matrix must be recalculated for each basis. This paper proposes a fast method to compute the autocorrelation matrices. To verify the proposed method, we apply L2-PCA, PCA-L1, and the proposed method to face recognition.

The rest of this paper is organized as follows. In Section II, the PCA-L1 algorithm is formulated. The proposed method is explained in Section III. In Section IV, we describe the face recognition technique. The performance of the proposed method is compared with that of the conventional methods in Section V, and the conclusion is given in Section VI.

II. PCA-L1 ALGORITHM

Let X = [x_1, ..., x_n] ∈ R^{d×n} be the given data, where n and d denote the number of samples and the dimension of the original input space, respectively. Without loss of generality, {x_i}_{i=1}^n is assumed to have zero mean.
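As a concrete illustration of this setup (a sketch of ours, not part of the original formulation), the NumPy fragment below builds a d × n data matrix X from synthetic values, centers it to satisfy the zero-mean assumption, and forms the autocorrelation matrix S = XX^T that appears in the problem below; the variable names and the synthetic data are assumptions for illustration only.

import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 500                          # dimension and sample count (arbitrary here)
X = rng.standard_normal((d, n))          # columns x_1, ..., x_n are the data samples
X = X - X.mean(axis=1, keepdims=True)    # center so that {x_i} has zero mean
S = X @ X.T                              # autocorrelation matrix S = X X^T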
In L2-PCA, one tries to find a projection matrix W^* that spans an m(< d)-dimensional linear subspace. W^* is the solution of the following dual problem:

W^* = \arg\max_{W} \|W^\top S W\|_2 = \arg\max_{W} \|W^\top X\|_2, \quad \text{subject to } W^\top W = I_m, \tag{1}

where W ∈ R^{d×m} is the projection matrix whose columns {w_k}_{k=1}^m constitute the bases of the m-dimensional linear subspace (feature space), S = XX^\top is the autocorrelation matrix of X, I_m is the m × m identity matrix, and \|\cdot\|_2 denotes the L2-norm of a matrix or a vector. Methods based on the L2-norm are sensitive to outliers, so we use methods based on the L1-norm, which is more robust to outliers than the L2-norm. In PCA-L1, one tries to find a W^* that spans an m(< d)-dimensional linear subspace as the solution of the following dual problem:

W^* = \arg\max_{W} \|W^\top X\|_1, \quad \text{subject to } W^\top W = I_m. \tag{2}

Here, the constraint W^\top W = I_m ensures the orthonormality of the projection matrix. The solution of (2) is invariant to rotations because the maximization is done on the subspace, and it is expected to be more robust to outliers than the L2 solution. As a downside, finding a global solution of (2) for m > 1 is very difficult. To ameliorate this problem, Kwak simplifies (2) into a series of m = 1 problems using a greedy search method. If we set m = 1, (2) becomes the following optimization problem:

w^* = \arg\max_{w} \|w^\top X\|_1 = \arg\max_{w} \sum_{i=1}^{n} |w^\top x_i|, \quad \text{subject to } \|w\|_2 = 1. \tag{3}
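For readers who want to experiment, the following NumPy sketch implements the greedy procedure outlined above: each basis is obtained by the sign-flipping fixed-point iteration of [1], initialized with the dominant eigenvector of the autocorrelation matrix of the current data, after which the data are projected onto the orthogonal complement of the found basis. The function name, the stopping rule, and the iteration cap are our assumptions for illustration, not the authors' exact implementation.

import numpy as np

def pca_l1(X, m, max_iter=200):
    # Greedy PCA-L1 on a centered d x n data matrix X (after Kwak [1]).
    # Returns W (d x m) whose k-th column approximately solves (3) on the
    # data deflated by the first k-1 bases.
    X = X.copy()
    d = X.shape[0]
    W = np.zeros((d, m))
    for k in range(m):
        S = X @ X.T                      # autocorrelation matrix, recomputed per basis
        _, eigvecs = np.linalg.eigh(S)
        w = eigvecs[:, -1]               # initial vector: dominant eigenvector of S
        for _ in range(max_iter):
            p = np.sign(w @ X)           # polarity check: p_i = sign(w^T x_i)
            p[p == 0] = 1.0              # treat w^T x_i = 0 as positive (simplified from [1])
            w_new = X @ p                # flipping step: w <- sum_i p_i x_i
            w_new /= np.linalg.norm(w_new)
            if np.allclose(w_new, w):    # converged (assumed stopping rule)
                w = w_new
                break
            w = w_new
        W[:, k] = w
        X -= np.outer(w, w @ X)          # deflate: project onto orthogonal complement of w
    return W

Recomputing S = XX^\top from scratch in every pass costs O(d^2 n) per basis; this repeated computation is exactly what the fast method proposed in Section III is designed to avoid. Given the centered X from the earlier sketch, W = pca_l1(X, m=10) followed by Y = W.T @ X yields m-dimensional features that a recognition stage can consume.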

REFERENCES

[1] N. Kwak, "Principal Component Analysis Based on L1-Norm Maximization," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.

[2] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," ECCV, 1996.