Algorithms for accelerated convergence of adaptive PCA

We derive and discuss adaptive algorithms for principal component analysis (PCA) that are shown to converge faster than the traditional PCA algorithms of Oja, Sanger, and Xu. Traditional PCA algorithms derived by gradient descent on an objective function are known to converge slowly, and their convergence depends on an appropriate choice of the gain sequences. Since online applications demand faster convergence and automatic gain selection, we present new adaptive algorithms that address both problems. We first present an unconstrained objective function whose minimization yields the principal components. From this objective function we derive adaptive algorithms using four methods: 1) gradient descent; 2) steepest descent; 3) conjugate direction; and 4) Newton-Raphson. Although gradient descent reproduces Xu's LMSER algorithm, the steepest descent, conjugate direction, and Newton-Raphson methods yield new adaptive PCA algorithms. We also discuss the landscape of the objective function and give a global convergence proof of the adaptive gradient descent PCA algorithm using stochastic approximation theory. Extensive experiments with stationary and nonstationary multidimensional Gaussian sequences show that the new algorithms converge faster than the traditional gradient descent methods. We also compare the steepest descent adaptive algorithm with state-of-the-art methods on stationary and nonstationary sequences.
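To make the gradient descent case concrete, the sketch below implements a stochastic gradient step on the unconstrained objective J(W) = E||x - W W^T x||^2, which, as the abstract notes, reproduces Xu's LMSER update. This is a minimal illustration under stated assumptions, not the authors' implementation: the dimensions, the diagonal test covariance, and the decaying gain schedule eta_k = 1/(100 + k) are illustrative choices.

```python
import numpy as np

def lmser_update(W, x, eta):
    """One stochastic gradient descent step on J(W) = E||x - W W^T x||^2,
    with the covariance replaced by the instantaneous estimate x x^T."""
    Ax = np.outer(x, x) @ W                      # (x x^T) W, shape (d, p)
    return W + eta * (2.0 * Ax - Ax @ (W.T @ W) - W @ (W.T @ Ax))

rng = np.random.default_rng(0)
d, p = 5, 2                                      # input dim, components (assumed)
C = np.diag([5.0, 3.0, 1.0, 0.5, 0.1])           # illustrative true covariance
W = 0.1 * rng.standard_normal((d, p))            # small random initial weights
for k in range(5000):
    x = rng.multivariate_normal(np.zeros(d), C)  # stationary Gaussian sequence
    W = lmser_update(W, x, eta=1.0 / (100.0 + k))  # assumed decaying gain
print(np.round(W.T @ W, 3))                      # ~ identity: near-orthonormal columns
```

The steepest descent, conjugate direction, and Newton-Raphson variants discussed above differ in how the step size and search direction are computed at each iteration, rather than in the underlying gradient of this objective.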

[1] Yingbo Hua, et al. Fast subspace tracking and neural network learning by a novel information criterion, 1998, IEEE Trans. Signal Process.

[2] Terence D. Sanger, et al. Optimal unsupervised learning in a single-layer linear feedforward neural network, 1989, Neural Networks.

[3] Shingo Tomita, et al. An optimal orthonormal system for discriminant analysis, 1985, Pattern Recognit.

[4] Edwin K. P. Chong, et al. On relative convergence properties of principal component analysis algorithms, 1998, IEEE Trans. Neural Networks.

[5] E. Oja, et al. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix, 1985.

[6] Mark D. Plumbley. Lyapunov functions for convergence of principal component algorithms, 1995, Neural Networks.

[7] Pierre Priouret, et al. Adaptive Algorithms and Stochastic Approximations, 1990, Applications of Mathematics.

[8] Nikolas P. Galatsanos, et al. Regularized total least squares reconstruction for optical tomographic imaging using conjugate gradient method, 1997, Proceedings of International Conference on Image Processing.

[9] Kurt Hornik, et al. Learning in linear neural networks: a survey, 1995, IEEE Trans. Neural Networks.

[10] Mahmood R. Azimi-Sadjadi, et al. Principal component extraction using recursive least squares learning, 1995, IEEE Trans. Neural Networks.

[11] E. M. Dowling, et al. Conjugate gradient projection subspace tracking, 1994, Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers.

[12] Y. Chauvin, et al. Principal component analysis by gradient descent on a constrained linear Hebbian cell, 1989, International 1989 Joint Conference on Neural Networks.

[13] Michael D. Zoltowski, et al. Self-organizing algorithms for generalized eigen-decomposition, 1997, IEEE Trans. Neural Networks.

[14] Soura Dasgupta, et al. Adaptive estimation of eigensubspace, 1995, IEEE Trans. Signal Process.

[15] Vwani P. Roychowdhury, et al. Self-Organizing and Adaptive Algorithms for Generalized Eigen-Decomposition, 1996, NIPS.

[16] E. M. Dowling, et al. Conjugate gradient eigenstructure tracking for adaptive spectral estimation, 1995, IEEE Trans. Signal Process.

[17] Tapan K. Sarkar, et al. A survey of conjugate gradient algorithms for solution of extreme eigen-problems of a symmetric matrix, 1989, IEEE Trans. Acoust. Speech Signal Process.

[18] T. Sarkar, et al. Application of the conjugate gradient and steepest descent for computing the eigenvalues of an operator, 1989.

[19] Andrzej Cichocki, et al. Neural networks for optimization and signal processing, 1993.

[20] Bin Yang, et al. Projection approximation subspace tracking, 1995, IEEE Trans. Signal Process.

[21] Gene H. Golub, et al. Matrix computations, 1983.

[22] Eric M. Dowling, et al. Conjugate gradient projection subspace tracking, 1997, IEEE Trans. Signal Process.

[23] Tamer Basar, et al. Analysis of Recursive Stochastic Algorithms, 2001.

[24] Lei Xu, et al. Least mean square error reconstruction principle for self-organizing neural-nets, 1993, Neural Networks.

[25] S. Haykin. Neural Networks: A Comprehensive Foundation, 1994.