The Sparse Matrix Transform for Covariance Estimation and Analysis of High Dimensional Signals

Covariance estimation for high dimensional signals is a classically difficult problem in statistical signal analysis and machine learning. In this paper, we propose a maximum likelihood (ML) approach to covariance estimation, which employs a novel non-linear sparsity constraint. More specifically, the covariance is constrained to have an eigen decomposition which can be represented as a sparse matrix transform (SMT). The SMT is formed by a product of pairwise coordinate rotations known as Givens rotations. Using this framework, the covariance can be efficiently estimated using greedy optimization of the log-likelihood function, and the number of Givens rotations can be efficiently computed using a cross-validation procedure. The resulting estimator is generally positive definite and well-conditioned, even when the sample size is limited. Experiments on a combination of simulated data, standard hyperspectral data, and face image sets show that the SMT-based covariance estimates are consistently more accurate than both traditional shrinkage estimates and recently proposed graphical lasso estimates for a variety of different classes and sample sizes. An important property of the new covariance estimate is that it naturally yields a fast implementation of the estimated eigen-transformation using the SMT representation. In fact, the SMT can be viewed as a generalization of the classical fast Fourier transform (FFT) in that it uses “butterflies” to represent an orthonormal transform. However, unlike the FFT, the SMT can be used for fast eigen-signal analysis of general non-stationary signals.
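As an illustration of the greedy procedure sketched in the abstract, the following NumPy snippet builds an SMT covariance estimate one Givens rotation at a time. It is a minimal sketch, assuming the pair-selection criterion is the largest squared sample correlation and that each rotation angle is chosen to decorrelate the selected coordinate pair; the function and variable names are illustrative and are not taken from the authors' implementation.

```python
import numpy as np

def smt_covariance(X, num_rotations):
    """Greedy SMT covariance estimate from an n-by-p data matrix X.

    Returns (E, d) such that Sigma_hat = E @ np.diag(d) @ E.T, where E is
    a product of `num_rotations` Givens rotations (an orthonormal SMT).
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = (Xc.T @ Xc) / n          # sample covariance, working copy
    E = np.eye(p)                # accumulated sparse matrix transform

    for _ in range(num_rotations):
        # Pick the coordinate pair with the largest squared correlation.
        diag = np.diag(S)
        C = S**2 / np.outer(diag, diag)
        np.fill_diagonal(C, -np.inf)
        i, j = np.unravel_index(np.argmax(C), C.shape)

        # Givens angle that zeroes the (i, j) entry of the rotated covariance.
        theta = 0.5 * np.arctan2(-2.0 * S[i, j], S[i, i] - S[j, j])
        c, s = np.cos(theta), np.sin(theta)
        G = np.eye(p)
        G[i, i] = G[j, j] = c
        G[i, j] = s
        G[j, i] = -s

        S = G.T @ S @ G          # decorrelate the chosen pair
        E = E @ G                # append the rotation to the SMT

    return E, np.diag(S).copy()
```

In this sketch, the number of rotations would be chosen by cross-validation on the log-likelihood of held-out samples, as the abstract describes. The resulting estimate E diag(d) Eᵀ also acts as a fast transform: applying E to a signal costs only a few operations per Givens rotation, since each rotation touches exactly two coordinates.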
