Bayesian Robust PCA of Incomplete Data

We present a probabilistic model for robust factor analysis and principal component analysis in which the observation noise is modeled by Student-t distributions to reduce the negative effect of outliers. The Student-t distributions are modeled independently for each data dimension, in contrast to previous work that uses multivariate Student-t distributions. We compare methods using the proposed noise distribution, the multivariate Student-t distribution, and the Laplace distribution. Because the posterior probability density is intractable to evaluate, we use variational Bayesian approximation methods. We demonstrate that the assumed noise model can yield accurate reconstructions because the corrupted elements of a poor-quality sample can be reconstructed from the other elements of the same data vector. Experiments on an artificial dataset and a weather dataset show that the independence across dimensions and the flexibility of the proposed Student-t noise model can make it superior in some applications.
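The difference between the two Student-t noise models can be illustrated with a small sampling sketch. It is not the inference procedure of the paper; it only draws noise using the standard Gaussian scale-mixture representation of the Student-t distribution, with either one latent scale per element (independent per dimension) or one latent scale per data vector (multivariate). The function names, the degrees of freedom `nu`, and the outlier threshold are assumptions chosen for illustration.

```python
import numpy as np


def sample_student_t_noise(n_samples, n_dims, nu, independent, rng):
    """Draw Student-t noise as a Gaussian scale mixture (illustrative helper).

    independent=True : one latent scale u per element (per-dimension Student-t).
    independent=False: one latent scale u per sample, shared by all
                       dimensions (multivariate Student-t, identity scale).
    Returns the noise array and the latent scales.
    """
    shape = (n_samples, n_dims) if independent else (n_samples, 1)
    # u ~ Gamma(shape=nu/2, rate=nu/2); NumPy takes scale = 1/rate.
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=shape)
    z = rng.standard_normal((n_samples, n_dims))
    return z / np.sqrt(u), u


def outliers_per_affected_row(noise, threshold=5.0):
    """Mean number of outlying elements among rows with at least one outlier."""
    counts = (np.abs(noise) > threshold).sum(axis=1)
    affected = counts > 0
    return float(counts[affected].mean()) if affected.any() else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    indep, _ = sample_student_t_noise(20000, 10, nu=3.0, independent=True, rng=rng)
    multi, _ = sample_student_t_noise(20000, 10, nu=3.0, independent=False, rng=rng)
    print("independent :", outliers_per_affected_row(indep))
    print("multivariate:", outliers_per_affected_row(multi))
```

Under the per-dimension model an outlier tends to corrupt a single element of a data vector, leaving the remaining elements usable for reconstruction, whereas under the multivariate model a small shared scale inflates the noise in every dimension of that vector at once.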
