Bayesian adaptive matrix factorization with automatic model selection

Low-rank matrix factorization has long been recognized as a fundamental problem in many computer vision applications. Nevertheless, the reliability of existing matrix factorization methods is often hard to guarantee because of two model selection issues: choosing the noise model and determining the model capacity. We address both issues simultaneously in this paper by proposing a robust non-parametric Bayesian adaptive matrix factorization (AMF) model. AMF uses a new noise model built on the Dirichlet process Gaussian mixture model (DP-GMM), taking advantage of its flexibility in selecting the number of components and its capability of fitting a wide range of unknown noise. AMF also imposes an automatic relevance determination (ARD) prior on the low-rank factor matrices so that the rank can be determined automatically without enforcing any hard constraint. An efficient variational method is then devised for model inference. We compare AMF with state-of-the-art matrix factorization methods on data sets ranging from synthetic data to real-world application data. Across these experiments, AMF consistently achieves better or comparable performance.
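
To make the DP-GMM noise idea concrete, the following is a minimal sketch of a truncated stick-breaking construction of Dirichlet process mixture weights, followed by sampling noise from the resulting Gaussian mixture. It is an illustration only, not the AMF model or its variational inference: the function names, the truncation level, the concentration parameter `alpha`, and the component means and scales are all hypothetical choices made for the example.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process (illustrative)."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component takes the rest, so weights sum to 1
    return weights


def sample_mixture_noise(weights, mus, sigmas, n, rng):
    """Draw n noise values from a Gaussian mixture with the given weights."""
    components = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(mus[k], sigmas[k]) for k in components]


rng = random.Random(0)
weights = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)
mus = [0.0] * len(weights)                             # assumed zero-mean components
sigmas = [0.1 * (k + 1) for k in range(len(weights))]  # hypothetical scales
noise = sample_mixture_noise(weights, mus, sigmas, n=5, rng=rng)
```

With a small `alpha`, most of the mass tends to fall on the first few components, which is how a Dirichlet process prior can favor a small effective number of noise components without fixing that number in advance.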
