Truncated Cauchy Non-Negative Matrix Factorization

Non-negative matrix factorization (NMF) minimizes the Euclidean distance between the data matrix and its low-rank approximation, and it fails when applied to corrupted data because this loss function is sensitive to outliers. In this paper, we propose a Truncated CauchyNMF loss that handles outliers by truncating large errors, and develop Truncated CauchyNMF to robustly learn the subspace on noisy datasets contaminated by outliers. We theoretically analyze the robustness of Truncated CauchyNMF in comparison with competing models and prove that Truncated CauchyNMF has a generalization bound which converges at a rate of order $O(\sqrt{\ln n / n})$, where $n$ is the sample size. We evaluate Truncated CauchyNMF by image clustering on both simulated and real datasets. The experimental results on datasets containing gross corruptions validate the effectiveness and robustness of Truncated CauchyNMF for learning robust subspaces.
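To illustrate why truncating large errors limits the influence of outliers, the following minimal sketch compares the squared Euclidean loss with one plausible truncated Cauchy-type loss on a single corrupted entry. The abstract does not state the loss formula, so the form used here, log(1 + min((r/gamma)^2, eps)), and the names `gamma`, `eps`, `squared_loss`, and `truncated_cauchy_loss` are assumptions for illustration only, not the authors' definition.

```python
import numpy as np


def squared_loss(V, W, H):
    """Squared Euclidean (Frobenius) reconstruction error used by standard NMF."""
    R = V - W @ H
    return float(np.sum(R ** 2))


def truncated_cauchy_loss(V, W, H, gamma=1.0, eps=9.0):
    """Assumed truncated Cauchy-type loss (illustrative, not from the source).

    Each scaled squared residual is capped at `eps` before the Cauchy-style
    log(1 + x) is applied, so any single entry contributes at most log(1 + eps).
    """
    R = V - W @ H
    scaled = (R / gamma) ** 2
    return float(np.sum(np.log1p(np.minimum(scaled, eps))))


# Build an exactly factorizable non-negative matrix, then corrupt one entry.
rng = np.random.default_rng(0)
W = rng.random((20, 3))
H = rng.random((3, 30))
V_clean = W @ H
V_corrupt = V_clean.copy()
V_corrupt[0, 0] += 100.0  # a single gross outlier

sq_clean = squared_loss(V_clean, W, H)
sq_corrupt = squared_loss(V_corrupt, W, H)
tc_clean = truncated_cauchy_loss(V_clean, W, H)
tc_corrupt = truncated_cauchy_loss(V_corrupt, W, H)

print("squared loss:          clean =", sq_clean, " corrupted =", sq_corrupt)
print("truncated Cauchy loss: clean =", tc_clean, " corrupted =", tc_corrupt)
```

Under these assumptions, the outlier adds roughly 100^2 to the squared loss but no more than log(1 + eps) to the truncated loss, which is the bounded-influence behaviour the abstract attributes to truncation.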
