A Bregman-proximal point algorithm for robust non-negative matrix factorization with possible missing values and outliers - application to gene expression analysis

Background: Non-negative matrix factorization (NMF) has become an essential tool for feature extraction in a wide spectrum of applications. In the present work, our objective is to extend the applicability of the method to data that are missing and/or corrupted by outliers.

Results: An essential property for missing data imputation and outlier detection is that the uncorrupted data matrix is low rank, i.e. has only a small number of degrees of freedom. We devise a new version of the Bregman proximal idea that preserves nonnegativity and combine it with the Augmented Lagrangian approach to simultaneously reconstruct the features of interest and detect the outliers using a sparsity-promoting ℓ1 penalty.

Conclusions: Finally, we present an application to the analysis of gene expression data from patients with bladder cancer.
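To make the problem setting concrete, the following is a minimal sketch of robust NMF with missing entries. It is not the Bregman-proximal/Augmented Lagrangian algorithm of the paper, whose details are not given here. As a stand-in, it alternates soft-thresholding of the residual (the proximal step for an ℓ1 penalty on an outlier matrix `S`) with standard masked multiplicative NMF updates. The function name `robust_masked_nmf`, the parameters `lam`, `n_iter` and `eps`, and the clipping heuristic are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    # Proximal operator of tau * ||.||_1, applied entrywise.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def robust_masked_nmf(X, mask, rank, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    """Illustrative sketch: X ~ W @ H + S on observed entries, with W, H >= 0
    and S sparse. `mask` is True where X is observed. Hypothetical helper,
    not the method of the paper."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    Xz = np.where(mask, X, 0.0)  # zero out missing entries (may be NaN in X)
    S = np.zeros((m, n))
    for _ in range(n_iter):
        # Outlier step: soft-threshold the residual on observed entries only.
        S = soft_threshold(mask * (Xz - W @ H), lam)
        # Target for the NMF step; clipped at zero (a heuristic) so the
        # multiplicative updates keep W and H nonnegative.
        T = mask * np.maximum(Xz - S, 0.0)
        # Masked multiplicative updates for the Frobenius loss.
        W *= (T @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ T) / (W.T @ (mask * (W @ H)) + eps)
    return W, H, S
```

Under these assumptions, missing entries can be imputed from `W @ H`, and the nonzero entries of `S` flag candidate outliers. The threshold `lam` trades off how much of the residual is attributed to outliers rather than to the low-rank nonnegative part.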
