Distributionally Robust and Multi-Objective Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for analyzing nonnegative data. A key aspect of NMF is the choice of the objective function, which depends on the noise model (that is, the statistics of the noise) assumed for the data. In many applications, the noise model is unknown and difficult to estimate. In this paper, we define a multi-objective NMF (MO-NMF) problem, in which several objectives are combined within the same NMF model. We propose to use Lagrange duality to optimize a set of weights for the weighted-sum approach, that is, we minimize a single objective function that is a weighted sum of all the objective functions. We design a simple algorithm based on multiplicative updates to minimize this weighted sum. We show how this can be used to find distributionally robust NMF (DR-NMF) solutions, that is, solutions that minimize the largest error among all objectives. We illustrate the effectiveness of this approach on synthetic, document, and audio datasets. The results show that DR-NMF is robust to a lack of knowledge of the noise model underlying the NMF problem.
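The weighted-sum idea described above can be illustrated with a minimal sketch. It assumes only two objectives, the squared Frobenius error and the generalized Kullback-Leibler divergence, and it assumes fixed weights `lam`. The combined multiplicative update used here (weighted numerators divided by weighted denominators of the two standard updates) is a common heuristic chosen for illustration; it is not claimed to be the exact update of the paper, and no monotonicity guarantee is claimed for it. The function name `weighted_sum_nmf` and all parameters are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def frobenius_error(X, WH):
    """Squared Frobenius error (beta-divergence with beta = 2)."""
    return 0.5 * np.sum((X - WH) ** 2)


def kl_error(X, WH):
    """Generalized Kullback-Leibler divergence (beta = 1)."""
    return np.sum(X * np.log((X + EPS) / (WH + EPS)) - X + WH)


def weighted_sum_nmf(X, rank, lam, n_iter=200, seed=0):
    """Heuristic multiplicative updates for
    lam[0] * Frobenius(X, WH) + lam[1] * KL(X, WH), with W, H >= 0.

    Assumption: the weights lam are fixed and given; the paper instead
    optimizes them, which this sketch does not attempt.
    """
    lam_fro, lam_kl = lam
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank)) + 0.1
    H = rng.random((rank, n)) + 0.1
    history = []
    for _ in range(n_iter):
        # Update H: weighted negative gradient parts over weighted positive parts.
        WH = W @ H
        num = lam_fro * (W.T @ X) + lam_kl * (W.T @ (X / (WH + EPS)))
        den = lam_fro * (W.T @ WH) + lam_kl * W.sum(axis=0)[:, None]
        H *= num / (den + EPS)
        # Update W symmetrically.
        WH = W @ H
        num = lam_fro * (X @ H.T) + lam_kl * ((X / (WH + EPS)) @ H.T)
        den = lam_fro * (WH @ H.T) + lam_kl * H.sum(axis=1)[None, :]
        W *= num / (den + EPS)
        WH = W @ H
        history.append(lam_fro * frobenius_error(X, WH) + lam_kl * kl_error(X, WH))
    return W, H, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((20, 4)) @ rng.random((4, 15))  # synthetic nonnegative data
    W, H, history = weighted_sum_nmf(X, rank=4, lam=(0.5, 0.5))
    print("Frobenius error:", frobenius_error(X, W @ H))
    print("KL error:", kl_error(X, W @ H))
```

With fixed weights, this only traces one point of the trade-off between the two objectives. A DR-NMF solution in the sense of the abstract would additionally require choosing the weights so that the largest of the objectives is minimized, which this sketch leaves out.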
