Convex relaxations of structured matrix factorizations

We consider the factorization of a rectangular matrix $X$ into a positive linear combination of rank-one factors of the form $u v^\top$, where $u$ and $v$ belong to certain sets $\mathcal{U}$ and $\mathcal{V}$ that may encode specific structures regarding the factors, such as positivity or sparsity. In this paper, we show that computing the optimal decomposition is equivalent to computing a certain gauge function of $X$, and we provide a detailed analysis of these gauge functions and their polars. Since these gauge functions are typically hard to compute, we present semidefinite relaxations and several algorithms that may recover approximate decompositions with approximation guarantees. We illustrate our results with simulations on finding decompositions with elements in $\{0,1\}$. As side contributions, we present a detailed analysis of variational quadratic representations of norms, as well as a new iterative basis pursuit algorithm that can deal with inexact first-order oracles.
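
As a sketch of what such a gauge function could look like, the display below uses the standard atomic-gauge construction over the atoms $u v^\top$. The symbols $\Theta$, $\Theta^\circ$, $\lambda_m$, $r$ and $Y$ are notation assumed here for illustration; they do not appear in the abstract, and the exact definitions used in the paper may differ.

$$
\Theta(X) \;=\; \inf \Big\{ \sum_{m=1}^{r} \lambda_m \;:\; X = \sum_{m=1}^{r} \lambda_m u_m v_m^\top,\ \lambda_m \geq 0,\ u_m \in \mathcal{U},\ v_m \in \mathcal{V},\ r \geq 1 \Big\},
\qquad
\Theta^\circ(Y) \;=\; \sup_{u \in \mathcal{U},\, v \in \mathcal{V}} u^\top Y v .
$$

For instance, if $\mathcal{U}$ and $\mathcal{V}$ are both taken to be unit Euclidean balls, $\Theta$ is the nuclear norm (the sum of singular values) and $\Theta^\circ$ is the spectral norm (the largest singular value). For structured sets such as subsets of $\{0,1\}$-valued vectors, even evaluating the supremum in $\Theta^\circ$ is in general a hard combinatorial problem, which is consistent with the need for relaxations stated above.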
