The Convex Geometry of Linear Inverse Problems

In applications throughout science and engineering, one is often faced with the challenge of solving an ill-posed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However, in many practical situations of interest, models are constrained structurally so that they have only a few degrees of freedom relative to their ambient dimension. This paper provides a general framework for converting notions of simplicity into convex penalty functions, resulting in convex optimization solutions to linear, underdetermined inverse problems. The class of simple models considered includes those formed as the sum of a few atoms from some (possibly infinite) elementary atomic set; examples include well-studied cases from many technical fields such as sparse vectors (signal processing, statistics) and low-rank matrices (control, statistics), as well as several others, including sums of a few permutation matrices (ranked elections, multiobject tracking), low-rank tensors (computer vision, neuroscience), orthogonal matrices (machine learning), and atomic measures (system identification). The convex programming formulation is based on minimizing the norm induced by the convex hull of the atomic set; this norm is referred to as the atomic norm. The facial structure of the atomic norm ball carries a number of favorable properties that are useful for recovering simple models, and an analysis of the underlying convex geometry provides sharp estimates of the number of generic measurements required for exact and robust recovery of models from partial information. These estimates are based on computing the Gaussian widths of tangent cones to the atomic norm ball. When the atomic set has algebraic structure, the resulting optimization problems can be solved or approximated via semidefinite programming. The quality of these approximations affects the number of measurements required for recovery, and this tradeoff is characterized via several examples. Thus this work extends the catalog of simple models (beyond sparse vectors and low-rank matrices) that can be recovered from limited linear information via tractable convex programming.
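To make the recovery program concrete, here is a minimal sketch of the central objects in standard notation; the symbols $x^\star$, $\Phi$, and $y$ (the true model, the linear measurement map, and the observed measurements) are notational choices made here for illustration. Given an atomic set $\mathcal{A}$, the atomic norm is the gauge function of its convex hull,

$$\|x\|_{\mathcal{A}} \;=\; \inf\{\, t > 0 \;:\; x \in t \cdot \mathrm{conv}(\mathcal{A}) \,\},$$

and recovery from measurements $y = \Phi x^\star$ is attempted via the convex program

$$\hat{x} \;=\; \operatorname*{arg\,min}_{x} \; \|x\|_{\mathcal{A}} \quad \text{subject to} \quad \Phi x = y.$$

Taking $\mathcal{A}$ to be the signed standard basis vectors recovers $\ell_1$ minimization for sparse vectors, while taking $\mathcal{A}$ to be the rank-one matrices of unit Euclidean norm recovers nuclear norm minimization for low-rank matrices. The measurement estimates mentioned above are governed by the Gaussian width $w(S) = \mathbb{E}\big[\sup_{z \in S} \langle g, z \rangle\big]$ with $g \sim \mathcal{N}(0, I)$, evaluated on the intersection of the tangent cone at $x^\star$ with the unit sphere.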
