Recovery of Sparse Probability Measures via Convex Programming

We consider the problem of cardinality penalized optimization of a convex function over the probability simplex with additional convex constraints. The classical l1 regularizer fails to promote sparsity on the probability simplex since l1 norm on the probability simplex is trivially constant. We propose a direct relaxation of the minimum cardinality problem and show that it can be efficiently solved using convex programming. As a first application we consider recovering a sparse probability measure given moment constraints, in which our formulation becomes linear programming, hence can be solved very efficiently. A sufficient condition for exact recovery of the minimum cardinality solution is derived for arbitrary affine constraints. We then develop a penalized version for the noisy setting which can be solved using second order cone programs. The proposed method outperforms known rescaling heuristics based on l1 norm. As a second application we consider convex clustering using a sparse Gaussian mixture and compare our results with the well known soft k-means algorithm.

[1]  Arie Yeredor,et al.  On the use of sparsity for recovering discrete probability distributions from their moments , 2011, 2011 IEEE Statistical Signal Processing Workshop (SSP).

[2]  Emmanuel J. Candès,et al.  Decoding by linear programming , 2005, IEEE Transactions on Information Theory.

[3]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[4]  A. Willsky,et al.  The Convex algebraic geometry of linear inverse problems , 2010, 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[5]  Michael Elad,et al.  From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images , 2009, SIAM Rev..

[6]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[7]  Manfred K. Warmuth,et al.  Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..

[8]  Polina Golland,et al.  Convex Clustering with Exemplar-Based Models , 2007, NIPS.