Orbit Regularization

We propose a general framework for regularization based on group-induced majorization. In this framework, a group is defined to act on the parameter space and an orbit is fixed; to control complexity, the model parameters are confined to the convex hull of this orbit (the orbitope). We recover several well-known regularizers as particular cases, and reveal a connection between the hyperoctahedral group and the recently proposed sorted ℓ1-norm. We derive the properties a group must satisfy to be amenable to optimization with conditional and projected gradient algorithms. Finally, we suggest a continuation strategy for orbit exploration, presenting simulation results for the symmetric and hyperoctahedral groups.
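
To make the orbitope constraint concrete, the following is a minimal sketch of the linear maximization step that a conditional gradient method would need over an orbitope, written for the two groups named above. It is an illustration under stated assumptions, not the paper's implementation: the function names (lmo_symmetric, lmo_hyperoctahedral, sorted_l1) and the seed vector v are hypothetical, and the code relies only on the rearrangement inequality. For the hyperoctahedral group (signed permutations), the maximum value of the inner product coincides with a sorted ℓ1-norm whose weights are the sorted magnitudes of v.

```python
import numpy as np


def lmo_symmetric(g, v):
    """Point of the permutation orbit of v that maximizes <g, x>.

    By the rearrangement inequality, the largest entry of v is placed
    where g is largest, the second largest where g is second largest, etc.
    """
    g = np.asarray(g, dtype=float)
    v_desc = np.sort(np.asarray(v, dtype=float))[::-1]
    order = np.argsort(-g, kind="stable")  # positions of g, largest first
    x = np.empty_like(g)
    x[order] = v_desc
    return x


def lmo_hyperoctahedral(g, v):
    """Point of the signed-permutation orbit of v that maximizes <g, x>.

    Magnitudes of v are matched to magnitudes of g in decreasing order,
    and each entry takes the sign of the corresponding entry of g.
    """
    g = np.asarray(g, dtype=float)
    a_desc = np.sort(np.abs(np.asarray(v, dtype=float)))[::-1]
    order = np.argsort(-np.abs(g), kind="stable")
    x = np.empty_like(g)
    x[order] = a_desc
    signs = np.where(g < 0, -1.0, 1.0)
    return signs * x


def sorted_l1(g, w):
    """Sorted l1-norm of g with weights w (both sorted by decreasing magnitude)."""
    g_desc = np.sort(np.abs(np.asarray(g, dtype=float)))[::-1]
    w_desc = np.sort(np.abs(np.asarray(w, dtype=float)))[::-1]
    return float(g_desc @ w_desc)


if __name__ == "__main__":
    g = np.array([0.5, -2.0, 1.0])  # e.g. a negative gradient direction
    v = np.array([3.0, 1.0, 0.0])   # hypothetical seed vector fixing the orbit
    x = lmo_hyperoctahedral(g, v)
    # The maximal inner product equals the sorted l1-norm of g weighted by |v|.
    print(x, float(g @ x), sorted_l1(g, v))
```

In a conditional gradient loop, g would be the negative gradient at the current iterate, and the returned vertex would be mixed into the iterate by a convex combination, which keeps the iterate inside the orbitope.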
