Analysis of Multi-stage Convex Relaxation for Sparse Regularization

We consider learning formulations with non-convex objective functions, which often occur in practical applications. There are two common approaches to such problems: (1) heuristic methods, such as gradient descent, which find only a local minimum and come with no theoretical guarantee that this local minimum is a good solution; and (2) convex relaxation, such as L1-regularization, which provably solves the problem under some conditions but often leads to a sub-optimal solution in practice. This paper tries to close this gap between theory and practice. In particular, we present a multi-stage convex relaxation scheme for solving problems with non-convex objective functions. For learning formulations with sparse regularization, we analyze the behavior of a specific multi-stage relaxation scheme. Under appropriate conditions, we show that the local solution obtained by this procedure is superior to the global solution of the standard L1 convex relaxation for learning sparse targets.
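As a rough illustration of what a multi-stage convex relaxation for sparse regularization can look like, the sketch below repeatedly solves a weighted L1-regularized least squares problem and then updates the per-coordinate weights from the current solution. The abstract does not specify the scheme, so everything here is an assumption made for illustration: the squared loss, the capped-L1 style reweighting rule (drop the penalty on coefficients whose magnitude exceeds a threshold `theta`), the ISTA inner solver, and the hypothetical names `weighted_lasso_ista`, `multi_stage_relaxation`, `lam0`, and `theta`.

```python
import numpy as np


def weighted_lasso_ista(X, y, lam, n_iter=500):
    """Approximately minimize (1/(2n))*||Xw - y||^2 + sum_j lam[j]*|w_j|.

    Uses plain proximal gradient descent (ISTA); `lam` is a vector of
    non-negative per-coordinate penalties.
    """
    n, d = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Coordinate-wise soft-thresholding with threshold step * lam[j].
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w


def multi_stage_relaxation(X, y, lam0, theta, n_stages=3):
    """Illustrative multi-stage scheme (assumed capped-L1 reweighting).

    Stage 1 is the standard L1 relaxation with a uniform penalty lam0.
    Each later stage removes the penalty from coordinates whose current
    magnitude exceeds theta and re-solves the convex problem.
    """
    d = X.shape[1]
    lam = np.full(d, float(lam0))
    w = np.zeros(d)
    for _ in range(n_stages):
        w = weighted_lasso_ista(X, y, lam)
        # Assumed reweighting rule: no penalty on coefficients already large.
        lam = np.where(np.abs(w) > theta, 0.0, float(lam0))
    return w
```

With `n_stages=1` the procedure reduces to ordinary L1-regularized least squares, which makes the later stages easy to compare against the standard convex relaxation on the same data.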
