Paper Information - Invexity Preserving Transformations for Projection Free Optimization with Sparsity Inducing Non-convex Constraints

Forward stagewise and Frank-Wolfe are popular gradient-based, projection-free optimization algorithms, and both require convex constraints. We propose a method that extends the applicability of these algorithms to problems of the form \(\min_x f(x) \quad \text{s.t.} \quad g(x) \le \kappa\), where \(f(x)\) is an invex objective function and \(g(x)\) is a non-convex constraint. Invexity is a generalization of convexity and ensures that all local optima are also global optima. We provide a theorem that defines a class of monotone component-wise transformation functions \(x_i = h(z_i)\) for which the transformed constraint function \(G(z) = g(h(z))\) is convex. If the original objective \(f(x)\) is invex, the same transformation \(x_i = h(z_i)\) yields a transformed objective function \(F(z) = f(h(z))\) that is also invex. For algorithms that rely on a non-zero gradient \(\nabla F\) to produce new update steps, invexity ensures that they keep moving forward as long as a descent direction exists.
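As a rough illustration of the constraint transformation described in the abstract, the sketch below uses an assumed example that is not taken from the paper: the sparsity-inducing constraint \(g(x) = \sum_i |x_i|^q\) with \(q = 0.5\), paired with the monotone component-wise map \(h(z_i) = \operatorname{sign}(z_i)\,|z_i|^{1/q}\). Under this choice \(G(z) = g(h(z)) = \sum_i |z_i|\), which is the convex \(\ell_1\) norm. The names `Q`, `g`, `h`, `G`, and `midpoint_gap` are hypothetical helpers introduced only for this sketch.

```python
import math

Q = 0.5  # assumed exponent of the illustrative l_q constraint, 0 < Q < 1


def g(x):
    """Illustrative non-convex constraint: sum_i |x_i|^Q."""
    return sum(abs(xi) ** Q for xi in x)


def h(z_i):
    """Assumed monotone component-wise map: sign(z_i) * |z_i|^(1/Q)."""
    return math.copysign(abs(z_i) ** (1.0 / Q), z_i)


def G(z):
    """Transformed constraint G(z) = g(h(z)); here it equals sum_i |z_i|."""
    return g([h(zi) for zi in z])


def midpoint_gap(func, a, b):
    """func at the midpoint of a and b, minus the average of func(a), func(b).

    A positive value means convexity is violated for this pair of points.
    """
    mid = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
    return func(mid) - 0.5 * (func(a) + func(b))


a, b = [1.0, 0.0], [0.0, 1.0]
gap_g = midpoint_gap(g, a, b)  # positive: g is not convex
gap_G = midpoint_gap(G, a, b)  # zero: consistent with convexity of G
print(f"gap for g: {gap_g:.4f}, gap for G: {gap_G:.4f}")
```

A single midpoint check only exhibits non-convexity of \(g\); it does not prove convexity of \(G\). In this example convexity follows analytically because \(G\) reduces to the \(\ell_1\) norm. Whether this particular \(h\) belongs to the class defined by the paper's theorem is an assumption of the sketch.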

Volker Roth | Damian Murezzan | Sebastian Mathias Keller
