Sparse Learning for Stochastic Composite Optimization

In this paper, we focus on Stochastic Composite Optimization (SCO) for sparse learning, which aims to learn a sparse solution. Although many SCO algorithms have been developed for sparse learning with an optimal convergence rate of $O(1/T)$, they often fail to deliver truly sparse solutions in the end, either because the sparsity regularization applied during stochastic optimization is limited or because of the limitations of the online-to-batch conversion. To improve the sparsity of the solutions obtained by SCO, we propose a simple but effective stochastic optimization scheme that adds a novel sparse online-to-batch conversion to traditional SCO algorithms. Our theoretical analysis shows that the proposed scheme finds solutions with better sparsity patterns without affecting the convergence rate. Experimental results on both synthetic and real-world data sets show that the proposed methods are more effective in recovering sparse solutions and achieve a convergence rate comparable to that of state-of-the-art SCO algorithms for sparse learning.
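
To make the setting concrete, the sketch below illustrates the general idea of a sparse online-to-batch conversion on an $\ell_1$-regularized SCO problem. It is not the paper's algorithm: it assumes a plain stochastic proximal-gradient (forward-backward splitting) solver, a uniform running average of the iterates, and a final soft-thresholding of the averaged iterate as a stand-in for the sparse conversion step; the functions `soft_threshold`, `sco_l1_sparse_o2b`, and `make_oracle`, as well as the final threshold level, are illustrative choices.

```python
import numpy as np

def soft_threshold(v, tau):
    """Component-wise soft-thresholding: the prox operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def sco_l1_sparse_o2b(grad_oracle, dim, lam, T, eta0, seed=0):
    """Sketch of l1-regularized stochastic composite optimization with a
    sparsified online-to-batch conversion (illustrative, not the paper's scheme).

    grad_oracle(w, rng) returns an unbiased stochastic gradient of the smooth loss at w.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    w_avg = np.zeros(dim)
    for t in range(1, T + 1):
        eta = eta0 / np.sqrt(t)                      # standard O(1/sqrt(t)) step size
        g = grad_oracle(w, rng)                      # stochastic gradient of the smooth part
        w = soft_threshold(w - eta * g, eta * lam)   # forward-backward (prox) step
        w_avg += (w - w_avg) / t                     # running average (classical conversion)
    # Plain averaging usually destroys sparsity; re-apply the l1 prox to the
    # averaged iterate so the returned batch solution is sparse as well.
    # The threshold level below is an assumed, illustrative choice.
    return soft_threshold(w_avg, lam * eta0 / np.sqrt(T))

# Toy usage: sparse least squares, one random row per stochastic gradient.
def make_oracle(A, b):
    def oracle(w, rng):
        i = rng.integers(A.shape[0])
        return A[i] * (A[i] @ w - b[i])
    return oracle

rng = np.random.default_rng(1)
A = rng.normal(size=(500, 50))
x_true = np.zeros(50)
x_true[:5] = 1.0
b = A @ x_true
w_hat = sco_l1_sparse_o2b(make_oracle(A, b), dim=50, lam=0.1, T=5000, eta0=0.5)
print("nonzeros:", np.count_nonzero(w_hat))
```

The key design point the sketch conveys is that the sparsification is applied once, at the conversion step, so the per-iteration updates and hence the convergence analysis of the underlying SCO method are left untouched.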
