Stochastic Dual Coordinate Ascent with Alternating Direction Multiplier Method

We propose a new stochastic dual coordinate ascent technique that can be applied to a wide range of regularized learning problems. Our method is based on the Alternating Direction Multiplier Method (ADMM), which lets it handle complex regularization functions such as structured regularizations. Although the original ADMM is a batch method, the proposed method offers a stochastic update rule in which each iteration requires only one or a few sample observations. Moreover, our method naturally supports mini-batch updates, which speed up convergence. We show that, under mild assumptions, our method converges exponentially fast. Numerical experiments show that our method performs efficiently in practice.
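For orientation, the sketch below shows plain stochastic dual coordinate ascent on ridge regression, the kind of single-sample dual update that the abstract builds on. It is not the proposed ADMM-based method: it assumes a squared loss and a simple L2 regularizer, for which the coordinate-wise dual maximizer has a closed form, and it has no ADMM splitting step. The names `sdca_ridge`, `primal_gradient`, and the synthetic data are hypothetical and chosen only for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sdca_ridge(X, y, lam, epochs=200, seed=0):
    """Plain SDCA for min_w (1/n) sum_i 0.5*(x_i.w - y_i)^2 + (lam/2)*||w||^2.

    Illustrative only: squared loss and L2 regularization are assumptions
    made here so that each dual coordinate update has a closed form.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n          # one dual variable per sample
    w = [0.0] * d              # primal iterate, kept equal to (1/(lam*n)) * sum_i alpha_i * x_i
    sq_norms = [dot(x, x) for x in X]
    for _ in range(epochs):
        for _ in range(n):
            i = rng.randrange(n)  # each iteration touches a single sample
            # Exact maximizer of the dual objective along coordinate i.
            delta = (y[i] - dot(X[i], w) - alpha[i]) / (1.0 + sq_norms[i] / (lam * n))
            alpha[i] += delta
            scale = delta / (lam * n)
            for j in range(d):
                w[j] += scale * X[i][j]
    return w, alpha


def primal_gradient(X, y, lam, w):
    """Gradient of the primal ridge objective at w (zero at the optimum)."""
    n, d = len(X), len(w)
    g = [lam * wj for wj in w]
    for xi, yi in zip(X, y):
        r = (dot(xi, w) - yi) / n
        for j in range(d):
            g[j] += r * xi[j]
    return g


# Small synthetic problem (hypothetical data).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(50)]
w_true = [1.0, -2.0, 0.5, 0.0, 3.0]
y = [dot(x, w_true) + 0.1 * data_rng.gauss(0, 1) for x in X]

w, alpha = sdca_ridge(X, y, lam=0.1)
grad_norm = max(abs(g) for g in primal_gradient(X, y, 0.1, w))
```

Here `grad_norm` measures how close `w` is to the ridge optimum. A structured regularizer would not admit this simple closed-form primal mapping, which is the situation the ADMM-based formulation is meant to address.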