Accelerated Stochastic ADMM with Variance Reduction

The Alternating Direction Method of Multipliers (ADMM) is a popular method for solving machine learning problems. Stochastic ADMM was first proposed to reduce the per-iteration computational cost, making it better suited to big-data problems. Recently, variance reduction techniques have been integrated with stochastic ADMM to obtain faster convergence rates, as in SAG-ADMM and SVRG-ADMM, but the resulting convergence is still suboptimal with respect to the smoothness constant. In this paper, we propose a new accelerated stochastic ADMM algorithm with variance reduction, which converges faster than all other stochastic ADMM algorithms. We theoretically analyze its convergence rate and show that its dependence on the smoothness constant is optimal. We also empirically validate its effectiveness and demonstrate its superiority over other stochastic ADMM algorithms.
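To make the setting concrete, the sketch below shows the kind of variance-reduced stochastic ADMM loop that this line of work (SVRG-ADMM in particular) builds on, applied to a toy generalized-lasso problem min_x (1/n) sum_i f_i(x) + lam*||z||_1 subject to A x = z, with f_i(x) = 0.5*(d_i^T x - b_i)^2. This is a minimal illustration under assumed names and step sizes (svrg_admm, eta, rho are all hypothetical), not the paper's accelerated algorithm; the acceleration studied in the paper would modify this inner loop and is not shown here.

    import numpy as np

    def soft_threshold(v, t):
        """Proximal operator of t*||.||_1 (the z-update in ADMM)."""
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def svrg_admm(D, b, A, lam=0.1, rho=1.0, eta=0.01,
                  epochs=20, inner=None, seed=0):
        # Illustrative SVRG-style stochastic ADMM; names and defaults are assumptions.
        rng = np.random.default_rng(seed)
        n, p = D.shape
        m = A.shape[0]
        inner = inner or n
        x = np.zeros(p)   # primal variable
        z = np.zeros(m)   # auxiliary variable (A x = z)
        u = np.zeros(m)   # scaled dual variable
        for _ in range(epochs):
            x_tilde = x.copy()
            # Full gradient at the snapshot, computed once per epoch.
            mu = D.T @ (D @ x_tilde - b) / n
            for _ in range(inner):
                i = rng.integers(n)
                # Variance-reduced stochastic gradient of the smooth part at x.
                g = D[i] * (D[i] @ x - b[i]) - D[i] * (D[i] @ x_tilde - b[i]) + mu
                # Linearized (inexact) x-update of stochastic ADMM.
                x = x - eta * (g + rho * A.T @ (A @ x - z + u))
                # Exact z-update: prox of the l1 term.
                z = soft_threshold(A @ x + u, lam / rho)
                # Dual ascent on the scaled multiplier.
                u = u + A @ x - z
        return x

    # Toy usage: lasso (A = identity) on random data.
    rng = np.random.default_rng(1)
    D = rng.standard_normal((200, 50))
    x_true = np.zeros(50); x_true[:5] = 1.0
    b = D @ x_true + 0.01 * rng.standard_normal(200)
    x_hat = svrg_admm(D, b, np.eye(50))
    print("recovered support:", np.nonzero(np.abs(x_hat) > 0.1)[0])

The x-update is linearized so that each inner step costs one stochastic gradient plus a matrix-vector product; the snapshot gradient mu is what reduces the variance of g and enables the faster rates discussed above.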
