Edge-Based Stochastic Gradient Algorithm for Distributed Optimization

This paper investigates distributed optimization problems in which a group of networked nodes collaboratively minimizes the sum of the local objective functions. The local objective function of each node is itself the average of a finite set of subfunctions, a structure motivated by machine learning problems in which large training sets are distributed across, and known privately to, individual computational nodes. An augmented Lagrangian (AL) stochastic gradient algorithm is presented for this problem; it combines an edge-based factorization of the weighted Laplacian with a local unbiased stochastic averaging gradient method. At each iteration, each node evaluates the gradient of only one randomly selected subfunction and applies a variance-reduced stochastic averaging gradient technique to approximate the gradient of its local objective function. Strong convexity of the local subfunctions and Lipschitz continuity of their gradients are shown to ensure a linear convergence rate of the proposed algorithm in expectation. Numerical experiments on a logistic regression problem corroborate the theoretical results.
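The abstract does not state the estimator in closed form, so the following Python sketch only illustrates the kind of local variance-reduced stochastic averaging gradient (SAGA-style) update described above. The class name, variable names, and the quadratic subfunctions in the toy usage are illustrative assumptions, not the paper's notation. Each node keeps a table of the most recently evaluated subfunction gradients, evaluates the gradient of a single randomly chosen subfunction per iteration, and forms an unbiased estimate of its local gradient.

import numpy as np

class LocalSAGAEstimator:
    # SAGA-style variance-reduced estimate of grad f_i(x) = (1/q) * sum_j grad f_{i,j}(x).
    def __init__(self, sub_gradients, dim):
        # sub_gradients: list of callables, each returning the gradient of one subfunction
        self.sub_gradients = sub_gradients
        self.q = len(sub_gradients)
        self.table = np.zeros((self.q, dim))   # last evaluated gradient of each subfunction
        self.table_mean = np.zeros(dim)        # running average of the table rows

    def estimate(self, x, rng):
        # Evaluate the gradient of a single, uniformly selected subfunction.
        j = rng.integers(self.q)
        g_new = self.sub_gradients[j](x)
        g_old = self.table[j].copy()
        # Unbiased estimate: E[g_hat] = grad f_i(x) when j is uniform over {0, ..., q-1}.
        g_hat = g_new - g_old + self.table_mean
        # Refresh the stored gradient and the table average.
        self.table_mean += (g_new - g_old) / self.q
        self.table[j] = g_new
        return g_hat

# Toy usage with quadratic subfunctions f_{i,j}(x) = 0.5 * ||A_j x||^2 (illustrative only).
rng = np.random.default_rng(0)
dim, q = 3, 5
mats = [rng.standard_normal((dim, dim)) for _ in range(q)]
sub_gradients = [(lambda x, A=A: A.T @ A @ x) for A in mats]
estimator = LocalSAGAEstimator(sub_gradients, dim)
g_hat = estimator.estimate(np.ones(dim), rng)

In the algorithm described above, such a local estimate would replace the exact local gradient inside each node's primal update, while the AL/edge-based structure handles the consensus constraints across neighbors.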
