Convergence of the Iterates in Mirror Descent Methods

We consider centralized and distributed mirror descent (MD) algorithms over a finite-dimensional Hilbert space and prove that the problem variables converge to an optimizer of a possibly nonsmooth objective when the step sizes are square summable but not summable. Prior literature has focused on convergence of the function value to its optimum; however, applications in distributed optimization and learning in games require convergence of the variables themselves to an optimizer, which is generally not guaranteed without assuming strong convexity of the objective. We provide numerical simulations comparing entropic MD and standard subgradient methods on a robust regression problem.
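The setting above can be illustrated with a minimal sketch of entropic mirror descent (exponentiated gradient) on a robust regression problem over the probability simplex. The problem instance, dimensions, and step-size constant below are assumptions for illustration, not the paper's exact experiment; the step sizes eta_k = c/k are square summable but not summable, as the convergence result requires.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical robust regression instance (illustrative, not the paper's setup):
# minimize f(x) = ||A x - b||_1 over the probability simplex.
m, n = 50, 10
A = rng.standard_normal((m, n))
x_true = rng.dirichlet(np.ones(n))      # ground-truth point on the simplex
b = A @ x_true + 0.01 * rng.standard_normal(m)

def subgrad(x):
    # A subgradient of the nonsmooth objective x -> ||A x - b||_1.
    return A.T @ np.sign(A @ x - b)

# Entropic mirror descent: multiplicative update followed by normalization,
# which is the MD step induced by the negative-entropy mirror map.
x = np.ones(n) / n                      # start at the simplex barycenter
for k in range(1, 5001):
    eta = 0.5 / k                       # square summable, not summable
    w = x * np.exp(-eta * subgrad(x))
    x = w / w.sum()                     # Bregman projection onto the simplex

print("final objective:", np.linalg.norm(A @ x - b, 1))
```

The multiplicative update keeps the iterates strictly inside the simplex without an explicit Euclidean projection, which is the practical appeal of the entropic mirror map over a standard projected subgradient step.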
