Decentralized Online Convex Programming with Local Information

This paper describes a novel approach to decentralized online optimization in a large network of agents. At each stage, the agents face a new objective function that reflects the effects of a changing environment, and each agent can share information about past decisions and cost functions only with its neighbors. These operating conditions arise in many practical applications, but they raise challenging questions about how distributed strategies perform relative to impractical centralized approaches. The proposed algorithm yields small regret (i.e., the difference between the total cost incurred using causally available information and the total cost that would have been incurred in hindsight had all the relevant information been available at once) and is robust to evolving network topologies. It combines a subgradient-based sequential convex optimization scheme with decentralized decision-making via approximate dynamic programming.
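To make the notion of regret concrete, the following is a minimal single-agent sketch of projected online subgradient descent, not the decentralized algorithm proposed in the paper. It assumes scalar decisions in the interval [-1, 1], absolute-value costs f_t(x) = |x - y_t|, and a step size proportional to 1/sqrt(t); the function names, the cost family, and the step-size rule are all illustrative assumptions.

```python
import math
import random


def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def online_subgradient(targets, lo=-1.0, hi=1.0):
    """Commit to x_t, then observe f_t(x) = |x - y_t| and take a projected step.

    The decision at stage t depends only on costs revealed at earlier stages,
    i.e. on causally available information.
    """
    x = 0.0
    decisions = []
    for t, y in enumerate(targets, start=1):
        decisions.append(x)              # decision made before f_t is revealed
        g = (x > y) - (x < y)            # a subgradient of |x - y| at x
        eta = (hi - lo) / math.sqrt(t)   # assumed step-size schedule
        x = project(x - eta * g, lo, hi)
    return decisions


def regret(decisions, targets):
    """Online cost minus the cost of the best fixed decision in hindsight."""
    online_cost = sum(abs(x - y) for x, y in zip(decisions, targets))
    # A median of the targets minimizes the sum of absolute deviations.
    best_fixed = sorted(targets)[len(targets) // 2]
    hindsight_cost = sum(abs(best_fixed - y) for y in targets)
    return online_cost - hindsight_cost


if __name__ == "__main__":
    rng = random.Random(0)
    ys = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    xs = online_subgradient(ys)
    print("regret:", regret(xs, ys), "sqrt(T):", math.sqrt(len(ys)))
```

Under these assumptions the standard analysis of online subgradient descent gives regret growing on the order of sqrt(T), so the average regret per stage vanishes as T grows. The decentralized setting of the paper additionally restricts each agent to information from its neighbors, which this sketch does not model.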
