Beyond consensus and synchrony in decentralized online optimization using saddle point method

We consider online learning problems in multiagent systems composed of distinct subsets of agents operating without a common time-scale. Each agent in the network is charged with minimizing the global regret, which is the sum of the instantaneous sub-optimality of each agent's actions with respect to a fixed global clairvoyant actor with access to all costs across the network for all time up to a time horizon T. Since agents are not assumed to be of the same type, the hypothesis that all agents seek a common action is violated; we therefore introduce a notion of network discrepancy as a measure of how well agents coordinate their behavior while retaining distinct local behavior. Moreover, agents are not assumed to receive the sequentially arriving costs on a common time index, and thus must learn asynchronously. A variant of the Arrow-Hurwicz saddle point algorithm is proposed to control the growth of global regret and network discrepancy. This algorithm uses Lagrange multipliers to penalize the discrepancies between agents and leads to an implementation that relies on local operations and exchange of variables between neighbors. Decisions made with this method lead to regret of order O(√T) and network discrepancy of order O(T^{3/4}). Empirical evaluation is conducted on an asynchronously operating sensor network estimating a spatially correlated random field.
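
The primal-dual structure described above can be illustrated with a minimal sketch. It is not the paper's algorithm or experiment: the quadratic local losses, the squared-difference proximity constraint with tolerance `gamma`, the random wake-up model of asynchrony, and every name and parameter value (`saddle_point_run`, `wake_prob`, `radius`, and so on) are assumptions chosen for illustration. Each active agent takes a gradient descent step on a Lagrangian built from its local loss plus multiplier-weighted discrepancies with its neighbors, using only the values those neighbors last shared, while each edge multiplier takes a projected ascent step on the constraint slack.

```python
import random


def saddle_point_run(targets, edges, gamma=0.25, step=0.05, horizon=500,
                     wake_prob=0.7, radius=10.0, seed=0):
    """Toy asynchronous saddle point iteration (illustrative assumptions only).

    Agent i holds a scalar decision x[i] and sees the hypothetical loss
    0.5 * (x[i] - obs)**2, where obs is a noisy reading of targets[i].
    Each edge (a, b) carries the assumed proximity constraint
    (x[a] - x[b])**2 - gamma <= 0 with multiplier lam[(a, b)].
    """
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n
    shared = list(x)                # last value each agent sent to its neighbors
    lam = {edge: 0.0 for edge in edges}

    for _ in range(horizon):
        # Asynchrony model (assumption): each agent wakes independently.
        active = {i for i in range(n) if rng.random() < wake_prob}

        # Primal descent for active agents, using current multipliers and
        # the possibly stale values shared by neighbors.
        new_x = list(x)
        for i in active:
            obs = targets[i] + rng.gauss(0.0, 0.1)
            grad = x[i] - obs       # gradient of the local quadratic loss
            for (a, b), mult in lam.items():
                if i == a:
                    grad += 2.0 * mult * (x[i] - shared[b])
                elif i == b:
                    grad += 2.0 * mult * (x[i] - shared[a])
            # Project back onto the box [-radius, radius].
            new_x[i] = max(-radius, min(radius, x[i] - step * grad))

        # Dual ascent on edges touched by an active agent, evaluated at the
        # pre-update shared values and projected onto lam >= 0.
        for (a, b) in edges:
            if a in active or b in active:
                slack = (shared[a] - shared[b]) ** 2 - gamma
                lam[(a, b)] = max(0.0, lam[(a, b)] + step * slack)

        # Commit primal updates; active agents broadcast their new values.
        x = new_x
        for i in active:
            shared[i] = x[i]

    return x, lam


if __name__ == "__main__":
    # Four agents on a ring, each with a different (hypothetical) local target.
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    decisions, multipliers = saddle_point_run([0.0, 1.0, 2.0, 3.0], ring)
    print(decisions)
    print(multipliers)
```

In this sketch the agents are not forced to agree: a multiplier grows only while the squared gap across its edge exceeds `gamma`, so neighbors are pulled together just enough to respect the tolerance while each keeps tracking its own target.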
