Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach

Learning cooperative policies for multi-agent systems is often challenged by partial observability and a lack of coordination. In some settings, the structure of a problem allows a distributed solution with limited communication. Here, we consider a scenario where no communication is available; instead, we learn local policies for all agents that collectively mimic the solution to a centralized multi-agent static optimization problem. Our main contribution is an information-theoretic framework based on rate-distortion theory that facilitates analysis of how well the resulting fully decentralized policies can reconstruct the optimal solution. Moreover, this framework extends naturally to the question of which nodes an agent should communicate with to improve the performance of its individual policy.
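As a rough illustration of the idea of local policies that mimic a centralized solution, the following sketch fits one regressor per agent to the centralized optimum using only that agent's own observation. Everything in it is an assumption made for illustration and is not taken from the paper: the quadratic cost, the matrix `Q`, the Gaussian state model, the linear policy class, and the names `fit_local_policy` and `act`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_samples = 3, 2000

# Hypothetical centralized static problem (an assumption, not from the paper):
# minimize 0.5 * u^T Q u - x^T u over u, whose optimum is u* = Q^{-1} x.
Q = np.array([[2.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 2.0]])

# Correlated global states; agent i is assumed to observe only column i.
A = rng.normal(size=(n_agents, n_agents))
X = rng.normal(size=(n_samples, n_agents)) @ A.T

# Centralized optimal actions, one row per sampled state.
U_star = np.linalg.solve(Q, X.T).T


def fit_local_policy(x_local, u_target):
    """Least-squares fit of u_i ~ w * x_i + b from local observations only."""
    features = np.column_stack([x_local, np.ones_like(x_local)])
    coef, *_ = np.linalg.lstsq(features, u_target, rcond=None)
    return coef  # array [w, b]


# One independent policy per agent; no information is shared between agents.
policies = [fit_local_policy(X[:, i], U_star[:, i]) for i in range(n_agents)]


def act(i, x_i):
    """Action of agent i given only its own observation x_i."""
    w, b = policies[i]
    return w * x_i + b


# Per-agent mean squared error between decentralized and centralized actions.
U_hat = np.column_stack([act(i, X[:, i]) for i in range(n_agents)])
mse = np.mean((U_hat - U_star) ** 2, axis=0)
print("per-agent reconstruction MSE:", mse)
```

The per-agent error printed at the end plays the role of a distortion measure: it quantifies how much of the centralized solution each agent can recover from its local observation alone. The linear regressor is only a stand-in for whatever policy class one would actually choose.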
