Communication decisions in multi-agent cooperation: model and experiments

In multi-agent cooperation, agents share a common goal, which is evaluated through a global utility function. However, an agent typically cannot observe the global state of an uncertain environment, so agents must communicate with one another to share the information needed to decide which actions to take. We argue that when communication incurs a cost (due to resource consumption, for example), whether or not to communicate also becomes a decision to make; the communication decision thus becomes part of the overall agent decision problem. To address this problem explicitly, we present a multi-agent extension of Markov decision processes in which communication is modeled as an explicit action that incurs a cost. This framework provides a foundation for a quantitative study of agent coordination policies and offers both motivation and insight for the design of heuristic approaches. We study an example problem under this framework, which shows the impact communication policies have on the overall agent policies and suggests implications for the design of agent coordination policies.
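The core idea, communication as a costed action inside the joint decision problem, can be illustrated with a minimal sketch. This is not the paper's exact formalism; the names (`Agent`, `step`, `COMM_COST`) and the broadcast semantics are illustrative assumptions.

```python
# Sketch: a two-agent step in which "communicate" is an explicit decision
# that shares an agent's local observation with teammates and is charged
# against the global utility. All names here are illustrative assumptions.

COMM_COST = 0.1  # assumed per-message cost deducted from the global reward

class Agent:
    def __init__(self, name):
        self.name = name
        self.local_obs = None   # the agent's private observation
        self.shared_obs = {}    # observations received from teammates

    def observe(self, obs):
        self.local_obs = obs

def step(agents, domain_actions, comm_flags, global_reward_fn):
    """One joint step: each agent chooses a domain action plus a boolean
    communication decision. Communicating broadcasts the agent's local
    observation to all teammates and incurs COMM_COST."""
    reward = global_reward_fn(domain_actions)
    for agent, comm in zip(agents, comm_flags):
        if comm:
            reward -= COMM_COST
            for other in agents:
                if other is not agent:
                    other.shared_obs[agent.name] = agent.local_obs
    return reward
```

Under this view, a coordination policy must weigh the value of the information a teammate would gain against the communication cost, which is exactly the trade-off the framework is meant to quantify.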
