DCOPs meet the real world: exploring unknown reward matrices with applications to mobile sensor networks

Buoyed by recent successes in the area of distributed constraint optimization problems (DCOPs), this paper addresses challenges faced when applying DCOPs to real-world domains. Three fundamental challenges must be addressed for a class of real-world domains, requiring novel DCOP algorithms. First, agents may not know the payoff matrix and must explore the environment to determine rewards associated with variable settings. Second, agents may need to maximize total accumulated reward rather than instantaneous final reward. Third, limited time horizons disallow exhaustive exploration of the environment. We propose and implement a set of novel algorithms that combine decision-theoretic exploration approaches with DCOP-mandated coordination. In addition to simulation results, we implement these algorithms on robots, deploying DCOPs on a distributed mobile sensor network.
