Coordinated Multi-Robot Exploration Under Communication Constraints Using Decentralized Markov Decision Processes

Recent work on multi-agent sequential decision making using decentralized partially observable Markov decision processes has focused on interaction-oriented resolution techniques, with promising results. These techniques take advantage of local interactions and coordination. In this paper, we propose an approach based on an interaction-oriented resolution of decentralized decision makers. To this end, distributed value functions (DVFs) have been used to decouple the multi-agent problem into a set of individual agent problems. However, existing DVF techniques assume permanent and free communication between the agents. We extend the DVF methodology to address full local observability, limited information sharing, and communication breaks. We apply our new DVF to a real-world application, multi-robot exploration, in which each robot locally computes a strategy that minimizes interactions between the robots and maximizes the space coverage of the team, even under communication constraints. Our technique has been implemented and evaluated in simulation and in real-world scenarios during a robotic challenge for the exploration and mapping of an unknown environment. We report experimental results from real-world scenarios and from the challenge, in which our system finished as vice-champion.
