Decentralized multi-robot cooperation with auctioned POMDPs

Planning under uncertainty faces a scalability problem for multi-robot teams, as the information space grows exponentially with the number of robots. To address this issue, this paper proposes to decentralize multi-robot partially observable Markov decision processes (POMDPs) while maintaining cooperation between robots through POMDP policy auctions. Auctions provide a flexible way of coordinating individual policies modeled by POMDPs and have low communication requirements. In addition, the communication models commonly used in the multi-agent POMDP literature are a poor match for real inter-robot communication. We address this issue by exploiting a decentralized data fusion method to efficiently maintain a joint belief state among the robots. The paper presents two applications: environmental monitoring with unmanned aerial vehicles (UAVs), and cooperative tracking, in which several robots must jointly track a moving target of interest. The first serves as a proof of concept and illustrates the proposed ideas through simulations. The second adds real multi-robot experiments, showcasing the flexibility and robust coordination that our techniques can provide.
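
To make the coordination mechanism concrete, the sketch below shows one way a policy auction of this kind could be organized: each robot values a candidate task by the expected reward of its locally solved POMDP policy at its current belief, and a greedy sequential auction assigns tasks to the highest bidders. This is a minimal illustrative sketch, not the paper's implementation; the `Task` and `Robot` classes and the `policy_values` table are assumptions, and in the actual system the bids would come from a POMDP solver evaluated at each robot's belief state.

```python
# Minimal sketch of a sequential single-item auction over POMDP policy values.
# Illustrative only: the task/robot models and the value table below are
# assumptions, not the paper's actual implementation.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    name: str


@dataclass
class Robot:
    name: str
    # Hypothetical table of expected discounted rewards: the value of this
    # robot's individually solved POMDP policy for each task, evaluated at its
    # current belief. In practice this would come from a POMDP solver.
    policy_values: Dict[str, float]

    def bid(self, task: Task) -> float:
        """Bid the expected value of executing the task under the local policy."""
        return self.policy_values.get(task.name, float("-inf"))


def sequential_auction(robots: List[Robot], tasks: List[Task]) -> Dict[str, str]:
    """Greedily assign each task to the highest-bidding unassigned robot."""
    assignment: Dict[str, str] = {}
    free = list(robots)
    for task in tasks:
        if not free:
            break
        winner = max(free, key=lambda r: r.bid(task))
        assignment[task.name] = winner.name
        free.remove(winner)
    return assignment


if __name__ == "__main__":
    robots = [
        Robot("uav1", {"monitor_area_A": 8.2, "track_target": 3.1}),
        Robot("uav2", {"monitor_area_A": 5.0, "track_target": 7.6}),
    ]
    tasks = [Task("monitor_area_A"), Task("track_target")]
    print(sequential_auction(robots, tasks))
    # -> {'monitor_area_A': 'uav1', 'track_target': 'uav2'}
```

Because only scalar bids are exchanged, rather than full policies or belief states, the communication cost of such an auction stays low, which is the property the paper exploits for decentralized coordination.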
