Decentralized multi-robot cooperation with auctioned POMDPs

Planning under uncertainty faces a scalability problem for multi-robot teams, as the information space grows exponentially with the number of robots. To address this issue, this paper proposes to decentralize multi-robot partially observable Markov decision processes (POMDPs) while maintaining cooperation between robots through POMDP policy auctions. Auctions provide a flexible way of coordinating individual policies modeled by POMDPs and have low communication requirements. In addition, the communication models commonly used in the multi-agent POMDP literature are a poor match for real inter-robot communication. We address this issue by exploiting a decentralized data fusion method to efficiently maintain a joint belief state among the robots. The paper presents two applications: environmental monitoring with unmanned aerial vehicles (UAVs), and cooperative tracking, in which several robots must jointly track a moving target of interest. The first serves as a proof of concept and illustrates the proposed ideas through simulations. The second adds real multi-robot experiments, showcasing the flexibility and robust coordination that our techniques can provide.
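
To make the coordination mechanism concrete, the sketch below shows one way a policy auction of this kind could be organized: each robot values a candidate task by the expected reward of its locally solved POMDP policy at its current belief, and a greedy sequential auction assigns tasks to the highest bidders. This is a minimal illustrative sketch, not the paper's implementation; the `Task` and `Robot` classes and the `policy_values` table are assumptions, and in the actual system the bids would come from a POMDP solver evaluated at each robot's belief state.

```python
# Minimal sketch of a sequential single-item auction over POMDP policy values.
# Illustrative only: the task/robot models and the value table below are
# assumptions, not the paper's actual implementation.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    name: str


@dataclass
class Robot:
    name: str
    # Hypothetical table of expected discounted rewards: the value of this
    # robot's individually solved POMDP policy for each task, evaluated at its
    # current belief. In practice this would come from a POMDP solver.
    policy_values: Dict[str, float]

    def bid(self, task: Task) -> float:
        """Bid the expected value of executing the task under the local policy."""
        return self.policy_values.get(task.name, float("-inf"))


def sequential_auction(robots: List[Robot], tasks: List[Task]) -> Dict[str, str]:
    """Greedily assign each task to the highest-bidding unassigned robot."""
    assignment: Dict[str, str] = {}
    free = list(robots)
    for task in tasks:
        if not free:
            break
        winner = max(free, key=lambda r: r.bid(task))
        assignment[task.name] = winner.name
        free.remove(winner)
    return assignment


if __name__ == "__main__":
    robots = [
        Robot("uav1", {"monitor_area_A": 8.2, "track_target": 3.1}),
        Robot("uav2", {"monitor_area_A": 5.0, "track_target": 7.6}),
    ]
    tasks = [Task("monitor_area_A"), Task("track_target")]
    print(sequential_auction(robots, tasks))
    # -> {'monitor_area_A': 'uav1', 'track_target': 'uav2'}
```

Because only scalar bids are exchanged, rather than full policies or belief states, the communication cost of such an auction stays low, which is the property the paper exploits for decentralized coordination.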
