Efficient Multi-robot Exploration via Multi-head Attention-based Cooperation Strategy

The goal of coordinated multi-robot exploration is to deploy a team of autonomous robots to explore an unknown environment as quickly as possible. Compared with human-designed methods, which began with heuristic and rule-based approaches, learning-based methods enable individual robots to acquire sophisticated, hard-to-design cooperation strategies through deep reinforcement learning. However, in decentralized multi-robot exploration, learning-based algorithms are still far from universally applicable to continuous space because of the difficulties of area calculation and reward function design; moreover, existing learning-based methods struggle to balance the historical trajectory issue against the target area conflict problem. Furthermore, these methods scale poorly to large numbers of agents because of the exponential explosion of the state space. Accordingly, this paper proposes a novel approach, Multi-head Attention-based Multi-robot Exploration in Continuous Space (MAMECS), aimed at reducing the state space and automatically learning the cooperation strategies required for decentralized multi-robot exploration in continuous space. Computational geometry is applied to describe the environment in continuous space and to design an improved reward function that ensures a superior exploration rate. Moreover, the multi-head attention mechanism employed helps to solve the historical trajectory issue in decentralized multi-robot exploration and to curb the quadratic growth of the action space.
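The core aggregation step behind a multi-head attention cooperation strategy can be sketched as follows. This is a minimal illustration, not the paper's actual MAMECS architecture: the head count, feature layout, and the choice to attend from one robot's state over its teammates' features are assumptions made for the example.

```python
import numpy as np

def multi_head_attention(query, keys, values, num_heads=2):
    """Minimal scaled dot-product multi-head attention.

    query:  (d,)    feature vector of the deciding robot
    keys:   (n, d)  feature vectors of the n teammates
    values: (n, d)  typically identical to keys in this setting
    Returns a (d,) vector aggregating teammate information.
    """
    d = query.shape[0]
    assert d % num_heads == 0, "feature size must divide evenly across heads"
    dh = d // num_heads

    # Split features into heads: q -> (h, dh), k/v -> (h, n, dh)
    q = query.reshape(num_heads, dh)
    k = keys.reshape(-1, num_heads, dh).transpose(1, 0, 2)
    v = values.reshape(-1, num_heads, dh).transpose(1, 0, 2)

    # Scaled dot-product scores per head: (h, n)
    scores = np.einsum('hd,hnd->hn', q, k) / np.sqrt(dh)

    # Softmax over teammates, numerically stabilized
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted sum of teammate values, then concatenate heads back to (d,)
    out = np.einsum('hn,hnd->hd', weights, v)
    return out.reshape(d)

# Example: one robot attending over 3 teammates with 8-dim features
rng = np.random.default_rng(0)
q = rng.normal(size=8)
kv = rng.normal(size=(3, 8))
ctx = multi_head_attention(q, kv, kv, num_heads=2)
print(ctx.shape)  # (8,)
```

Because the softmax weights sum to one over teammates regardless of how many there are, the aggregated vector keeps a fixed dimension as the team grows, which is how attention-based aggregation avoids the state-space blow-up of naive concatenation.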
