Inference-based Hierarchical Reinforcement Learning for Cooperative Multi-agent Navigation

This work addresses the multi-agent cooperative navigation problem (MCNP), in which multiple agents cooperate to occupy landmarks in an environment without collisions and in minimum time. To this end, we propose an inference-based hierarchical reinforcement learning (IHRL) model, in which the high-level component infers the allocation of landmarks to agents using a local message-passing algorithm, while the low-level component trains the sub-policy corresponding to the assigned target using standard RL algorithms. The key feature of our model is the interplay between high-level inference, which draws on knowledge acquired through learning, and low-level learning, which is guided by the results of inference. This interplay improves overall learning efficiency by integrating more informative guidance into the agents’ coordinated learning process. Extensive experiments demonstrate the effectiveness of the proposed model.
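
To make the two-level structure concrete, the sketch below illustrates one plausible instantiation under stated assumptions: an auction-style message-passing routine for the high-level agent-landmark assignment and tabular Q-learning for the low-level sub-policies. The utility matrix, the function `assign_targets`, and the `SubPolicy` class are illustrative stand-ins, not the paper's actual algorithm or interfaces.

```python
# Illustrative sketch only: high-level assignment via an auction-style
# message-passing routine over a hypothetical agent-landmark utility matrix,
# plus a low-level per-target sub-policy trained with tabular Q-learning.
import numpy as np

def assign_targets(utilities, n_iters=100, eps=1e-3):
    """Assign each agent a distinct landmark (Bertsekas-style auction).
    utilities[i, j] is agent i's estimated value for landmark j."""
    n_agents, n_landmarks = utilities.shape
    prices = np.zeros(n_landmarks)
    assignment = -np.ones(n_agents, dtype=int)
    for _ in range(n_iters):
        unassigned = np.where(assignment == -1)[0]
        if len(unassigned) == 0:
            break
        for i in unassigned:
            values = utilities[i] - prices
            order = np.argsort(values)[::-1]
            best = order[0]
            second_val = values[order[1]] if n_landmarks > 1 else values[best]
            # Raise the price of the preferred landmark by the bid increment
            # and evict any agent currently holding it.
            prices[best] += values[best] - second_val + eps
            assignment[assignment == best] = -1
            assignment[i] = best
    return assignment

class SubPolicy:
    """Low-level policy for reaching one assigned landmark (tabular Q-learning)."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95):
        self.q = np.zeros((n_states, n_actions))
        self.alpha, self.gamma = alpha, gamma

    def act(self, state, explore_eps=0.1):
        if np.random.rand() < explore_eps:
            return np.random.randint(self.q.shape[1])
        return int(np.argmax(self.q[state]))

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * np.max(self.q[next_state])
        self.q[state, action] += self.alpha * (target - self.q[state, action])

# Usage: 3 agents, 3 landmarks, random stand-in utilities.
utilities = np.random.rand(3, 3)
print("high-level assignment:", assign_targets(utilities))
```

In this sketch the utilities would, in the spirit of the abstract, be estimated from the low-level learning component (e.g., value estimates for reaching each landmark), so that inference uses knowledge from learning and learning is in turn conditioned on the inferred assignment.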