Multi-agent algorithm for automatically constructing Options

In current hierarchical reinforcement learning (HRL), task hierarchies are constructed automatically by slow, serial learning algorithms based on a single agent. To speed up learning, a multi-agent algorithm for automatically constructing Options is presented. The algorithm builds on the Option HRL framework proposed by Sutton. First, multiple agents cooperatively explore the state space in parallel. The state space is then partitioned into several sub-spaces by immune clustering based on aiNet. Next, the agents concurrently learn local policies for the different sub-spaces, and the Options are constructed from these policies. Theoretical analysis and experiments on shortest-path planning in a two-dimensional grid space with obstacles show that the multi-agent algorithm constructs Options clearly faster than single-agent algorithms.
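
The pipeline summarized above (partition the state space, learn one local policy per sub-space, wrap each policy as an Option) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several loudly stated assumptions: the grid, the sub-goals, and all function names are hypothetical; a fixed column split stands in for aiNet immune clustering; a backward breadth-first search stands in for the agents' local reinforcement learning; and a thread pool only mimics the fact that the sub-spaces can be handled independently of one another.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical 5x7 grid: '#' is an obstacle, '.' is free space.
GRID = [
    ".......",
    "...#...",
    "...#...",
    ".......",
    "...#...",
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


@dataclass
class Option:
    """An Option as the triple (initiation set, policy, termination set)."""
    initiation: frozenset   # states where the Option may be invoked
    policy: dict            # state -> primitive action
    termination: frozenset  # states where the Option ends


def free_states(grid):
    """Return the set of (row, col) cells that are not obstacles."""
    return {(r, c) for r, row in enumerate(grid)
            for c, ch in enumerate(row) if ch == "."}


def partition(states, split_col):
    """ASSUMPTION: a fixed column split stands in for aiNet clustering."""
    left = frozenset(s for s in states if s[1] <= split_col)
    right = frozenset(s for s in states if s[1] > split_col)
    return [left, right]


def local_policy(region, subgoal):
    """ASSUMPTION: backward BFS from the sub-goal stands in for local learning."""
    allowed = set(region) | {subgoal}
    policy, frontier, seen = {}, deque([subgoal]), {subgoal}
    while frontier:
        r, c = frontier.popleft()
        for name, (dr, dc) in MOVES.items():
            prev = (r - dr, c - dc)  # taking `name` in prev leads to (r, c)
            if prev in allowed and prev not in seen:
                seen.add(prev)
                policy[prev] = name
                frontier.append(prev)
    return policy


def build_options(grid, split_col, subgoals):
    """Partition the grid, solve each sub-space independently, wrap as Options."""
    regions = partition(free_states(grid), split_col)
    # Each sub-space is independent, so one worker per region is possible.
    with ThreadPoolExecutor() as pool:
        policies = list(pool.map(local_policy, regions, subgoals))
    return [Option(frozenset(p), p, frozenset({g}))
            for p, g in zip(policies, subgoals)]


def run_option(option, state, max_steps=50):
    """Follow the Option's policy until it terminates; return (state, steps)."""
    steps = 0
    while state not in option.termination and steps < max_steps:
        dr, dc = MOVES[option.policy[state]]
        state = (state[0] + dr, state[1] + dc)
        steps += 1
    return state, steps


# Usage: the first Option leads through a doorway, the second to the goal.
options = build_options(GRID, split_col=3, subgoals=[(3, 4), (4, 6)])
doorway, steps_a = run_option(options[0], (0, 0))
goal, steps_b = run_option(options[1], doorway)
```

In this toy setting, chaining the two Options carries an agent from the top-left corner through the doorway at (3, 4) to the goal at (4, 6). The per-region calls share no state, which is the property that lets multiple agents learn the local policies concurrently.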