Dynamic Hierarchies in Hierarchical Reinforcement Learning

Existing automatic hierarchy approaches in hierarchical reinforcement learning construct the hierarchical structure once, after some degree of state-space exploration. Incomplete exploration cannot guarantee the quality of the solution, while over-exploration slows learning. To address the problem that learning performance depends strongly on state-space exploration, this paper presents a dynamic hierarchy approach. The approach integrates immune clustering and the secondary response into the Options framework, a hierarchical reinforcement learning framework proposed by Sutton. In the dynamic hierarchy approach, the state spaces of Options can be modified dynamically, and the local policies of Options can be learned dynamically along the learning trace. Experiments on shortest-path planning in a two-dimensional grid space with obstacles show that the dynamic hierarchy approach depends little on state-space exploration, making it better suited to large-scale reinforcement learning problems.
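To make the idea concrete, here is a minimal sketch (an illustration, not the paper's implementation) of an Option in the sense of Sutton's Options framework, whose state space can be extended dynamically as new states are encountered along the learning trace; the class and method names are assumptions for this example.

```python
# Illustrative sketch: an Option with an initiation set, a tabular
# intra-option policy, and a termination set. The initiation set and
# policy grow dynamically as new states are explored.
from dataclasses import dataclass, field

@dataclass
class Option:
    # Initiation set: states from which the option may be invoked.
    initiation_set: set = field(default_factory=set)
    # Tabular local (intra-option) policy: state -> action.
    policy: dict = field(default_factory=dict)
    # States in which the option terminates.
    termination_set: set = field(default_factory=set)

    def can_initiate(self, state):
        return state in self.initiation_set

    def extend(self, state, action):
        """Dynamically add a newly explored state to the option's
        state space and record a local policy entry for it."""
        self.initiation_set.add(state)
        self.policy[state] = action

# Usage: an option that moves right along a short corridor in a grid.
opt = Option(termination_set={(0, 3)})
for x in range(3):
    opt.extend((0, x), "right")

print(opt.can_initiate((0, 0)))  # True
print(opt.can_initiate((0, 3)))  # False: terminal state, not in initiation set
```

In a static hierarchy these sets would be fixed after an initial exploration phase; the dynamic approach instead keeps calling something like `extend` during learning, so the option's state space tracks the agent's actual learning trace.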