Dynamic hierarchical reinforcement learning based on a probability model

To deal with the overwhelming dimensionality of large-scale reinforcement learning and the strong dependence of existing learning algorithms on prior knowledge, we propose a dynamic hierarchical reinforcement learning method based on a probability model (DHRL-model). The method automatically identifies key states from the parameters of a state-transition probability model built by Bayesian learning, then dynamically generates state subspaces by clustering, and learns the optimal policy over the resulting hierarchical structure. Simulation results show that the DHRL-model algorithm remarkably improves the agent's learning efficiency in complex environments and can be applied to learning in unknown large-scale worlds.
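As an illustration of the first ingredient, a state-transition probability model learned in a Bayesian way, the following is a minimal sketch and not the authors' implementation. It assumes a finite state and action space and an independent symmetric Dirichlet prior over next states for each state-action pair; the class name `DirichletTransitionModel`, its methods, and the `prior` parameter are hypothetical names introduced here. The abstract does not specify the criterion for key states or the clustering procedure, so neither is sketched.

```python
import numpy as np


class DirichletTransitionModel:
    """Posterior over P(s' | s, a) for a finite MDP.

    Assumption (not from the paper): each (s, a) pair has an independent
    symmetric Dirichlet prior with concentration `prior` on every next state.
    """

    def __init__(self, n_states, n_actions, prior=1.0):
        # counts[s, a, s'] = number of observed transitions s --a--> s'
        self.counts = np.zeros((n_states, n_actions, n_states))
        self.prior = prior

    def update(self, s, a, s_next):
        """Record one observed transition (Bayesian update of the counts)."""
        self.counts[s, a, s_next] += 1.0

    def posterior_mean(self):
        """Return the posterior-mean estimate of P(s' | s, a).

        For a Dirichlet prior with multinomial observations, this is
        (count + prior) normalized over next states.
        """
        alpha = self.counts + self.prior
        return alpha / alpha.sum(axis=2, keepdims=True)


if __name__ == "__main__":
    model = DirichletTransitionModel(n_states=3, n_actions=2)
    # Hypothetical experience: from state 0 under action 0,
    # the agent reached state 1 twice and state 2 once.
    model.update(0, 0, 1)
    model.update(0, 0, 1)
    model.update(0, 0, 2)
    print(model.posterior_mean()[0, 0])
```

Under these assumptions the estimate for an unvisited state-action pair stays uniform, while visited pairs move toward their observed frequencies as data accumulates. Probability parameters of this kind are what a key-state criterion would operate on.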