Learning policies for abstract state spaces

Applying Q-learning to multidimensional, real-valued state spaces is time-consuming in most cases. In this article, we work under the assumption that a coarse partition of the state space is sufficient for learning good or even optimal policies. An algorithm is presented that constructs proper policies for abstract state spaces using an incremental procedure, without approximating a Q-function. By combining an approach similar to dynamic programming with a search over policies, we speed up the learning process. To provide empirical evidence, we use a cart-pole system; experiments were conducted both in a simulated environment and on a real plant.
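
To make the notion of a coarse partition concrete, the sketch below shows one way a continuous cart-pole state could be mapped to an abstract (discrete) state. The bin boundaries, variable names, and the use of a simple grid discretization are illustrative assumptions for this example, not the partition or algorithm used in the article.

```python
import numpy as np

# Hypothetical bin boundaries for the four cart-pole state variables
# (cart position, cart velocity, pole angle, pole angular velocity).
# These values are assumptions chosen for illustration only.
POSITION_BINS = np.array([-1.0, 0.0, 1.0])               # metres
VELOCITY_BINS = np.array([-0.5, 0.5])                     # m/s
ANGLE_BINS    = np.radians([-12, -6, -1, 1, 6, 12])       # radians
ANG_VEL_BINS  = np.radians([-50, 50])                     # rad/s


def abstract_state(x, x_dot, theta, theta_dot):
    """Map a continuous cart-pole state to a discrete (abstract) state.

    Each continuous variable is assigned to an interval of its partition
    via np.digitize; the tuple of interval indices identifies the cell of
    the coarse partition the state falls into.
    """
    return (
        int(np.digitize(x, POSITION_BINS)),
        int(np.digitize(x_dot, VELOCITY_BINS)),
        int(np.digitize(theta, ANGLE_BINS)),
        int(np.digitize(theta_dot, ANG_VEL_BINS)),
    )


# Example: a slightly displaced pole maps to one cell of the coarse partition.
print(abstract_state(x=0.1, x_dot=0.0, theta=0.02, theta_dot=0.0))
```

A policy over such abstract states assigns one action per cell, so the number of candidate policies is finite and can be searched directly rather than estimated through a Q-function over the continuous state space.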