Learning Macro-Actions in Reinforcement Learning

We present a method for automatically constructing macro-actions from primitive actions during the reinforcement learning process. The core idea is to reinforce the tendency to perform action b after action a whenever that pattern of actions has been rewarded. We test the method on a bicycle task, the car-on-the-hill task, the race-track task, and several grid-world tasks. For the bicycle and race-track tasks the use of macro-actions roughly halves the learning time, while for one of the grid-world tasks the learning time is reduced by a factor of 5. The method did not work for the car-on-the-hill task, for reasons we discuss in the conclusion.
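The abstract only sketches the idea of reinforcing rewarded action pairs. Below is a minimal, illustrative Python sketch of one way such pair-based macro construction could look in a tabular setting: alongside the usual Q-table, a pair-tendency table is strengthened when action b followed action a on a rewarded step, and pairs whose tendency crosses a threshold are promoted to two-step macro-actions. The table names (Q, M), the promotion threshold, and the exact update rule are assumptions made for illustration, not the paper's formulation.

```python
import random
from collections import defaultdict

# Illustrative sketch (assumptions): a tabular Q-learner that also keeps a
# pair-tendency table M[(a, b)] -- the inclination to follow primitive
# action a with primitive action b.  M is reinforced when the pair
# preceded a reward; pairs whose tendency crosses MACRO_THRESHOLD are
# promoted to two-step macro-actions, represented here as tuples (a, b).

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
MACRO_THRESHOLD = 0.8

Q = defaultdict(float)   # Q[(state, action)] for primitives and macros
M = defaultdict(float)   # M[(a, b)]: tendency to perform b right after a
macros = set()           # discovered (a, b) macro-actions


def select_action(state, primitive_actions):
    """Epsilon-greedy choice over primitive actions plus discovered macros."""
    candidates = list(primitive_actions) + list(macros)
    if random.random() < EPSILON:
        return random.choice(candidates)
    return max(candidates, key=lambda a: Q[(state, a)])


def update(state, action, reward, next_state, primitive_actions, prev_action=None):
    """One-step Q-update plus reinforcement of the (prev_action, action) pair."""
    candidates = list(primitive_actions) + list(macros)
    best_next = max(Q[(next_state, a)] for a in candidates)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

    # Reinforce the tendency to perform `action` after `prev_action` when the
    # pattern was rewarded; decay it slightly otherwise.
    if prev_action is not None and not isinstance(action, tuple):
        pair = (prev_action, action)
        if reward > 0:
            M[pair] += ALPHA * (1.0 - M[pair])
        else:
            M[pair] *= (1.0 - ALPHA)
        if M[pair] > MACRO_THRESHOLD:
            macros.add(pair)  # promote the rewarded pair to a macro-action
```

In this sketch a pair is promoted permanently once its tendency crosses the threshold; the actual method's criteria for forming, executing, and discarding macro-actions may differ.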
