Paper Information - Multiple-Target Reinforcement Learning with a Single Policy

Multiple-Target Reinforcement Learning with a Single Policy

We present a reinforcement learning approach to learning a single, non-hierarchical policy for multiple targets. In the context of a policy search method, we propose to define a parametrized policy as a function of both the state and the target. This allows for learning a single policy that can navigate the RL agent to different targets. Generalization to unseen targets is implicitly possible, without having to combine local policies in a hierarchical RL setup. We present promising first experimental results that show the viability of our approach.
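As a rough illustration of a policy that takes both the state and the target as inputs, the following minimal sketch concatenates the two and maps them to a bounded control signal. The linear-plus-tanh form, the names `make_policy` and `policy`, and the parameters `W`, `b`, and `u_max` are all assumptions made for this example; the abstract does not specify the parametrization used in the paper.

```python
import numpy as np


def make_policy(state_dim, target_dim, action_dim, seed=0):
    """Build parameters and a policy function for a hypothetical
    target-conditioned policy (illustrative only, not the paper's model)."""
    rng = np.random.default_rng(seed)
    params = {
        # One weight matrix acts on the concatenated [state, target] vector.
        "W": 0.1 * rng.standard_normal((action_dim, state_dim + target_dim)),
        "b": np.zeros(action_dim),
    }

    def policy(params, state, target, u_max=1.0):
        # The target is an input to the policy, not a separate policy index.
        z = np.concatenate([state, target])
        # tanh squashing keeps each control inside [-u_max, u_max].
        return u_max * np.tanh(params["W"] @ z + params["b"])

    return params, policy


# The same parameters yield different controls for different targets.
params, policy = make_policy(state_dim=2, target_dim=1, action_dim=1)
state = np.array([0.0, 0.0])
u_a = policy(params, state, np.array([-1.0]))
u_b = policy(params, state, np.array([1.0]))
```

Because the target enters as an ordinary input, a policy search method would optimize one shared parameter set across all training targets, and the resulting function can be evaluated at targets that were never seen during training.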

D. Fox | M. P. Deisenroth

[1] Marc Peter Deisenroth, et al. Efficient reinforcement learning using Gaussian processes, 2010.

[2] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.

[3] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.

[4] Jeff G. Schneider, et al. Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning, 1996, NIPS.

[5] K. Chaloner, et al. Bayesian Experimental Design: A Review, 1995.

[6] Agathe Girard, et al. Propagation of uncertainty in Bayesian kernel models - application to multiple-step ahead forecasting, 2003, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[7] Stefan Schaal, et al. Learning from Demonstration, 1996, NIPS.