Multiple-Target Reinforcement Learning with a Single Policy

We present a reinforcement learning approach for learning a single, non-hierarchical policy that serves multiple targets. Within a policy search method, we propose to define the parametrized policy as a function of both the state and the target, so that one policy can navigate the RL agent to different targets. Because the target is an input to the policy, generalization to unseen targets is implicitly possible, and there is no need to combine local policies in a hierarchical RL setup. We present initial experimental results that support the viability of our approach.
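
To make the idea of a policy defined over both state and target concrete, the following is a minimal sketch and not the parametrization used in this work. It assumes a linear-Gaussian policy over a concatenated feature vector; the names `features`, `policy_mean`, `sample_action`, `theta`, and `sigma`, as well as the 2-D point-agent setting, are hypothetical and chosen only for illustration.

```python
import numpy as np

def features(state, target):
    """Joint feature vector over state and target (a hypothetical choice)."""
    state = np.asarray(state, dtype=float)
    target = np.asarray(target, dtype=float)
    # Concatenate state, target, and a constant bias term.
    return np.concatenate([state, target, [1.0]])

def policy_mean(theta, state, target):
    """Mean action: a linear map of the joint state-target features."""
    return theta @ features(state, target)

def sample_action(theta, state, target, sigma, rng):
    """Draw an action from a Gaussian centred on the policy mean."""
    mean = policy_mean(theta, state, target)
    return mean + sigma * rng.standard_normal(mean.shape)

# Hand-set parameters (assumed, not learned) for a 2-D point agent:
# the mean action equals target - state, i.e. a step towards the target.
theta = np.hstack([-np.eye(2), np.eye(2), np.zeros((2, 1))])

state = np.array([0.0, 0.0])
rng = np.random.default_rng(0)
for target in (np.array([1.0, 2.0]), np.array([-3.0, 0.5])):
    # The same parameters theta are used for every target.
    print(target, policy_mean(theta, state, target),
          sample_action(theta, state, target, 0.1, rng))
```

In this sketch a single parameter matrix `theta` is shared across all targets, and only the target input changes. A policy search method would adjust `theta` from rollouts collected for several training targets; a target not seen during training is handled by passing it as input, with no separate local policy per target.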