Paper Information - Direct Policy Transfer via Hidden Parameter Markov Decision Processes

Direct Policy Transfer via Hidden Parameter Markov Decision Processes

Many situations arise in which an agent must learn to solve tasks with similar, but not identical, dynamics. For example, a robot might be tasked to manipulate objects with slightly different masses and volumes; a clinician may treat patients with unique—but still all human adult—physiologies. In such situations, if one has already seen several instances of the related tasks, it is inefficient to start learning a new task from scratch. Indeed, in some domains, such as medicine, one does not have multiple episodes to learn a personalized treatment policy: decisions must be optimized from only a few interactions with the patient.

Taylor W. Killian | G. Konidaris | Finale Doshi-Velez | Jiayu Yao

[1] Louis Wehenkel, et al. Clinical data based optimal STI strategies for HIV: a reinforcement learning approach, 2006, Proceedings of the 45th IEEE Conference on Decision and Control.

[2] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.

[3] David Hsu, et al. Planning how to learn, 2013, 2013 IEEE International Conference on Robotics and Automation.

[4] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.

[5] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.

[6] Finale Doshi-Velez, et al. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations, 2013, IJCAI.

[7] Richard E. Turner, et al. Black-box α-divergence minimization, 2016, ICML.

[8] Finale Doshi-Velez, et al. Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks, 2016, ICLR.

[9] Mykel J. Kochenderfer, et al. Simultaneous policy learning and latent state inference for imitating driver behavior, 2017, 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC).

[10] Finale Doshi-Velez, et al. Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes, 2017, AAAI.

[11] Sergey Levine, et al. Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning, 2017, ICLR.

[12] Regina Barzilay, et al. Deep Transfer in Reinforcement Learning by Language Grounding, 2017, ArXiv.

[13] Karol Hausman, et al. Learning an Embedding Space for Transferable Robot Skills, 2018, ICLR.

[14] Joelle Pineau, et al. Decoupling Dynamics and Reward for Transfer Learning, 2018, ICLR.