Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study

There are many successful methods for transferring knowledge from one agent to another. One approach, taken in this work, is to have a source agent demonstrate a policy to a target agent, which then improves upon that policy. Because the target agent learns only by observing the source agent's demonstrations, rather than relying on direct knowledge transfer such as Q-values, rules, or shared representations, the agents need not know anything about each other's internal representations or share a common language. In this work, we introduce a refinement to HAT (Human-Agent Transfer), an existing transfer learning method, that incorporates the target agent's confidence in its model of the source agent's policy. Results show that a target agent can 1) improve its initial performance relative to learning without transfer (jumpstart) and 2) accumulate more reward than the source agent (total reward). Furthermore, this refinement improves both jumpstart and total reward relative to learning without transfer and relative to learning with HAT.
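
As a concrete illustration of the kind of confidence-gated policy reuse described above, the sketch below shows a target agent that follows a classifier summarizing the source agent's demonstrations only when that classifier's confidence exceeds a threshold, and otherwise falls back to its own epsilon-greedy Q-learning. This is a minimal sketch under assumed details (the toy environment, the decision-tree summary, the confidence threshold, and the Q-learning parameters are all illustrative choices, not the implementation evaluated in this paper).

```python
# Hedged sketch: confidence-gated reuse of a demonstrated policy.
# A classifier summarizes (state, action) demonstrations from the source
# agent; the target agent follows its suggestion only when the predicted
# class probability is high enough. All constants and dynamics below are
# illustrative assumptions, not the paper's experimental setup.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 20, 4
CONF_THRESHOLD = 0.7               # follow the source policy only above this
EPSILON, ALPHA, GAMMA = 0.1, 0.5, 0.95

# --- Source-agent demonstrations: (state feature, action) pairs ------------
demo_states = rng.integers(0, N_STATES, size=200).reshape(-1, 1)
demo_actions = demo_states.ravel() % N_ACTIONS          # stand-in source policy
summary = DecisionTreeClassifier(max_depth=3).fit(demo_states, demo_actions)

# --- Target agent: tabular Q-learning with confidence-gated suggestions ----
Q = np.zeros((N_STATES, N_ACTIONS))

def choose_action(state: int) -> int:
    """Use the summarized source policy when it is confident enough,
    otherwise act epsilon-greedily with respect to the agent's own Q-values."""
    probs = summary.predict_proba([[state]])[0]
    if probs.max() >= CONF_THRESHOLD:
        return int(summary.classes_[probs.argmax()])
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(Q[state].argmax())

def step(state: int, action: int):
    """Toy dynamics: reward 1 when the action matches state % N_ACTIONS."""
    reward = 1.0 if action == state % N_ACTIONS else 0.0
    next_state = int(rng.integers(N_STATES))
    return reward, next_state

state = int(rng.integers(N_STATES))
for _ in range(1000):
    action = choose_action(state)
    reward, next_state = step(state, action)
    Q[state, action] += ALPHA * (reward + GAMMA * Q[next_state].max()
                                 - Q[state, action])
    state = next_state

print("Mean greedy value after learning:", Q.max(axis=1).mean())
```

In this sketch, the confidence gate lets the learner exploit the demonstration summary where it generalizes well and revert to its own learning elsewhere, which reflects the intuition behind the jumpstart and total-reward gains discussed above.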