Transfer Learning Through Policy Abstraction Using Learning Vector Quantization

Reinforcement learning (RL) enables an agent to solve a problem by interacting with its environment, but learning always starts from scratch and can take a long time. This paper considers knowledge transfer between tasks and argues that abstraction can improve transfer learning. A modified learning vector quantization (LVQ) network, whose weights can be manipulated directly, is proposed to perform abstraction, adaptation, and precaution. First, an abstract policy is extracted from a policy learned with a conventional RL method, Q-learning. The abstract policy is then used as prior information in a new task, where adaptation (policy learning) and the generation of the new task's abstract policy are performed in a single operation. Simulation results show that the representation of the acquired abstract policy is interpretable, that the modified LVQ successfully learns a policy while simultaneously generating its abstract policy, and that applying a generalized common abstract policy yields better results by guiding the agent more effectively when it learns a new task.
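As a rough illustration of the two-stage pipeline described above, the sketch below (hypothetical, not the authors' implementation) first learns a concrete policy with tabular Q-learning on a toy corridor task, then distills the greedy policy into labelled LVQ1 prototypes that act as an abstract policy for advising the agent in a new task. All function and parameter names here are illustrative assumptions.

```python
import numpy as np

def corridor_step(s, a, n=10):
    """Toy 1-D corridor: actions 0 = left, 1 = right; reward 1 at the right end."""
    s2 = min(max(s + (1 if a == 1 else -1), 0), n - 1)
    return s2, float(s2 == n - 1), s2 == n - 1

def q_learning(env_step, n_states, n_actions, episodes=300, max_steps=200,
               alpha=0.1, gamma=0.95, eps=0.2, seed=0):
    """Conventional tabular Q-learning; env_step(s, a) -> (s', r, done)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = int(rng.integers(n_actions)) if rng.random() < eps \
                else int(np.argmax(Q[s]))
            s2, r, done = env_step(s, a)
            Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
            s = s2
            if done:
                break
    return Q

def lvq1_abstract(features, actions, n_prototypes, epochs=50, lr=0.05, seed=1):
    """Distill (state-feature, greedy-action) pairs into labelled prototypes.

    Standard LVQ1 rule: the winning prototype is pulled toward the sample
    when its action label matches, pushed away otherwise. The resulting
    prototypes plus their labels form the abstract policy."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(features), n_prototypes, replace=False)
    W = features[idx].astype(float)      # prototype positions (network weights)
    labels = actions[idx].copy()         # prototype action labels
    for _ in range(epochs):
        for x, a in zip(features, actions):
            w = int(np.argmin(np.linalg.norm(W - x, axis=1)))   # winner
            sign = 1.0 if labels[w] == a else -1.0              # pull or push
            W[w] += sign * lr * (x - W[w])
    return W, labels

def abstract_action(W, labels, x):
    """In a new task, the nearest prototype's action serves as prior advice."""
    return labels[int(np.argmin(np.linalg.norm(W - x, axis=1)))]

# Usage: learn a concrete policy, then extract its abstract counterpart.
Q = q_learning(corridor_step, n_states=10, n_actions=2)
greedy = np.argmax(Q, axis=1)                       # learned concrete policy
feats = np.arange(10, dtype=float)[:, None]         # 1-D state features
W, labels = lvq1_abstract(feats, greedy, n_prototypes=3)
print(abstract_action(W, labels, np.array([7.0])))  # advice for state 7
```

Because the prototypes live in feature space rather than enumerating every state, a few labelled prototypes can summarize the whole greedy policy, which is what makes the abstract policy compact, interpretable, and transferable to a structurally similar new task.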