Reusing Source Task Knowledge via Transfer Approximator in Reinforcement Transfer Learning

Transfer Learning (TL) has received a great deal of attention because of its ability to speed up Reinforcement Learning (RL) by reusing knowledge learned on other tasks. This paper proposes a new transfer learning framework, referred to as Transfer Learning via Artificial Neural Network Approximator (TL-ANNA). It builds an Artificial Neural Network (ANN) transfer approximator to transfer related knowledge from the source task to the target task, and reuses the transferred knowledge through a Probabilistic Policy Reuse (PPR) scheme. Specifically, the transfer approximator maps the state of the target task symmetrically to states of the source task according to a mapping rule, and activates the related knowledge of the source task (components of its action-value function) as the input to the ANN; the ANN then predicts the quality of the actions in the target task. The target learner uses the PPR scheme to bias the RL process toward the action suggested by the transfer approximator. In this way, the transfer approximator builds a symmetric knowledge path between the target task and the source task. In addition, two mapping rules for the transfer approximator are designed, namely the Full Mapping Rule and the Group Mapping Rule. Experiments on the RoboCup soccer Keepaway task verify that the proposed transfer learning methods outperform two other transfer learning methods in both the jumpstart and time-to-threshold metrics, and are more robust to the quality of the source knowledge. Moreover, TL-ANNA with the group mapping rule performs slightly worse than with the full mapping rule, but incurs lower computation and memory costs when an appropriate grouping method is used.
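The PPR scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the reuse probability `psi`, and the `epsilon` parameter are assumptions based on the standard Probabilistic Policy Reuse formulation, in which the learner follows the transferred suggestion with probability `psi` and otherwise acts (epsilon-)greedily on its own action-value estimates.

```python
import random

def ppr_action(q_values, suggested_action, psi, epsilon=0.1):
    """Probabilistic Policy Reuse (sketch): with probability psi follow
    the action suggested by the transfer approximator; otherwise act
    epsilon-greedily on the target task's own action values."""
    actions = list(q_values.keys())
    if random.random() < psi:
        return suggested_action              # reuse transferred knowledge
    if random.random() < epsilon:
        return random.choice(actions)        # explore the target task
    return max(actions, key=q_values.get)    # exploit target estimates

# In PPR, psi typically decays over time so the learner relies less on
# the source task as its own estimates improve, e.g. psi_t = psi_0 * d**t.
psi_0, d = 0.9, 0.95
psi_10 = psi_0 * d ** 10
```

With `psi = 1.0` the learner always follows the source-derived suggestion (pure reuse); with `psi = 0.0` and `epsilon = 0.0` it acts greedily on its own estimates, recovering standard RL.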
