Transfer Learning in Reinforcement Learning Problems Through Partial Policy Recycling

We investigate the relation between transfer learning in reinforcement learning with function approximation and supervised learning under concept drift. We present a new incremental relational regression tree algorithm that handles concept drift through tree restructuring, and show that it enables a Q-learner to transfer knowledge from one task to another by recycling those parts of the generalized Q-function that still hold useful information for the new task. We illustrate the performance of the algorithm in several experiments.
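The idea of recycling only the still-valid parts of a learned Q-function can be illustrated with a deliberately simplified sketch. It is not the relational regression tree algorithm of the paper: it assumes a fixed set of regions ("leaves") with constant predictions instead of a tree that restructures itself, and the names `Leaf`, `replay`, `drift_threshold`, and the window size of 5 are hypothetical choices made only for this illustration. Each leaf tracks its recent prediction errors; a leaf whose recent targets all disagree with its stored value is treated as having drifted and is relearned, while the other leaves are kept as they are.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Iterable, Set, Tuple


@dataclass
class Leaf:
    """One region of a generalized Q-function with a constant prediction."""
    value: float = 0.0
    count: int = 0
    # Sliding window of recent absolute prediction errors (size is an assumption).
    recent_errors: Deque[float] = field(default_factory=lambda: deque(maxlen=5))

    def update(self, target: float, drift_threshold: float) -> bool:
        """Fold in one Q-value target; return True if the leaf was reset."""
        self.recent_errors.append(abs(target - self.value))
        window_full = len(self.recent_errors) == self.recent_errors.maxlen
        if window_full and min(self.recent_errors) > drift_threshold:
            # Every recent target disagrees with the stored value:
            # treat this as concept drift and relearn the leaf from scratch.
            self.value, self.count = target, 1
            self.recent_errors.clear()
            return True
        # Otherwise keep (recycle) the leaf and refine its running mean.
        self.count += 1
        self.value += (target - self.value) / self.count
        return False


def replay(leaves: Dict[str, Leaf],
           samples: Iterable[Tuple[str, float]],
           drift_threshold: float = 0.5) -> Set[str]:
    """Feed (region, Q-target) samples to the leaves; return the regions reset."""
    reset: Set[str] = set()
    for region, target in samples:
        if leaves[region].update(target, drift_threshold):
            reset.add(region)
    return reset


# Source task: both regions have a Q-value of 1.0.
leaves = {"A": Leaf(), "B": Leaf()}
replay(leaves, [("A", 1.0), ("B", 1.0)] * 20)

# New task: region "A" is unchanged, region "B" now has a Q-value of 5.0.
relearned = replay(leaves, [("A", 1.0), ("B", 5.0)] * 10)
print(relearned, leaves["A"].value, leaves["B"].value)
```

In this toy setting only region "B" is relearned, while the estimate for region "A" carries over from the source task untouched. In the paper's setting, a restructuring regression tree would play the role of the fixed `leaves` dictionary.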
