Using Cases as Heuristics in Reinforcement Learning: A Transfer Learning Application

In this paper we propose combining three AI techniques to speed up a Reinforcement Learning algorithm in a Transfer Learning problem: Case-based Reasoning, Heuristically Accelerated Reinforcement Learning and Neural Networks. To do so, we introduce a new algorithm, called L3, which works in three stages: in the first stage, it uses Reinforcement Learning to learn how to perform one task and stores the optimal policy for this problem as a case base; in the second stage, it uses a Neural Network to map actions from one domain to actions in the other domain; and in the third stage, it uses the case base learned in the first stage as heuristics to speed up learning in a related, but different, task. The RL algorithm used in the first stage is Q-learning, and the one used in the third stage is the recently proposed Case-based Heuristically Accelerated Q-learning. A set of empirical evaluations was conducted on transferring learning between two domains, the Acrobot and the RoboCup 3D: the policy learned while solving the Acrobot problem is transferred and used to speed up the learning of stability policies for a humanoid robot in the RoboCup 3D simulator. The results show that the use of this algorithm can lead to a significant improvement in the performance of the agent.
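The three stages described above can be illustrated with a minimal sketch on a toy problem. Everything in it is an assumption made for illustration and is not taken from the paper: the chain-walk task stands in for the Acrobot and RoboCup 3D domains, the hand-written `action_map` dictionary stands in for the Neural Network of the second stage, and the heuristic is assumed to bias greedy action selection additively, as `Q(s, a) + xi * H(s, a)`. The names `make_chain`, `q_learning`, `case_base`, `action_map` and `xi` are hypothetical.

```python
import random

def make_chain(n_states, forward):
    """Toy task (assumption): walk a chain, reward 1.0 on reaching the last state.
    Any action other than `forward` moves one state back."""
    def step(state, action):
        nxt = min(state + 1, n_states - 1) if action == forward else max(state - 1, 0)
        done = nxt == n_states - 1
        return nxt, (1.0 if done else 0.0), done
    return step

def q_learning(n_states, actions, step, episodes, heuristic=None, xi=1.0,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning. If `heuristic` is given, greedy choices maximise
    Q(s, a) + xi * H(s, a) (assumed form of the heuristic bias)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    h = heuristic or {}
    lengths = []
    for _ in range(episodes):
        state, steps = 0, 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(actions)  # exploration
            else:
                # Greedy choice with the heuristic bonus; ties broken at random.
                action = max(actions, key=lambda a: (
                    q[(state, a)] + xi * h.get((state, a), 0.0), rng.random()))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in actions)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state, steps = nxt, steps + 1
            if done:
                break
        lengths.append(steps)
    return q, lengths

N = 6

# Stage 1: learn the source task and store its greedy policy as a case base.
src_actions = ["left", "right"]
q_src, _ = q_learning(N, src_actions, make_chain(N, "right"), episodes=300)
case_base = {s: max(src_actions, key=lambda a: q_src[(s, a)]) for s in range(N - 1)}

# Stage 2: map source actions to target actions.
# A fixed lookup table is used here in place of a learned Neural Network.
action_map = {"left": "backward", "right": "forward"}

# Stage 3: use the mapped cases as a heuristic while learning the target task.
heuristic = {(s, action_map[a]): 1.0 for s, a in case_base.items()}
tgt_actions = ["backward", "forward"]
tgt_step = make_chain(N, "forward")
_, plain = q_learning(N, tgt_actions, tgt_step, episodes=20)
_, guided = q_learning(N, tgt_actions, tgt_step, episodes=20, heuristic=heuristic)
print("total steps without heuristic:", sum(plain))
print("total steps with heuristic:   ", sum(guided))
```

In this sketch the heuristic only changes which action is selected; the Q-value update itself is the ordinary Q-learning update, so the transferred cases steer exploration without overwriting what is learned in the target task.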
