Paper information - Using Cases as Heuristics in Reinforcement Learning: A Transfer Learning Application

In this paper we propose combining three AI techniques to speed up a Reinforcement Learning algorithm in a Transfer Learning problem: Case-based Reasoning, Heuristically Accelerated Reinforcement Learning, and Neural Networks. To do so, we propose a new algorithm, called L3, which works in three stages: in the first stage, it uses Reinforcement Learning to learn how to perform one task and stores the optimal policy for this problem as a case base; in the second stage, it uses a Neural Network to map actions from one domain to actions in the other domain; and in the third stage, it uses the case base learned in the first stage as heuristics to speed up learning in a related, but different, task. The RL algorithm used in the first stage is Q-learning, and the one used in the third stage is the recently proposed Case-based Heuristically Accelerated Q-learning. A set of empirical evaluations was conducted on transferring learning between two domains, the Acrobot and the RoboCup 3D: the policy learned while solving the Acrobot problem is transferred and used to speed up the learning of stability policies for a humanoid robot in the RoboCup 3D simulator. The results show that the use of this algorithm can lead to a significant improvement in the performance of the agent.
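The abstract does not spell out the update rules, so the following is only a minimal sketch of the idea behind the third stage, assuming a tabular setting. It shows how an action suggested by a retrieved case could bias action selection through a heuristic term H(s, a) while the Q-learning update itself stays unchanged. The function names, the parameters `xi` and `eta`, and the exact form of the heuristic are illustrative assumptions, not taken from the paper.

```python
import random


def heuristic_from_case(q_row, suggested_action, eta=1.0):
    """Build H(s, .) so the case-suggested action outranks the greedy one.

    Assumed form: H is zero everywhere except at the suggested action,
    where it closes the gap to the best Q-value and adds a margin eta.
    """
    best = max(q_row)
    h_row = [0.0] * len(q_row)
    h_row[suggested_action] = best - q_row[suggested_action] + eta
    return h_row


def select_action(q_row, h_row, xi=1.0, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore uniformly
    scores = [q + xi * h for q, h in zip(q_row, h_row)]
    return scores.index(max(scores))  # exploit, biased by the heuristic


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update; the heuristic never alters Q."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


# Hypothetical two-state, two-action example.
Q = [[1.0, 0.5], [0.0, 0.0]]
H0 = heuristic_from_case(Q[0], suggested_action=1)
action = select_action(Q[0], H0, epsilon=0.0)
q_update(Q, 0, action, r=1.0, s_next=1)
```

In this sketch the heuristic only influences which action is tried; if the transferred suggestion is poor, the ordinary Q-learning update can still correct the value estimates over time.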

Reinaldo A. C. Bianchi | Ramón López de Mántaras | Luiz A. Celiberto | Jackson Paul Matsuura
