Paper information - Using Transfer Learning to Speed-Up Reinforcement Learning: A Cased-Based Approach

Reinforcement Learning (RL) is a well-known technique for solving problems in which agents must act successfully in an unknown environment, learning through trial and error. However, the time an agent needs to learn makes the technique too inefficient for applications with real-world demands. This paper investigates the use of Transfer Learning (TL) between agents to speed up the well-known Q-learning algorithm. The approach presented here uses cases from a case base as heuristics to speed up Q-learning, combining Case-Based Reasoning (CBR) and Heuristically Accelerated Reinforcement Learning (HARL) techniques. A set of empirical evaluations was conducted in the Mountain Car problem domain, where the actions learned while solving the 2D version of the problem are used to speed up the learning of policies for its 3D version. The experiments compared the Q-learning algorithm, the HAQL Heuristically Accelerated Reinforcement Learning (HARL) algorithm, and the TL-HAQL algorithm proposed here. The results show that using a case base for transfer learning can significantly improve the agent's performance, making it learn faster than with either RL or HARL methods alone.

Jackson P. Matsuura | Luiz A. Celiberto Jr. | Ramon Lopez de Mantaras | Reinaldo A.C. Bianchi
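
The abstract describes using cases from a case base as heuristics that bias Q-learning's action choice. The following is a minimal Python sketch of that general idea, not the authors' implementation: the function names, the three-action encoding, the parameters `xi` and `eta`, and the dictionary standing in for a case base are all assumptions made for illustration. It assumes the commonly described heuristically accelerated scheme in which the greedy choice maximizes Q(s, a) + xi * H(s, a), while the Q-learning update itself stays unchanged.

```python
import random
from collections import defaultdict

# Assumed action encoding for illustration: push left, no push, push right.
ACTIONS = [0, 1, 2]


def heuristic_from_case(q, state, suggested_action, eta=1.0):
    """Build H(state, .) so that the case's suggested action wins the greedy choice.

    The suggested action gets just enough bonus to exceed the current best
    Q-value by eta; every other action gets zero.
    """
    best = max(q[(state, a)] for a in ACTIONS)
    return {
        a: (best - q[(state, a)] + eta) if a == suggested_action else 0.0
        for a in ACTIONS
    }


def select_action(q, state, h, epsilon=0.1, xi=1.0, rng=random):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)] + xi * h.get(a, 0.0))


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Standard one-step Q-learning update; the heuristic never modifies Q directly."""
    target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (target - q[(state, action)])


# Usage: a hypothetical case base mapping a state to the action a source task suggests.
q = defaultdict(float)
q[("s0", 0)] = 0.5
case_base = {"s0": 2}

h = heuristic_from_case(q, "s0", case_base["s0"])
action = select_action(q, "s0", h, epsilon=0.0)  # greedy, so the heuristic decides
q_update(q, "s0", action, reward=-1.0, next_state="s1")
```

Because the heuristic only affects action selection, a misleading case can slow exploration but leaves the learned Q-values to be corrected by ordinary Q-learning updates. How cases are retrieved and mapped from the 2D to the 3D task is not specified in the abstract and is not modeled here.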
