Heuristically Accelerated Reinforcement Learning by Means of Case-Based Reasoning and Transfer Learning

Reinforcement Learning (RL) is a well-known technique for learning solutions to control problems from an agent's interactions with its domain. However, RL is known to be inefficient in real-world problems, where the state space and the set of actions grow quickly. Recently, heuristics, case-based reasoning (CBR), and transfer learning have been used as tools to accelerate the RL process. This paper investigates a class of algorithms called Transfer Learning Heuristically Accelerated Reinforcement Learning (TLHARL), which uses CBR as a heuristic within a transfer learning setting to accelerate RL. The main contributions of this work are the proposal of a new TLHARL algorithm based on the traditional RL algorithm Q(λ) and the application of TLHARL to two distinct real-robot domains: robot soccer with small-scale robots and humanoid-robot stability learning. Experimental results show that the proposed method led to a significant improvement in the learning rate in both domains.
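
The abstract does not spell out how a heuristic enters the learning loop. As a rough illustration only, the sketch below assumes the form often used for heuristically accelerated methods: the agent picks the action that maximizes Q(s, a) + ξ·H(s, a), where H is a heuristic table (here imagined as filled from retrieved cases) and ξ weights its influence. The function name `select_action`, the parameters `xi` and `epsilon`, and the toy values are all hypothetical and are not taken from the paper.

```python
import random

def select_action(q_row, h_row, xi=1.0, epsilon=0.0, rng=random):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a) for a single state.

    q_row and h_row hold one value per action (assumed layout).
    """
    # Explore with probability epsilon (never when epsilon is 0.0).
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    # Otherwise exploit: the heuristic only biases which action is chosen.
    scores = [q + xi * h for q, h in zip(q_row, h_row)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy values for one state with three actions (illustrative only).
q_row = [0.0, 0.2, 0.1]
h_row = [0.0, 0.0, 0.5]  # a heuristic that favors action 2

plain = select_action(q_row, [0.0, 0.0, 0.0])  # no heuristic
guided = select_action(q_row, h_row)           # heuristic applied
```

In this sketch the heuristic changes only which action is selected; the value estimates themselves would still be updated by the underlying RL rule, which is not shown here.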
