Paper Information - Heuristically Accelerated Reinforcement Learning by Means of Case-Based Reasoning and Transfer Learning

Reinforcement Learning (RL) is a well-known technique for learning solutions to control problems from an agent's interactions with its domain. However, RL is known to be inefficient in real-world problems, where the state space and the set of actions grow quickly. Recently, heuristics, case-based reasoning (CBR), and transfer learning have been used as tools to accelerate the RL process. This paper investigates a class of algorithms called Transfer Learning Heuristically Accelerated Reinforcement Learning (TLHARL), which uses CBR as a heuristic within a transfer learning setting to accelerate RL. The main contributions of this work are the proposal of a new TLHARL algorithm based on the traditional RL algorithm Q(λ) and the application of TLHARL to two distinct real-robot domains: robot soccer with small-scale robots and humanoid-robot stability learning. Experimental results show that the proposed method led to a significant improvement in the learning rate in both domains.

Reinaldo A. C. Bianchi | Ramón López de Mántaras | Paulo E. Santos | Luiz A. Celiberto | Isaac J. da Silva
