论文信息 - Heuristically Accelerated Q-Learning: A New Approach to Speed Up Reinforcement Learning

Heuristically Accelerated Q-Learning: A New Approach to Speed Up Reinforcement Learning

This work presents a new algorithm, called Heuristically Accelerated Q–Learning (HAQL), that allows the use of heuristics to speed up the well-known Reinforcement Learning algorithm Q–learning. A heuristic function \(\mathcal{H}\) that influences the choice of the actions characterizes the HAQL algorithm. The heuristic function is strongly associated with the policy: it indicates that an action must be taken instead of another. This work also proposes an automatic method for the extraction of the heuristic function \(\mathcal{H}\) from the learning process, called Heuristic from Exploration. Finally, experimental results shows that even a very simple heuristic results in a significant enhancement of performance of the reinforcement learning algorithm.

Reinaldo A. C. Bianchi | Anna Helena Reali Costa | Carlos H. C. Ribeiro | C. Ribeiro

[1] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.

[2] Peter Dayan,et al. Structure in the Space of Value Functions , 2002, Machine Learning.

[3] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[4] Ulrich Nehmzow. Mobile Robotics: A Practical Introduction , 2003 .

[5] Chris Drummond,et al. Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks , 2011, J. Artif. Intell. Res..

[6] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .

[7] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..

[8] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.

[9] G. Theraulaz,et al. Inspiration for optimization from social insect behaviour , 2000, Nature.

[10] Nils J. Nilsson,et al. A Formal Basis for the Heuristic Determination of Minimum Cost Paths , 1968, IEEE Trans. Syst. Sci. Cybern..

[11] Luca Maria Gambardella,et al. Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem , 1995, ICML.