Paper Information - Control-Tutored Reinforcement Learning

Control-Tutored Reinforcement Learning

We introduce a control-tutored reinforcement learning (CTRL) algorithm. The idea is to enhance tabular learning algorithms so as to improve exploration of the state space and substantially reduce learning times by leveraging limited knowledge of the plant, encoded in a model-based tutoring control strategy. We illustrate the benefits and effectiveness of the approach on a herding problem, in which one or more agents must herd a set of free-roving target agents in the plane and contain them within a goal region.
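The abstract does not say how the tutor enters the learning loop, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes tabular Q-learning in which exploratory steps defer to a model-based tutor with some probability instead of always acting at random. The one-dimensional plant, the reward values, the function names (`step`, `tutor`, `select_action`, `train`, `greedy_policy`) and the `tutor_prob` parameter are all hypothetical and chosen for illustration.

```python
import random

# Hypothetical toy plant: an agent on a line of 11 cells must reach the last one.
N_STATES = 11
GOAL = 10
ACTIONS = (-1, +1)  # move left / move right


def step(state, action):
    """Apply an action, clip to the boundaries, and return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else -0.01
    return nxt, reward, nxt == GOAL


def tutor(state):
    """Assumed tutor: a crude model-based rule that always steers toward the goal."""
    return +1 if state < GOAL else -1


def select_action(q, state, epsilon, tutor_prob, rng):
    """Exploit the Q-table, or explore; when exploring, follow the tutor with probability tutor_prob."""
    if rng.random() >= epsilon:
        values = [q[(state, a)] for a in ACTIONS]
        best = max(values)
        # Break ties at random so an untrained table does not favour one action.
        return rng.choice([a for a, v in zip(ACTIONS, values) if v == best])
    if rng.random() < tutor_prob:
        return tutor(state)
    return rng.choice(ACTIONS)


def train(episodes=200, alpha=0.5, gamma=0.95, epsilon=0.3,
          tutor_prob=0.8, max_steps=200, seed=0):
    """Standard tabular Q-learning update; only the exploration rule is tutored."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = select_action(q, state, epsilon, tutor_prob, rng)
            nxt, reward, done = step(state, action)
            bootstrap = 0.0 if done else gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + bootstrap - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Return the greedy action for every non-goal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)])
            for s in range(N_STATES) if s != GOAL}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Setting `tutor_prob=0.0` reduces the sketch to plain epsilon-greedy Q-learning, which gives a simple baseline for comparing how quickly each variant reaches the goal in early episodes.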
