Paper information - Reinforcement Learning of the Prediction Horizon in Model Predictive Control

Model predictive control (MPC) is a powerful trajectory optimization control technique capable of controlling complex nonlinear systems while respecting system constraints and ensuring safe operation. These capabilities come at the cost of high online computational complexity, the requirement of an accurate model of the system dynamics, and the need to tune the controller's parameters to the specific control application. The main tunable parameter affecting the computational complexity is the prediction horizon length, which controls how far into the future the MPC predicts the system response and thus evaluates the optimality of its computed trajectory. A longer horizon generally improves control performance but requires an increasingly powerful computing platform, which rules out certain control applications. The sensitivity of performance to the prediction horizon length varies over the state space, which motivated adaptive horizon model predictive control (AHMPC), where the prediction horizon is adapted according to some criteria. In this paper we propose to learn the optimal prediction horizon as a function of the state using reinforcement learning (RL). We show how the RL learning problem can be formulated and test our method on two control tasks, showing clear improvements over the fixed horizon MPC scheme while requiring only minutes of learning.
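The abstract does not give the details of the RL formulation, so the following is only a minimal sketch of how horizon selection could be cast as an RL problem, not the paper's method. It assumes a toy scalar linear system, uses a clipped finite-horizon LQ controller as a stand-in for an MPC solver, and swaps in tabular Q-learning over a coarsely binned state in place of whatever algorithm the authors used. The reward trades the control stage cost against a per-step computation penalty. All names and constants (`HORIZONS`, `COMPUTE_WEIGHT`, `mpc_input`, `train`, and so on) are hypothetical.

```python
import random

# Toy problem data; every constant here is an illustrative assumption.
A, B = 1.2, 1.0             # scalar dynamics x+ = A*x + B*u
Q_COST, R_COST = 1.0, 0.1   # quadratic stage cost weights
U_MAX = 1.0                 # input bound |u| <= U_MAX
HORIZONS = [1, 2, 5, 10]    # candidate prediction horizons (the RL actions)
COMPUTE_WEIGHT = 0.02       # assumed penalty per prediction step
N_BINS, X_MAX = 5, 2.0      # coarse state discretization


def mpc_input(x, horizon):
    """Stand-in for an MPC solve: first input of a finite-horizon LQ problem,
    obtained by a backward Riccati recursion and clipped to the input bound."""
    p = Q_COST  # terminal cost weight
    k = 0.0
    for _ in range(horizon):
        k = (A * B * p) / (R_COST + B * B * p)
        p = Q_COST + A * A * p - A * B * p * k
    return max(-U_MAX, min(U_MAX, -k * x))


def bucket(x):
    """Map the state to a bin index based on its magnitude."""
    return min(N_BINS - 1, int(abs(x) / X_MAX * N_BINS))


def train(episodes=200, steps=20, alpha=0.1, gamma=0.95, eps=0.2, seed=0):
    """Tabular epsilon-greedy Q-learning where the action is the horizon."""
    rng = random.Random(seed)
    q = [[0.0] * len(HORIZONS) for _ in range(N_BINS)]
    for _ in range(episodes):
        x = rng.uniform(-X_MAX, X_MAX)
        for _ in range(steps):
            s = bucket(x)
            if rng.random() < eps:
                a = rng.randrange(len(HORIZONS))
            else:
                a = max(range(len(HORIZONS)), key=lambda i: q[s][i])
            u = mpc_input(x, HORIZONS[a])
            # Reward: negative stage cost minus a computation penalty.
            reward = -(Q_COST * x * x + R_COST * u * u) - COMPUTE_WEIGHT * HORIZONS[a]
            x = A * x + B * u
            s_next = bucket(x)
            q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])
    return q


def greedy_horizon(q, x):
    """Horizon with the highest learned value for the bin containing x."""
    s = bucket(x)
    return HORIZONS[max(range(len(HORIZONS)), key=lambda i: q[s][i])]
```

Calling `q = train()` and then `greedy_horizon(q, x)` at each control step would give a state-dependent horizon for this toy setup. A realistic version would replace `mpc_input` with a constrained nonlinear MPC solver and the table with a function approximator.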
