Model-free Learning to Avoid Constraint Violations: An Explicit Reference Governor Approach

Constraints such as thermal, power, traction, and rollover limits, together with actuator range and rate limits, are ubiquitous in advanced ground vehicles, propulsion systems, and their components, especially as these systems are downsized. Such vehicles and systems will operate in unknown environments where degradation or damage must be recognized and avoided. This paper proposes a model-free learning algorithm for an explicit reference governor (ERG), a scheme that modifies the setpoint commands sent to a nominal closed-loop system. The algorithm adjusts the ERG parameters based on constraint violations observed during a learning phase so that, once a sufficiently informative learning phase is completed, violations of the pre-specified constraints are avoided. Theoretical properties of the algorithm are analyzed, and several examples illustrate its effectiveness.
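The abstract does not state the update law, so the following is only a toy sketch of the general idea and not the paper's algorithm. It assumes a hypothetical underdamped second-order closed loop, a single output constraint `x <= x_max`, an ERG-style governor whose reference rate is proportional to a margin `theta * (x_max - v)**2 - ((x - v)**2 + xd**2)`, and a hypothetical learning rule that halves `theta` after every episode in which a violation is observed. The names `simulate_episode`, `learn_threshold`, `theta`, and all numerical values are illustrative assumptions. The learner treats the plant as a black box and only sees whether the constraint was violated.

```python
def simulate_episode(theta, r=1.0, x_max=1.1, kappa=5.0,
                     omega=2.0, zeta=0.2, dt=1e-3, horizon=10.0):
    """Simulate one setpoint change from rest to r (forward Euler).

    Returns (peak output, final applied reference). The plant model below
    is an assumed stand-in for an unknown closed-loop system; the learner
    never inspects it, it only observes the resulting peak.
    """
    x, xd, v = 0.0, 0.0, 0.0  # output, output rate, applied reference
    peak = x
    for _ in range(int(horizon / dt)):
        # Generic quadratic measure of how far the state is from the
        # equilibrium associated with the applied reference v (assumed form).
        value = (x - v) ** 2 + xd ** 2
        # Learned threshold: a larger theta makes the governor less cautious.
        threshold = theta * (x_max - v) ** 2
        margin = max(threshold - value, 0.0)
        # ERG-style law: move v toward r at a rate scaled by the margin.
        v += dt * kappa * margin * (r - v)
        # Assumed underdamped second-order closed loop tracking v.
        xdd = -2.0 * zeta * omega * xd - omega ** 2 * (x - v)
        x, xd = x + dt * xd, xd + dt * xdd
        peak = max(peak, x)
    return peak, v


def learn_threshold(theta0=50.0, shrink=0.5, max_episodes=20, x_max=1.1):
    """Shrink theta after each violating episode (hypothetical rule).

    Stops at the first episode without a violation. If max_episodes is
    exhausted, the returned theta has not been verified.
    """
    theta = theta0
    history = []  # (theta, peak, violated) for each learning episode
    for _ in range(max_episodes):
        peak, _ = simulate_episode(theta, x_max=x_max)
        violated = peak > x_max
        history.append((theta, peak, violated))
        if not violated:
            break
        theta *= shrink
    return theta, history


if __name__ == "__main__":
    theta, history = learn_threshold()
    for th, peak, violated in history:
        print(f"theta={th:.4f}  peak={peak:.4f}  violated={violated}")
```

Running the script prints one line per learning episode. In this toy setup the early episodes, where `theta` is large, let the reference jump almost directly to `r` and the output overshoots the limit; the loop ends at the first `theta` for which the observed peak stays at or below `x_max`. Halving is a deliberately crude choice that trades tracking speed for simplicity.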
