Simple Policy Evaluation for Data-Rich Iterative Tasks

A data-based policy for iterative control tasks is presented. The proposed strategy is model-free and can be applied whenever safe input and state trajectories of a system performing an iterative task are available. These trajectories, together with a user-defined cost function, are used to construct a piecewise affine approximation of the value function. The approximated value function is then used to evaluate the control policy by solving a linear program. We show that, for linear systems subject to convex costs and constraints, the proposed strategy guarantees closed-loop constraint satisfaction and performance bounds on the closed-loop trajectory. We evaluate the proposed strategy in simulations and in experiments, the latter carried out on the Berkeley Autonomous Race Car (BARC) platform. We show that the proposed strategy reduces the computation time by one order of magnitude while achieving the same performance as our model-based control algorithm.
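The core computation described above can be illustrated with a minimal sketch. Assuming stored state trajectories `X`, the corresponding inputs `U`, and realized costs-to-go `J` (all names hypothetical, and this is only one plausible instantiation of the idea, not the paper's exact formulation), the piecewise affine value-function approximation at a query state is the cheapest convex combination of stored states that reproduces it, which is a linear program; the associated input can then be taken as the same convex combination of the stored inputs:

```python
import numpy as np
from scipy.optimize import linprog

def approx_value_and_input(x_query, X, U, J):
    """Evaluate a PWA value-function approximation at x_query via an LP.

    X : (n_pts, n_x) stored states from previous iterations
    U : (n_pts, n_u) inputs applied at those states
    J : (n_pts,)     realized costs-to-go of the stored trajectories

    Solves  min_lambda  J @ lambda
            s.t.        X.T @ lambda = x_query,  sum(lambda) = 1,  lambda >= 0,
    i.e. the query state is expressed as a convex combination of stored
    states, weighted to minimize the interpolated cost-to-go.
    """
    n_pts = X.shape[0]
    # Equality constraints: convex-combination matching and simplex condition.
    A_eq = np.vstack([X.T, np.ones((1, n_pts))])
    b_eq = np.concatenate([np.asarray(x_query, dtype=float), [1.0]])
    res = linprog(c=J, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n_pts, method="highs")
    if not res.success:
        raise ValueError("x_query lies outside the convex hull of stored states")
    lam = res.x
    # One simple data-driven policy choice: blend the stored inputs
    # with the optimal LP weights.
    return res.fun, lam @ U

# Toy example: three stored 2-D states with costs-to-go and scalar inputs.
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
U = np.array([[0.0], [1.0], [-1.0]])
J = np.array([0.0, 2.0, 2.0])
value, u = approx_value_and_input([1.0, 0.0], X, U, J)
```

For this toy query state the LP is forced to the unique weights (0.5, 0.5, 0), giving an interpolated value of 1.0 and a blended input of 0.5; in general the LP has one variable per stored data point, which is what keeps online evaluation cheap compared with solving a model-based optimal control problem.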
