Improving model reference control performance using model-free VRFT and Q-learning

This paper proposes a combination of two model-free controller tuning techniques, linear Virtual Reference Feedback Tuning (VRFT) and nonlinear state-feedback Q-learning, referred to as a mixed VRFT-Q learning approach. VRFT is first applied to find a stabilizing feedback controller in a model reference tracking setting, using only input-output (IO) experimental data from the process. Reinforcement Q-learning is then applied in the same setting, using input-state experimental data collected in closed loop with a perturbed version of the stabilizing VRFT controller; this ensures good exploration of the state-action space and avoids data collection under non-stabilizing control. The Q-learning controller is subsequently learned from the input-state data in a batch neural fitted framework. The mixed VRFT-Q learning approach is validated on a case study dealing with the position control of a two-degree-of-freedom, open-loop stable, Multi-Input Multi-Output (MIMO) aerodynamic system. Experimental results show that the Q-learning controllers improve control performance over the initial VRFT controllers.
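As background on the first stage, linear VRFT reduces controller tuning to a single least-squares fit: the recorded plant output is filtered through the inverse of the reference model to form a virtual reference, and the controller parameters are chosen so that the controller maps the resulting virtual error back to the recorded plant input. The following is a minimal sketch of that idea, not the paper's implementation; the function names, the biproper first-order reference model, and the regressor interface are illustrative assumptions.

```python
import numpy as np

def _lfilter(b, a, x):
    """Minimal IIR difference-equation filter (zero initial conditions)."""
    y = np.zeros(len(x))
    for k in range(len(x)):
        acc = sum(b[i] * x[k - i] for i in range(len(b)) if k - i >= 0)
        acc -= sum(a[j] * y[k - j] for j in range(1, len(a)) if k - j >= 0)
        y[k] = acc / a[0]
    return y

def vrft_fit(u, y, m_num, m_den, basis):
    """One-shot linear VRFT from a single open-loop I/O data set (u, y).

    m_num, m_den: numerator/denominator of the reference model M(z^-1);
                  m_num[0] must be nonzero so M can be inverted causally.
    basis:        maps the virtual-error sequence to a regressor matrix
                  whose columns span the linearly parameterized controller.
    """
    # Virtual reference r_v such that y = M(z) r_v, i.e. r_v = M^-1(z) y:
    r_v = _lfilter(m_den, m_num, y)
    e_v = r_v - y                      # virtual tracking error
    Phi = basis(e_v)                   # controller regressors (e.g. P, I terms)
    # Fit theta so that C(e_v; theta) reproduces the recorded input u:
    theta, *_ = np.linalg.lstsq(Phi, u, rcond=None)
    return theta
```

For a noiseless static-gain plant y = g*u and reference model M(z) = (1-a)/(1 - a z^-1), the ideal controller is a pure integrator with gain (1-a)/(a*g), which this fit recovers exactly when the basis is the cumulative sum of the virtual error.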
