Improving model reference control performance using model-free VRFT and Q-learning

This paper proposes a combination of two model-free controller tuning techniques, linear Virtual Reference Feedback Tuning (VRFT) and nonlinear state-feedback Q-learning, referred to as a mixed VRFT-Q learning approach. VRFT is first applied to find a stabilizing feedback controller in a model reference tracking setting, using only input-output (IO) experimental data from the process. Reinforcement Q-learning is then applied in the same setting, using input-state experimental data collected in closed loop with a perturbed version of the stabilizing VRFT controller; this ensures good exploration of the state-action space and avoids data collection under non-stabilizing control. The Q-learning controller is subsequently learned from the input-state data in a batch neural fitted framework. The mixed VRFT-Q learning approach is validated on a case study dealing with the position control of a two-degree-of-freedom, open-loop stable, Multi-Input Multi-Output (MIMO) aerodynamic system. Experimental results show that the Q-learning controllers improve control performance over the initial VRFT controllers.
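As background on the first stage, linear VRFT reduces controller tuning to a single least-squares fit: the recorded plant output is filtered through the inverse of the reference model to form a virtual reference, and the controller parameters are chosen so that the controller maps the resulting virtual error back to the recorded plant input. The following is a minimal sketch of that idea, not the paper's implementation; the function names, the biproper first-order reference model, and the regressor interface are illustrative assumptions.

```python
import numpy as np

def _lfilter(b, a, x):
    """Minimal IIR difference-equation filter (zero initial conditions)."""
    y = np.zeros(len(x))
    for k in range(len(x)):
        acc = sum(b[i] * x[k - i] for i in range(len(b)) if k - i >= 0)
        acc -= sum(a[j] * y[k - j] for j in range(1, len(a)) if k - j >= 0)
        y[k] = acc / a[0]
    return y

def vrft_fit(u, y, m_num, m_den, basis):
    """One-shot linear VRFT from a single open-loop I/O data set (u, y).

    m_num, m_den: numerator/denominator of the reference model M(z^-1);
                  m_num[0] must be nonzero so M can be inverted causally.
    basis:        maps the virtual-error sequence to a regressor matrix
                  whose columns span the linearly parameterized controller.
    """
    # Virtual reference r_v such that y = M(z) r_v, i.e. r_v = M^-1(z) y:
    r_v = _lfilter(m_den, m_num, y)
    e_v = r_v - y                      # virtual tracking error
    Phi = basis(e_v)                   # controller regressors (e.g. P, I terms)
    # Fit theta so that C(e_v; theta) reproduces the recorded input u:
    theta, *_ = np.linalg.lstsq(Phi, u, rcond=None)
    return theta
```

For a noiseless static-gain plant y = g*u and reference model M(z) = (1-a)/(1 - a z^-1), the ideal controller is a pure integrator with gain (1-a)/(a*g), which this fit recovers exactly when the basis is the cumulative sum of the virtual error.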
