Model-Free Predictive Control of Nonlinear Processes Based on Reinforcement Learning

Abstract Model predictive control (MPC) is a model-based control philosophy in which the current control action is obtained by on-line optimization of an objective function. Owing to a plethora of research results and industrial process-control applications, MPC is by now considered a mature technology; the models employed, however, are typically linear or piecewise linear. For nonlinear processes, the difficulties lie in obtaining a good nonlinear model and in the excessive computational burden of the on-line control optimization. The proposed framework, named model-free predictive control (MFPC), addresses both of these issues of conventional MPC. Model-free reinforcement learning formulates the predictive control problem with a control horizon of length one, yet takes decisions based on infinite-horizon information. To facilitate generalization over continuous state and action spaces, a fuzzy inference system is used as a function approximator in conjunction with Q-learning. An empirical study on a continuous stirred tank reactor (CSTR) shows that the MFPC reinforcement learning framework is efficient and strongly robust.
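To make the combination of Q-learning and fuzzy approximation concrete, the following is a minimal sketch of fuzzy Q-learning over continuous state and action spaces, under stated assumptions: triangular membership functions on a one-dimensional state, a small set of candidate crisp actions per rule, and a toy first-order plant standing in for the CSTR. The rule centers, action set, learning rates, and plant dynamics are illustrative choices, not the paper's implementation.

```python
"""Minimal fuzzy Q-learning sketch (illustrative assumptions throughout).
Each fuzzy rule holds q-values for a set of candidate crisp actions; the
applied continuous action blends the per-rule choices by firing strength."""

import numpy as np

rng = np.random.default_rng(0)

# Fuzzy inference system over a 1-D state (e.g., temperature error); the
# MF centers and candidate action grid below are assumed, not from the paper.
CENTERS = np.linspace(-1.0, 1.0, 7)         # triangular MF centers
WIDTH = CENTERS[1] - CENTERS[0]
ACTIONS = np.linspace(-1.0, 1.0, 5)         # candidate crisp actions per rule

q = np.zeros((len(CENTERS), len(ACTIONS)))  # one q-value per (rule, action)

def firing_strengths(s):
    """Normalized triangular membership degrees of state s in each rule."""
    phi = np.maximum(0.0, 1.0 - np.abs(s - CENTERS) / WIDTH)
    return phi / (phi.sum() + 1e-12)

def act(s, eps=0.1):
    """Per-rule epsilon-greedy action choice, blended into one crisp action."""
    phi = firing_strengths(s)
    idx = np.array([rng.integers(len(ACTIONS)) if rng.random() < eps
                    else np.argmax(q[i]) for i in range(len(CENTERS))])
    a = float(phi @ ACTIONS[idx])           # continuous action from fuzzy blend
    q_sa = float(phi @ q[np.arange(len(CENTERS)), idx])
    return a, idx, phi, q_sa

def q_star(s):
    """Greedy Q-value of state s under the fuzzy approximator."""
    return float(firing_strengths(s) @ q.max(axis=1))

def update(idx, phi, q_sa, r, s_next, alpha=0.1, gamma=0.95):
    """One-step Q-learning update, distributed over the fired rules."""
    td = r + gamma * q_star(s_next) - q_sa
    q[np.arange(len(CENTERS)), idx] += alpha * td * phi

# Toy closed loop: drive a scalar state to zero (stands in for the CSTR).
s = 0.8
for step in range(5000):
    a, idx, phi, q_sa = act(s)
    s_next = np.clip(0.9 * s + 0.2 * a + 0.01 * rng.standard_normal(), -1, 1)
    r = -abs(s_next)                        # reward penalizes setpoint deviation
    update(idx, phi, q_sa, r, s_next)
    # restart the episode from a random state once near the setpoint
    s = s_next if abs(s_next) > 0.05 else rng.uniform(-1, 1)
```

Note how this realizes the abstract's "control horizon of length one with infinite-horizon information": at every step only the immediate action is optimized, but the learned Q-function already discounts the entire future cost, so no multi-step model rollout is required.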
