Paper information - Robust Dual Control of Batch Processes with Parametric Uncertainty using Proximal Policy Optimization

This study presents a robust dual control method for batch processes under parametric uncertainty. Proximal policy optimization (PPO), a policy gradient reinforcement learning algorithm, is employed to construct an implicit dual controller in a computationally tractable way. The proposed method copes robustly and actively with the uncertainties encountered over a repeated sequence of batch operations in two ways: it incorporates a penalty term for constraint violation into the reward function, and it accounts for the effect of control inputs on future uncertainty. An application to a bioethanol fermentation process demonstrates the effectiveness of the proposed control strategy. The results show that the robust dual controller exhibits an active learning feature that improves overall performance relative to a certainty-equivalence-based approach.
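The two ingredients named in the abstract, a reward with a constraint-violation penalty and a PPO policy update, can be illustrated with a minimal sketch. The abstract does not give the paper's actual reward function or hyperparameters, so the function names, the constraint convention `g <= 0`, the `penalty_weight` value, and the `clip_eps` value below are all illustrative assumptions. The clipped surrogate is the standard per-sample PPO objective, not a form specific to this paper.

```python
def penalized_reward(performance, constraint_values, penalty_weight=10.0):
    """Hypothetical reward: performance minus a weighted constraint penalty.

    Assumed convention: each g in constraint_values satisfies g <= 0 when
    the constraint holds, so only positive values are penalized.
    """
    violation = sum(max(0.0, g) for g in constraint_values)
    return performance - penalty_weight * violation


def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Standard per-sample PPO clipped surrogate.

    ratio is pi_new(a|s) / pi_old(a|s); the term is
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A).
    """
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # One constraint satisfied (-1.0), one violated by 0.5.
    print(penalized_reward(5.0, [-1.0, 0.5]))
    # A large probability ratio with positive advantage gets clipped.
    print(ppo_clipped_term(1.5, 2.0))
```

In this sketch, a larger `penalty_weight` pushes the learned policy toward more conservative operation, which is one plausible way a penalty term could encourage robustness. How the paper balances performance against constraint satisfaction is not stated in the abstract.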
