An improved reinforcement learning control strategy for batch processes

Batch processes are an important route for the agile manufacturing of high-value-added products, and they are typically difficult to control because of unknown disturbances, model-plant mismatches, and highly nonlinear characteristics. Traditional one-step reinforcement learning and neural networks have been applied to optimize and control batch processes. However, both approaches lack accuracy and robustness, which leads to unsatisfactory performance. To overcome these difficulties, this paper proposes a modified multi-step action Q-learning algorithm (MMSA) based on multi-step action Q-learning (MSA). In MSA, the time horizon is divided into periods containing the same number of time steps; within each period a single action, selected under a fixed greedy policy, is applied continuously. In MMSA, by contrast, the exploration and selection of actions follow an improved, varying greedy policy over the whole system time, which can improve the flexibility and speed of the learning algorithm. The proposed algorithm is applied to a highly nonlinear batch process and is shown to give better control performance than both traditional one-step reinforcement learning and MSA.
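The multi-step action idea described above can be illustrated with a minimal tabular sketch. This is not the paper's implementation: the abstract does not specify the form of the varying greedy policy, so a linearly decaying epsilon is assumed here as a stand-in, and the names `multi_step_q_learning`, `hold`, `vary_epsilon`, and the `toy_step` environment are all hypothetical.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def multi_step_q_learning(step, n_states, n_actions, horizon, hold,
                          episodes=200, alpha=0.1, gamma=0.95,
                          eps_start=0.5, eps_end=0.05,
                          vary_epsilon=True, seed=0):
    """Tabular Q-learning in which each chosen action is held for `hold` steps.

    vary_epsilon=False keeps a fixed exploration rate (MSA-like behaviour);
    vary_epsilon=True decays it linearly over episodes, an ASSUMED stand-in
    for the varying greedy policy of MMSA.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for ep in range(episodes):
        if vary_epsilon:
            frac = ep / max(episodes - 1, 1)
            epsilon = eps_start + (eps_end - eps_start) * frac
        else:
            epsilon = eps_start
        state, t = 0, 0  # every episode (batch) starts in state 0
        while t < horizon:
            action = epsilon_greedy(q[state], epsilon, rng)
            start, total, discount = state, 0.0, 1.0
            # Apply the same action for one whole period.
            for _ in range(min(hold, horizon - t)):
                state, reward = step(state, action, t)
                total += discount * reward
                discount *= gamma
                t += 1
            # Bootstrap from the state reached at the end of the period.
            bootstrap = discount * max(q[state]) if t < horizon else 0.0
            q[start][action] += alpha * (total + bootstrap - q[start][action])
    return q


def toy_step(state, action, t):
    """Hypothetical 5-level process: action 0/1/2 lowers/holds/raises the level.

    The reward penalises the distance from an arbitrary target level of 3.
    """
    next_state = min(4, max(0, state + action - 1))
    return next_state, -float(abs(next_state - 3))


if __name__ == "__main__":
    q_table = multi_step_q_learning(toy_step, n_states=5, n_actions=3,
                                    horizon=12, hold=2)
    for level, row in enumerate(q_table):
        print(level, [round(v, 2) for v in row])
```

Setting `hold=1` reduces the sketch to ordinary one-step Q-learning, which makes it easy to compare the three variants mentioned in the abstract on the same toy problem.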
