An intelligent non-optimality self-recovery method based on reinforcement learning with small data in big data era

Abstract Batch processes have attracted extensive attention as a crucial mode of manufacturing in modern industry. Although they are well equipped with control devices, batch processes may operate at a non-optimal status because of process disturbances, equipment aging, feedstock variations, and similar factors. As a result, the quality indices or economic benefits obtained under the pre-defined normal operating conditions may be undesirable. This phenomenon is referred to here as non-optimality. It is therefore essential to restore the process to its optimal status in a timely manner, without relying on accurate models or large amounts of data. To solve this problem, this study proposes an intelligent non-optimality self-recovery method based on reinforcement learning. First, the causal variables that lead to the non-optimality are identified by developing a status-degraded Fisher discriminant analysis that takes sparsity into account. Second, building on a self-learning mechanism, an intelligent self-recovery method is proposed that uses reinforcement learning to automatically adjust the set-points of the causal controlled variables. The self-recovery action is taken iteratively through an Actor-Critic structure with two neural networks. In this way, effective actions that require only a small amount of data are taken to restore the process to its expected status. Finally, the efficacy of the proposed method is illustrated with both a numerical case and a typical batch-type manufacturing process, namely the injection molding process.
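To give a rough feel for the first step, the sketch below ranks variables by a per-variable Fisher criterion (squared difference of class means divided by the summed within-class variances) and keeps only the top-ranked ones. This is a deliberately simplified, univariate stand-in and not the status-degraded sparse Fisher discriminant analysis developed in the paper; the function names `fisher_scores` and `select_causal`, the `keep` parameter, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def fisher_scores(optimal, degraded):
    """Per-variable Fisher criterion between two groups of samples.

    Each argument is a list of rows; each row holds one value per variable.
    Score = (difference of group means)^2 / (sum of group variances).
    """
    n_vars = len(optimal[0])
    scores = []
    for j in range(n_vars):
        a = [row[j] for row in optimal]
        b = [row[j] for row in degraded]
        between = (mean(a) - mean(b)) ** 2
        within = pvariance(a) + pvariance(b)
        if within > 0:
            scores.append(between / within)
        else:
            # Degenerate case: no spread in either group.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_causal(scores, keep=1):
    """Crude sparsity: return the indices of the `keep` highest-scoring variables."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(order[:keep])


# Hypothetical toy data: three variables, only the second one shifts
# between the optimal and the non-optimal (degraded) status.
optimal = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.1], [0.8, 4.9, 0.3], [1.0, 5.0, 0.2]]
degraded = [[1.1, 6.0, 0.2], [0.9, 6.2, 0.3], [1.0, 5.8, 0.1], [1.0, 6.0, 0.2]]

scores = fisher_scores(optimal, degraded)
causal = select_causal(scores, keep=1)
print("Fisher scores:", scores)
print("Candidate causal variable indices:", causal)
```

In the toy data only the second variable separates the two groups, so it is the one selected. In the method described above, the set-points of the identified causal controlled variables would then be adjusted by the Actor-Critic learner.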
