Off-Policy Q-Learning: Set-Point Design for Optimizing Dual-Rate Rougher Flotation Operational Processes

Rougher flotation is the first concentration stage in a flotation plant. It combines unit processes that operate on a fast time scale with economic performance measurements, known as operational indices, that are measured on a slower time scale. Optimizing the operational process of rougher flotation circuits is important because keeping the operational indices at their optimal values yields high economic profit. This paper presents a novel off-policy Q-learning method that learns the optimal solution for rougher flotation operational processes without knowledge of the dynamics of the unit processes or the operational indices. To this end, the optimal operational control problem for dual-rate rougher flotation processes is first formulated. Second, an H∞ tracking control problem is developed to optimally prescribe the set-points for the rougher flotation processes. Then, a zero-sum game off-policy Q-learning algorithm is proposed to find the optimal set-points from measured data. Finally, simulation experiments demonstrate the effectiveness of the proposed method.
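
As a rough illustration of the zero-sum game view behind H∞ Q-learning, the sketch below writes a generic Q-function for a linear discrete-time system. All symbols here are assumptions introduced for illustration and are not taken from the paper: x_k (state), u_k (set-point input), w_k (disturbance), the matrices A, B, E, the weights Q and R, the attenuation level γ, and the kernel H.

```latex
\begin{align}
x_{k+1} &= A x_k + B u_k + E w_k, \\
r(x_k, u_k, w_k) &= x_k^\top Q x_k + u_k^\top R u_k - \gamma^2 w_k^\top w_k, \\
V^*(x_k) &= \min_{u} \max_{w} \sum_{i=k}^{\infty} r(x_i, u_i, w_i), \\
Q^*(x_k, u_k, w_k) &= r(x_k, u_k, w_k) + V^*(x_{k+1}) = z_k^\top H z_k,
\qquad z_k = \begin{bmatrix} x_k^\top & u_k^\top & w_k^\top \end{bmatrix}^\top .
\end{align}
```

In this assumed setting, when a saddle point exists, the minimizing input and the maximizing disturbance are both linear in the state, and the kernel H can in principle be estimated from measured data without knowing A, B, or E.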
