Optimizing zinc electrowinning processes with current switching via Deep Deterministic Policy Gradient learning

Abstract This paper proposes a model-free Deep Deterministic Policy Gradient (DDPG) learning controller for the zinc electrowinning process (ZEP) that reduces energy consumption during current switching periods. To overcome problems such as inaccurate modeling and various time delays, the proposed DDPG controller uses different control periods and parameters for different working conditions. Strategies such as action boundary setting, reward function definition, and state normalization are applied to ensure its learning performance. Simulations and experiments show that the DDPG learning controller significantly decreases energy consumption during ZEP current switching periods. The optimal control policy can be learned for different working conditions with only one set of hyperparameters. Furthermore, compared with a traditional proportional-integral (PI) controller, model predictive control (MPC), and manual operation based on operator experience, the smoother control actions of the DDPG controller improve stability and yield additional energy savings. This artificial intelligence-based optimal control framework brings both energy savings and intelligence to zinc manufacturing plants.
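
The abstract names action boundary setting, reward function definition, and state normalization as the key strategies supporting DDPG learning. The minimal PyTorch sketch below illustrates how these pieces could fit together in a DDPG actor for a current-switching controller; the state features, current bounds, and reward coefficients are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical state features for a ZEP current-switching controller:
# (cell voltage, current density, electrolyte temperature, acid concentration)
STATE_DIM, ACTION_DIM = 4, 1
ACTION_LOW, ACTION_HIGH = 400.0, 650.0   # assumed current-density bounds (A/m^2)

class Actor(nn.Module):
    """DDPG actor: maps a normalized state to a bounded continuous action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),   # output squashed to [-1, 1]
        )

    def forward(self, state):
        # Action boundary setting: rescale the tanh output into [LOW, HIGH]
        raw = self.net(state)
        return ACTION_LOW + 0.5 * (raw + 1.0) * (ACTION_HIGH - ACTION_LOW)

def normalize_state(state, mean, std):
    """State normalization so each feature has a comparable scale."""
    return (state - mean) / (std + 1e-8)

def reward(power_kw, voltage_error):
    """Illustrative reward: penalize energy use and deviation from the
    reference cell voltage (coefficients are assumptions, not the paper's)."""
    return -(1.0 * power_kw + 10.0 * voltage_error ** 2)

# Example: one greedy action from the (untrained) actor.
# mean/std are assumed operating statistics used only for normalization.
mean = np.array([3.3, 500.0, 38.0, 160.0])
std = np.array([0.2, 50.0, 2.0, 10.0])
s = normalize_state(np.array([3.4, 520.0, 39.0, 165.0]), mean, std)
with torch.no_grad():
    a = Actor()(torch.tensor(s, dtype=torch.float32))
print("current setpoint:", a.item())
```

In a full DDPG loop, the actor above would be paired with a critic network, target networks, exploration noise, and a replay buffer; the sketch only shows where the three strategies mentioned in the abstract enter the agent.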
