A reinforcement learning-based multi-agent cooperative approach for adaptive speed regulation on a metallurgical pickling line

We present a holistic data-driven approach to increasing productivity, illustrated on a metallurgical pickling line. The proposed approach combines mathematical modeling as a base algorithm with a cooperative Multi-Agent Reinforcement Learning (MARL) system, implemented so as to improve performance across multiple criteria while meeting safety and reliability requirements and accounting for the unexpected volatility of certain technological processes. We demonstrate how Deep Q-Learning can be applied to a real-life task in heavy industry, yielding a significant improvement over the previously existing automation system. The problem of input data scarcity is solved by a two-step combination of an LSTM and a CGAN, which captures both the tabular representation of the data and its sequential properties. Offline RL training, a necessity in this setting, is made possible by a probabilistic kinematic model of the environment.
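To make the Deep Q-Learning component concrete, the following is a minimal illustrative sketch of a single Q-learning update for a speed-regulation agent. It is not the paper's implementation: the linear Q-network, the state dimension, the three-action space ({slow down, hold, speed up}), and the reward value are all hypothetical placeholders chosen only to show the Bellman target and the temporal-difference update.

```python
import numpy as np

rng = np.random.default_rng(0)

def q_values(state, W):
    """Tiny linear Q-network: Q(s, .) = W @ s (illustrative stand-in
    for the deep network used in practice)."""
    return W @ state

n_actions, state_dim = 3, 4          # hypothetical: {slow down, hold, speed up}
W = rng.normal(size=(n_actions, state_dim))   # online network weights
W_target = W.copy()                           # frozen target network

gamma = 0.99                         # discount factor
s = rng.normal(size=state_dim)       # current strip/section state (toy)
a = 2                                # action taken: speed up
r = 1.0                              # reward, e.g. throughput gain (toy)
s_next = rng.normal(size=state_dim)  # next state after the action

# Bellman target: y = r + gamma * max_a' Q_target(s', a')
y = r + gamma * q_values(s_next, W_target).max()

# One gradient step on the squared TD error for the taken action
alpha = 0.01
td_error = y - q_values(s, W)[a]
W[a] += alpha * td_error * s         # grad of 0.5 * td_error^2 w.r.t. W[a]
```

In an offline-RL setting such as the one described above, the transition `(s, a, r, s_next)` would come from logged or synthetically generated data rather than live interaction with the line.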
