Deep reinforcement learning for semiconductor production scheduling

Despite tremendous success stories such as identifying cat videos [1] and mastering video as well as board games [2], [3], the adoption of deep learning in the semiconductor industry remains moderate. In this paper, we apply Google DeepMind's Deep Q Network (DQN) agent algorithm for Reinforcement Learning (RL) to semiconductor production scheduling. In an RL environment, several cooperative DQN agents, which utilize deep neural networks, are trained with flexible user-defined objectives. We present benchmarks comparing standard dispatching heuristics with the DQN agents in an abstract frontend-of-line semiconductor production facility. Results are promising and show that DQN agents optimize production autonomously for different targets.
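As a minimal sketch of the DQN recipe the abstract refers to (a small Q-network, an epsilon-greedy dispatching policy, and experience replay), the toy example below trains a hand-rolled NumPy Q-network on a hypothetical single-machine dispatching environment. This is an illustration under stated assumptions, not the paper's actual agents, environment, or objectives; all names (`ToyDispatchEnv`, `TinyDQN`) and hyperparameters are invented for the sketch.

```python
import random
from collections import deque
import numpy as np

class ToyDispatchEnv:
    """Hypothetical stand-in environment: the state is the normalized
    remaining work per job family; the action picks which family to
    dispatch next. The reward favors draining the longest queue, a
    placeholder for a user-defined scheduling objective."""
    def __init__(self, n_families=3, seed=0):
        self.n = n_families
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.queues = self.rng.integers(1, 10, self.n).astype(float)
        return self.queues / 10.0

    def step(self, action):
        reward = 1.0 if self.queues[action] == self.queues.max() else -0.1
        self.queues[action] = max(self.queues[action] - 1.0, 0.0)
        done = bool(self.queues.sum() == 0)
        return self.queues / 10.0, reward, done

class TinyDQN:
    """One-hidden-layer Q-network trained by SGD on transitions sampled
    from a replay buffer -- the core DQN ingredients, heavily simplified
    (no target network, no convolutional layers)."""
    def __init__(self, n_in, n_out, hidden=16, lr=0.01, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_in, hidden))
        self.W2 = rng.normal(0.0, 0.5, (hidden, n_out))
        self.lr, self.gamma = lr, gamma
        self.replay = deque(maxlen=500)

    def q(self, s):
        h = np.tanh(s @ self.W1)          # hidden activations
        return h @ self.W2, h             # Q-values per action

    def act(self, s, eps):
        if random.random() < eps:         # epsilon-greedy exploration
            return random.randrange(self.W2.shape[1])
        return int(np.argmax(self.q(s)[0]))

    def train_step(self, batch_size=8):
        if len(self.replay) < batch_size:
            return
        for s, a, r, s2, done in random.sample(list(self.replay), batch_size):
            q_vals, h = self.q(s)
            target = r if done else r + self.gamma * np.max(self.q(s2)[0])
            err = target - q_vals[a]
            # Gradient of 0.5 * err^2 w.r.t. both weight matrices.
            self.W2 += self.lr * np.outer(h, np.eye(len(q_vals))[a]) * err
            dh = self.W2[:, a] * err
            self.W1 += self.lr * np.outer(s, (1.0 - h ** 2) * dh)

env = ToyDispatchEnv()
agent = TinyDQN(n_in=env.n, n_out=env.n)
for episode in range(30):
    s = env.reset()
    for t in range(100):                  # cap episode length
        a = agent.act(s, eps=0.2)
        s2, r, done = env.step(a)
        agent.replay.append((s, a, r, s2, done))
        agent.train_step()
        s = s2
        if done:
            break
```

The cooperative multi-agent setup described in the abstract would run several such agents, one per work center, against a shared factory simulation; the reward function is where the "flexible user-defined objectives" enter.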

[1] Lenz Belzner, et al. Optimization of global production scheduling with deep reinforcement learning, 2018.

[2] Wilfried Brauer, et al. Multi-machine scheduling - a multi-agent learning approach, 1998, Proceedings International Conference on Multi Agent Systems (Cat. No.98EX160).

[3] Srikanth Kandula, et al. Resource Management with Deep Reinforcement Learning, 2016, HotNets.

[4] Michael O. Duff, et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.

[5] Wei Zhang, et al. A Reinforcement Learning Approach to Job-Shop Scheduling, 1995, IJCAI.

[6] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.

[8] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[9] Jie Wang, et al. Optimized Adaptive Scheduling of a Manufacturing Process System with Multi-skill Workforce and Multiple Machine Types: An Ontology-based, Multi-agent Reinforcement Learning Approach, 2016.

[10] Martin A. Riedmiller, et al. A Neural Reinforcement Learning Approach to Learn Local Dispatching Policies in Production Scheduling, 1999, IJCAI.

[11] Marc'Aurelio Ranzato, et al. Building high-level features using large scale unsupervised learning, 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[12] Tapas K. Das, et al. A multi-agent reinforcement learning approach to obtaining dynamic control policies for stochastic lot scheduling problem, 2005, Simul. Model. Pract. Theory.

[13] Sridhar Mahadevan, et al. Optimizing Production Manufacturing Using Reinforcement Learning, 1998, FLAIRS.

[14] Florin Pop, et al. New scheduling approach using reinforcement learning for heterogeneous distributed systems, 2017, J. Parallel Distributed Comput.

[15] Thomas G. Dietterich, et al. High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network, 1995, NIPS.

[16] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.

[17] Alexander Mordvintsev, et al. Inceptionism: Going Deeper into Neural Networks, 2015.

[18] Sanja Petrovic, et al. Survey of Dynamic Scheduling in Manufacturing Systems, 2006.

[19] Martin A. Riedmiller, et al. Scaling Adaptive Agent-Based Reactive Job-Shop Scheduling to Large-Scale Problems, 2007, 2007 IEEE Symposium on Computational Intelligence in Scheduling.

[20] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.

[21] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.