Dynamic scheduling for flexible job shop with new job insertions by deep reinforcement learning

Abstract: In modern manufacturing, dynamic scheduling methods are urgently needed as uncertainty and complexity in the production process increase sharply. To this end, this paper addresses the dynamic flexible job shop scheduling problem (DFJSP) under new job insertions, with the aim of minimizing total tardiness. Without loss of generality, the DFJSP can be modeled as a Markov decision process (MDP) in which an intelligent agent successively determines which operation to process next and which machine to assign it to, according to the production status at the current decision point; this makes the problem well suited to reinforcement learning (RL) methods. To cope with continuous production states and learn the most suitable action (i.e., dispatching rule) at each rescheduling point, a deep Q-network (DQN) is developed. Six composite dispatching rules are proposed to simultaneously select an operation and assign it to a feasible machine every time an operation is completed or a new job arrives. Seven generic state features are extracted to represent the production status at a rescheduling point. By taking the continuous state features as input to the DQN, the state–action value (Q-value) of each dispatching rule can be obtained. The proposed DQN is trained using deep Q-learning (DQL) enhanced by two improvements, namely double DQN and soft target weight update. Moreover, a “softmax” action selection policy is used when deploying the trained DQN so as to favor the rules with higher Q-values while maintaining policy entropy. Numerical experiments are conducted on a large number of instances with different production configurations. The results confirm both the superiority and generality of the DQN compared with each composite rule, other well-known dispatching rules, and the standard Q-learning-based agent.
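The three mechanisms named in the abstract (softmax action selection over Q-values, the double DQN target, and the soft target weight update) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the flat-list representation of network weights, and the default values of `temperature`, `gamma`, and `tau` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def softmax_probs(q_values, temperature=1.0):
    """Turn Q-values into selection probabilities (higher Q => higher probability).

    `temperature` is an assumed hyperparameter; the abstract gives no value.
    """
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_rule(q_values, temperature=1.0, rng=random):
    """Sample the index of a dispatching rule from the softmax distribution."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target: the online net picks the action, the target net scores it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]


def soft_update(target_weights, online_weights, tau=0.01):
    """Soft update: theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target_weights, online_weights)]


# Hypothetical Q-values for the six composite dispatching rules at one rescheduling point.
q = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5]
rule_index = select_rule(q, temperature=0.5)
```

Sampling from the softmax distribution, rather than always taking the rule with the highest Q-value, keeps some probability on every rule. That matches the abstract's stated goal of promoting rules with higher Q-values while maintaining policy entropy.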
