Application of reinforcement learning to multi-agent production scheduling

Reinforcement learning (RL) has attracted attention from agent-based researchers in recent years because it applies to problems in which autonomous agents learn to select appropriate actions for achieving their goals through interaction with their environment. Each time an agent performs an action, the environment's response, as indicated by its new state, is used by the agent to reward or penalize that action. The agent's goal is to maximize the total reward it receives over the long run. Although several successful examples have demonstrated the usefulness of RL, its application to manufacturing systems has not been fully explored. The objective of this research is to develop a set of guidelines for applying the Q-learning algorithm so that an individual agent can develop a decision-making policy for agent-based production scheduling applications such as dispatching rule selection and job routing. In the dispatching rule selection problem, a single machine agent employs Q-learning to develop a policy for selecting the appropriate dispatching rule from among three given rules. In the job routing problem, a simulated job shop system is used to examine how job agents can apply Q-learning when making routing decisions. Two factorial experiments are carried out to study the settings used to apply Q-learning to the single-machine dispatching rule selection problem and to the job routing problem. This study not only investigates the main effects of the factors in this Q-learning application but also provides recommendations for factor settings and useful guidelines for future applications of Q-learning to agent-based production scheduling.
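
To make the learning loop described above concrete, the following is a minimal sketch of tabular Q-learning for a single machine agent choosing among three dispatching rules. It is an illustration under stated assumptions, not the study's implementation: the rule names (SPT, EDD, FIFO), the state labels, the reward value, and the parameter values for the learning rate `alpha`, discount factor `gamma`, and exploration rate `epsilon` are all hypothetical, since the abstract does not specify them.

```python
import random
from collections import defaultdict

# Placeholder rule names: the abstract does not say which three rules are used.
RULES = ("SPT", "EDD", "FIFO")


def make_q_table():
    """Return a table Q[state][rule] of estimated long-run reward, all zeros."""
    return defaultdict(lambda: {rule: 0.0 for rule in RULES})


def choose_rule(q, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(RULES)
    return max(RULES, key=lambda rule: q[state][rule])


def q_update(q, state, rule, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply the standard one-step Q-learning update and return the new value."""
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][rule] += alpha * (td_target - q[state][rule])
    return q[state][rule]


if __name__ == "__main__":
    rng = random.Random(0)
    q = make_q_table()
    state = "short_queue"  # hypothetical state label
    rule = choose_rule(q, state, epsilon=0.1, rng=rng)
    # Hypothetical reward, e.g. negative tardiness observed after dispatching.
    q_update(q, state, rule, reward=-2.0, next_state="long_queue")
    print(rule, q[state][rule])
```

In a sketch like this, the state definition, the reward function, and the values of `alpha`, `gamma`, and `epsilon` are the kinds of settings a factorial experiment could vary.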
