A reinforcement learning approach for developing routing policies in multi-agent production scheduling

Most recent research on agent-based production scheduling has focused on developing negotiation schemes for agent cooperation. However, the successful implementation of agent-based approaches depends not only on cooperation among the agents but also on each individual agent's ability to make good decisions. Learning is one mechanism by which an agent can increase its intelligence while in operation. This paper presents a study of the Q-learning algorithm, one of the most widely used reinforcement learning approaches, as applied by job agents making routing decisions in a job shop environment. A factorial experiment design is carried out to study the settings used when applying Q-learning to the job routing problem. The study not only investigates the effects of this application of Q-learning but also provides recommended factor settings and useful guidelines for future applications of Q-learning to agent-based production scheduling.
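To make the idea concrete, the Q-learning update a job agent might apply when choosing a machine can be sketched as below. This is a minimal illustrative toy, not the paper's actual formulation: the state encoding (the job's next operation index), the action set (candidate machines), the processing times, and the negative-processing-time reward are all assumptions made for the example.

```python
import random

# Hypothetical processing times: (operation, machine) -> time units.
PROC_TIME = {
    (0, 0): 5.0, (0, 1): 2.0,
    (1, 0): 3.0, (1, 1): 6.0,
}
N_OPS, N_MACHINES = 2, 2

def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=42):
    """Tabular Q-learning: state = next operation, action = machine choice."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_OPS) for a in range(N_MACHINES)}
    for _ in range(episodes):
        for s in range(N_OPS):  # route each operation of the job in order
            if rng.random() < epsilon:
                a = rng.randrange(N_MACHINES)  # explore
            else:
                a = max(range(N_MACHINES), key=lambda m: q[(s, m)])  # exploit
            r = -PROC_TIME[(s, a)]  # reward: shorter processing is better
            # Bootstrap from the best value of the next operation (0 at the end).
            nxt = 0.0 if s == N_OPS - 1 else max(
                q[(s + 1, m)] for m in range(N_MACHINES))
            q[(s, a)] += alpha * (r + gamma * nxt - q[(s, a)])  # Q-learning update
    return q

q = train()
policy = [max(range(N_MACHINES), key=lambda m: q[(s, m)]) for s in range(N_OPS)]
print(policy)  # greedy routing: [1, 0] (machine 1 for op 0, machine 0 for op 1)
```

Under these toy processing times the learned greedy policy routes operation 0 to the faster machine 1 and operation 1 to the faster machine 0; the factor settings the paper studies (learning rate, discount, exploration rate) correspond to `alpha`, `gamma`, and `epsilon` here.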
