Learning agents for the multi-mode project scheduling problem

Intelligent optimization refers to the promising technique of integrating learning mechanisms into (meta-)heuristic search. In this paper, we use multi-agent reinforcement learning to build high-quality solutions for the multi-mode resource-constrained project scheduling problem (MRCPSP). We use a network of distributed reinforcement learning agents that cooperate to jointly learn a well-performing constructive heuristic. Each agent is responsible for one activity and uses two simple learning devices, called learning automata, which learn to select an order of successor activities and a mode, respectively. By coupling the reward signals for both learning tasks, we can clearly show the advantage of using reinforcement learning in search. Comparative results show that our method can compete with the best-performing algorithms for the MRCPSP while using only simple learning schemes, without the burden of complex fine-tuning.
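
As a rough illustration of the agent structure described above, the following minimal sketch gives one agent two learning automata, one for the successor order and one for the mode, and feeds both the same reward. The abstract does not state which update scheme is used; the linear reward-inaction rule below is an assumption, and the names LearningAutomaton, ActivityAgent, n_order_choices and n_modes are hypothetical.

```python
import random


class LearningAutomaton:
    """Finite-action automaton with a linear reward-inaction update (assumed scheme)."""

    def __init__(self, n_actions, learning_rate=0.1, rng=None):
        self.probs = [1.0 / n_actions] * n_actions  # start from a uniform policy
        self.learning_rate = learning_rate
        self.rng = rng or random.Random()

    def choose(self):
        # Sample an action index according to the current probability vector.
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # reward is expected in [0, 1]; a reward of 0 leaves the vector unchanged.
        step = self.learning_rate * reward
        for i in range(len(self.probs)):
            if i == action:
                self.probs[i] += step * (1.0 - self.probs[i])
            else:
                self.probs[i] -= step * self.probs[i]


class ActivityAgent:
    """Hypothetical agent for one activity: one automaton per decision."""

    def __init__(self, n_order_choices, n_modes, learning_rate=0.1, rng=None):
        self.order_la = LearningAutomaton(n_order_choices, learning_rate, rng)
        self.mode_la = LearningAutomaton(n_modes, learning_rate, rng)
        self.last = None

    def act(self):
        # Pick a successor-order index and a mode index for this activity.
        self.last = (self.order_la.choose(), self.mode_la.choose())
        return self.last

    def learn(self, reward):
        # Coupled reward: both automata receive the same signal,
        # e.g. derived from the quality of the resulting schedule.
        order_choice, mode_choice = self.last
        self.order_la.update(order_choice, reward)
        self.mode_la.update(mode_choice, reward)
```

In a full method, every activity would own such an agent, a schedule would be constructed from all agents' choices, and a reward computed from that schedule would be passed to each agent's learn call. How that reward is defined is not given in the abstract and is left open here.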
