The feasibility of applying reinforcement learning to flow shop scheduling, with the objective of minimizing the maximum completion time (makespan), is studied for two- and three-machine problems. Obtaining an optimal solution is in general hard in this problem domain with more than two machines, whereas with exactly two machines an optimal schedule is given by Johnson's algorithm. Implementing several reinforcement learning formulations on various problem instances reveals the following. First, a good formulation can lead an agent to acquire the optimal rule that minimizes the objective function. Second, an agent can still learn improved schedules even when the formulation is imperfect. Third, the same formulation remains effective not only for the two-machine problem but also for the three-machine problem, where Johnson's algorithm no longer guarantees an optimal solution. Consequently, reinforcement learning has the potential to yield an approximate, and sometimes optimal, solution in a relatively simple way. The capability of a reinforcement learning agent, however, depends largely on the problem formulation, which is devised by drawing on theoretical solution methods and heuristics. At the same time, the agent retains considerable flexibility to obtain improved schedules under a formulation that requires little prior knowledge.
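As a point of reference for the two-machine baseline mentioned above, Johnson's rule can be sketched in a few lines. The function and variable names below are illustrative, not taken from the paper; jobs are given as pairs of processing times on the two machines:

```python
def johnson_order(jobs):
    """Johnson's rule for the two-machine flow shop (F2 || Cmax).

    jobs: list of (p1, p2) processing times on machines 1 and 2.
    Returns job indices in a sequence that minimizes the makespan.
    """
    # Jobs faster on machine 1 go first, in increasing order of p1.
    first = sorted((i for i, (p1, p2) in enumerate(jobs) if p1 <= p2),
                   key=lambda i: jobs[i][0])
    # Jobs faster on machine 2 go last, in decreasing order of p2.
    second = sorted((i for i, (p1, p2) in enumerate(jobs) if p1 > p2),
                    key=lambda i: jobs[i][1], reverse=True)
    return first + second


def makespan(jobs, order):
    """Maximum completion time of a permutation schedule."""
    t1 = t2 = 0
    for i in order:
        p1, p2 = jobs[i]
        t1 += p1                # machine 1 finishes job i
        t2 = max(t2, t1) + p2   # machine 2 waits for both to be free
    return t2
```

A sequence produced this way serves as the optimal yardstick against which learned two-machine schedules can be compared; for three machines no such simple rule is guaranteed to be optimal, which is where the learned policies come in.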