An application of reinforcement learning to manufacturing scheduling problems

The feasibility of applying reinforcement learning to a flow shop scheduling problem, whose objective is to minimize the maximum completion time (the makespan), is studied for two and three machines. An optimal solution is generally hard to obtain in this problem domain when there are more than two machines, whereas with exactly two machines an optimal schedule is given by Johnson's algorithm. Implementing several reinforcement learning formulations reveals three notable points. First, a good formulation may sometimes lead an agent to acquire the optimal rule that minimizes the objective function. Second, an agent can learn and obtain improved schedules even when the formulation is not perfect. Third, the same formulation is sound not only for the two-machine problem but also for the three-machine problem, where Johnson's algorithm does not necessarily give an optimal solution. Consequently, reinforcement learning has the potential to help us find an approximate solution, or sometimes the optimal solution, in a relatively simple way. The capability of a reinforcement learning agent, however, depends largely on the problem formulation, which is devised by drawing on theoretical solution methods and heuristics. At the same time, the agent retains great flexibility to obtain improved schedules under a formulation that uses less prior knowledge.
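
As an illustration of the two-machine baseline mentioned above, the following is a minimal sketch of Johnson's rule together with a makespan calculation for a permutation flow shop. It is not the reinforcement learning formulation studied here; the function names (makespan, johnson_order) and the processing times are hypothetical and chosen only for this example.

```python
from itertools import permutations


def makespan(sequence, times):
    """Maximum completion time of a permutation flow shop schedule.

    times[j][k] is the processing time of job j on machine k; every job
    visits the machines in the same order 0, 1, ..., m-1.
    """
    n_machines = len(times[0])
    # finish[k] holds the completion time of the most recent job on machine k.
    finish = [0] * n_machines
    for j in sequence:
        for k in range(n_machines):
            # A job starts on machine k once the machine is free and the
            # job has left machine k-1.
            ready = finish[k - 1] if k > 0 else 0
            finish[k] = max(finish[k], ready) + times[j][k]
    return finish[-1]


def johnson_order(times):
    """Johnson's rule for two machines.

    Jobs whose first operation is no longer than their second go first,
    in increasing order of the first operation; the remaining jobs follow
    in decreasing order of the second operation.
    """
    front = sorted(
        (j for j, (a, b) in enumerate(times) if a <= b),
        key=lambda j: times[j][0],
    )
    back = sorted(
        (j for j, (a, b) in enumerate(times) if a > b),
        key=lambda j: times[j][1],
        reverse=True,
    )
    return front + back


# Hypothetical instance: (time on machine 1, time on machine 2) per job.
times = [(3, 2), (1, 4), (5, 3), (2, 2)]
order = johnson_order(times)            # [1, 3, 2, 0]
johnson_makespan = makespan(order, times)  # 13

# Exhaustive search is feasible only for tiny instances; it is used here
# to confirm that the Johnson sequence attains the minimum makespan.
best_makespan = min(
    makespan(p, times) for p in permutations(range(len(times)))
)
```

The makespan function accepts any number of machines, so the same routine can evaluate three-machine schedules, for which Johnson's rule no longer guarantees optimality.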