A Case Study in Scheduling Reentrant Manufacturing Lines: Optimal and Simulation-Based Approaches

This paper presents initial results of a study of optimal scheduling (i.e., job sequencing) in Reentrant Manufacturing Lines (RML), motivated by applications in semiconductor manufacturing. A simple benchmark RML is considered, and the optimal scheduling policy is analyzed for an infinite-horizon discounted-cost problem formulation. The optimality equation and optimality condition are derived, and optimal policy results are obtained for general non-negative one-stage cost functions of the buffer size. Computational experiments are also performed using the Modified Policy Iteration algorithm. Preliminary experiments are then presented on applying a Neuro-Dynamic Programming (NDP) method, namely Q-learning, to approximate the optimal scheduling policy under linear and quadratic one-stage cost functions. These experiments show that the policy obtained by Q-learning approaches the optimal policy as the number of iterations and the simulation length increase. However, the computational load required by the algorithm increases exponentially with the number of states. These results represent initial, exploratory research on the application of NDP methods to large-scale RML systems. More extensive research on both exact optimal results and efficient NDP schemes is in progress.
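
To make the Q-learning idea concrete, the following is a minimal sketch of tabular Q-learning for a discounted-cost scheduling problem on a toy reentrant line. It is an illustration under stated assumptions, not the benchmark or the implementation used in the paper: the three-buffer layout, the buffer capacity `CAP`, the event probabilities `LAM`, `MU1`, `MU2`, the discount factor `BETA`, the blocking rule, and all function names are hypothetical choices made only for this example.

```python
import random

# Hypothetical toy model (an assumption, not the paper's benchmark):
# jobs visit buffers 0 -> 1 -> 2 and then leave. Station 1 serves
# buffers 0 and 2 (the reentrant visit); station 2 serves buffer 1.
CAP = 3                          # assumed buffer capacity (finite state space)
LAM, MU1, MU2 = 0.3, 0.4, 0.3    # assumed uniformized event probabilities
BETA = 0.9                       # assumed discount factor
ACTIONS = (0, 2)                 # buffer that station 1 chooses to serve


def cost(state):
    """Linear one-stage cost: total number of jobs in the buffers."""
    return sum(state)


def step(state, action, rng):
    """Sample the next state after one uniformized event."""
    b = list(state)
    u = rng.random()
    if u < LAM:                              # arrival, lost if buffer 0 is full
        if b[0] < CAP:
            b[0] += 1
    elif u < LAM + MU1:                      # service completion at station 1
        if action == 0 and b[0] > 0 and b[1] < CAP:
            b[0] -= 1
            b[1] += 1
        elif action == 2 and b[2] > 0:
            b[2] -= 1                        # job leaves the line
    else:                                    # service completion at station 2
        if b[1] > 0 and b[2] < CAP:
            b[1] -= 1
            b[2] += 1
    return tuple(b)


def q_learning(num_steps, seed=0, epsilon=0.1):
    """Tabular Q-learning along one simulated trajectory (cost minimization)."""
    rng = random.Random(seed)
    q, visits = {}, {}
    state = (0, 0, 0)
    for _ in range(num_steps):
        # Epsilon-greedy exploration over the two scheduling actions.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = min(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        nxt = step(state, action, rng)
        # Sampled target: one-stage cost plus discounted best next Q-factor.
        target = cost(state) + BETA * min(q.get((nxt, a), 0.0) for a in ACTIONS)
        n = visits.get((state, action), 0) + 1
        visits[(state, action)] = n
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + (target - old) / n   # step size 1/n
        state = nxt
    return q


def greedy_policy(q):
    """Map each visited state to the action with the smallest Q-factor."""
    states = {s for (s, _) in q}
    return {s: min(ACTIONS, key=lambda a: q.get((s, a), 0.0)) for s in states}
```

Calling `greedy_policy(q_learning(200_000))` returns a scheduling rule for the states visited during the simulation. A quadratic cost can be tried by replacing the body of `cost` with `sum(x * x for x in state)`. Because the table holds one entry per state-action pair, its size grows quickly with the buffer capacity and the number of buffers, which is why the table-based approach becomes costly for larger lines.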
