A simulation-based Approximate Dynamic Programming approach for the control of the Intel Mini-Fab benchmark model

This paper presents initial results on the application of simulation-based Approximate Dynamic Programming (ADP) to the control of a benchmark semiconductor fab model known as the Intel Mini-Fab. The ADP approach is based on an average-cost temporal-difference TD(λ) learning algorithm within an actor-critic architecture. In simulation experiments in which both ADP-generated policies and commonly used dispatching rules were applied to the Mini-Fab, the ADP policies performed well in average Work-In-Process and average Cycle Time relative to the dispatching rules considered.
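To make the critic side of this approach concrete, the following is a minimal sketch of average-cost TD(λ) policy evaluation with a linear critic, in the standard form of the algorithm. It is not the paper's implementation: the two-state chain, the costs, the single feature, the step-size schedule, and the function name `average_cost_td_lambda` are all assumptions chosen for illustration, and the actor (policy-improvement) step is omitted.

```python
import random


def average_cost_td_lambda(P, g, phi, lam=0.5, steps=100_000, seed=0):
    """Evaluate a fixed policy on a finite Markov chain with average-cost TD(lambda).

    P[x][y] : transition probability from state x to y under the fixed policy
    g[x]    : one-stage cost incurred in state x
    phi[x]  : feature vector of state x (linear critic: h(x) ~ phi[x] . r)
    Returns (mu, r): the average-cost estimate and the critic weights.
    """
    rng = random.Random(seed)
    states = list(range(len(P)))
    r = [0.0] * len(phi[0])   # critic weights
    mu = 0.0                  # running estimate of the average cost
    x = 0                     # initial state (arbitrary choice)
    z = list(phi[x])          # eligibility trace, z_0 = phi(x_0)
    for t in range(steps):
        # Simulate one transition of the chain.
        y = rng.choices(states, weights=P[x])[0]
        # Temporal difference for the differential cost function.
        h_x = sum(a * b for a, b in zip(phi[x], r))
        h_y = sum(a * b for a, b in zip(phi[y], r))
        d = g[x] - mu + h_y - h_x
        # Diminishing step size (an illustrative schedule, not from the paper).
        gamma = 10.0 / (100.0 + t)
        r = [ri + gamma * d * zi for ri, zi in zip(r, z)]
        # The average-cost estimate is the running mean of observed costs.
        mu += (g[x] - mu) / (t + 1)
        # Decay the trace and add the features of the next state.
        z = [lam * zi + fi for zi, fi in zip(z, phi[y])]
        x = y
    return mu, r


# Hypothetical two-state chain standing in for a (much larger) fab model.
P = [[0.9, 0.1], [0.5, 0.5]]
g = [0.0, 1.0]
phi = [[0.0], [1.0]]          # one feature that does not span the constant function
mu, r = average_cost_td_lambda(P, g, phi)
print(f"average cost estimate: {mu:.3f}, critic weight: {r[0]:.3f}")
```

For this toy chain the stationary distribution is (5/6, 1/6), so the average-cost estimate should approach 1/6. In an actor-critic scheme, a critic of this kind would supply the differential-cost estimates that the actor uses to adjust the policy.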
