Paper information - A simulation-based Approximate Dynamic Programming approach for the control of the Intel Mini-Fab benchmark model


This paper presents initial results on applying simulation-based Approximate Dynamic Programming (ADP) to the control of the Intel Mini-Fab, a benchmark model of a semiconductor fab. The ADP approach is based on an Average Cost Temporal-Difference TD(λ) learning algorithm under an Actor-Critic architecture. In simulation experiments in which both ADP-generated policies and commonly used dispatching rules were applied to the Mini-Fab, the ADP policies performed well in average Work-In-Process and average Cycle Time relative to the dispatching rules considered.
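To make the critic side of such an approach concrete, the following is a minimal sketch of an average-cost TD(λ) update with a linear value approximation, written in the general form found in the ADP literature. It is not the paper's implementation: the abstract gives no features, step sizes, or model details, so the two-state toy chain, the single feature, the constant step size, and every function name below are illustrative assumptions. The actor (policy-update) part is omitted.

```python
import random


def average_cost_td_lambda(step, cost, features, x0, n_steps,
                           lam=0.7, step_size=0.01, seed=0):
    """Critic-only average-cost TD(lambda) with linear features (sketch).

    step(x, rng) -> next state, cost(x) -> one-stage cost,
    features(x) -> list of floats. All three are hypothetical callables.
    """
    rng = random.Random(seed)
    x = x0
    phi = features(x)
    r = [0.0] * len(phi)   # critic weights for the differential cost
    z = list(phi)          # eligibility trace, initialised to phi(x0)
    mu = 0.0               # running estimate of the average cost

    for _ in range(n_steps):
        g = cost(x)
        x_next = step(x, rng)
        phi_next = features(x_next)

        v = sum(ri * pi for ri, pi in zip(r, phi))
        v_next = sum(ri * pi for ri, pi in zip(r, phi_next))

        # Temporal difference for the average-cost criterion:
        # the one-stage cost is centred by the current average-cost estimate.
        d = g - mu + v_next - v

        r = [ri + step_size * d * zi for ri, zi in zip(r, z)]
        mu += step_size * (g - mu)
        z = [lam * zi + pi for zi, pi in zip(z, phi_next)]

        x, phi = x_next, phi_next

    return mu, r


# Toy chain (assumption, unrelated to the Mini-Fab): two states, the next
# state is 0 or 1 with probability 1/2 each, and state x costs 2*x per step,
# so the long-run average cost is 1.
def toy_step(x, rng):
    return 1 if rng.random() < 0.5 else 0


def toy_cost(x):
    return 2.0 * x


def toy_features(x):
    return [float(x)]


mu_hat, r_hat = average_cost_td_lambda(
    toy_step, toy_cost, toy_features, x0=0, n_steps=20000)
print("average-cost estimate:", mu_hat)
print("critic weights:", r_hat)
```

In an Actor-Critic setting, the temporal difference `d` computed by a critic of this kind would typically also drive the policy-parameter update. A constant step size is used here only for brevity; diminishing step sizes are the usual choice when convergence is the goal.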

José A. Ramírez-Hernández | Emmanuel Fernandez
