Hierarchical decision making in semiconductor fabs using multi-time scale Markov decision processes

Decision making in semiconductor fabs occurs on different timescales: decisions on buying or discarding machines are made on the slower timescale, while those dealing with capacity allocation and switchover are made on the faster timescale. We formulate this problem along the lines of a recently developed multi-time scale Markov decision process (MMDP) framework. We present numerical experiments in which we use TD(0) and Q-learning algorithms with a linear approximation architecture, and compare these with the policy iteration algorithm. The experiments cover two different scenarios. In the first, transition probabilities are computed explicitly and used in the algorithms. In the second, transitions are simulated without explicitly computing the transition probabilities. We observe that TD(0) requires less computation than Q-learning. Moreover, algorithms that use simulated transitions require significantly less computation than their counterparts that compute transition probabilities.
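
Since the abstract only names the algorithms, the following is a minimal sketch of TD(0) policy evaluation with a linear approximation architecture, in the spirit of the simulated-transitions scenario (no transition probabilities are computed explicitly). The environment interface `env_step`, the feature map `phi`, the step size, and the toy chain at the end are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def td0_linear(env_step, phi, num_features, num_episodes=500,
               alpha=0.01, gamma=0.95, rng=None):
    """TD(0) policy evaluation with a linear architecture V(s) ~ theta . phi(s).

    `env_step(state, rng)` is an assumed interface that returns
    (next_state, reward, done) for one simulated transition under a
    fixed policy -- no transition probabilities are ever formed.
    """
    rng = rng or np.random.default_rng(0)
    theta = np.zeros(num_features)
    for _ in range(num_episodes):
        state, done = 0, False              # assume an integer start state
        while not done:
            next_state, reward, done = env_step(state, rng)
            v_s = theta @ phi(state)
            v_next = 0.0 if done else theta @ phi(next_state)
            # TD(0) update: step theta along phi(state), scaled by the
            # temporal-difference error delta.
            delta = reward + gamma * v_next - v_s
            theta += alpha * delta * phi(state)
            state = next_state
    return theta

# Toy usage on a hypothetical 2-state chain with one-hot features.
def env_step(s, rng):
    ns = int(rng.integers(2))               # uniform random next state
    return ns, float(ns == 1), bool(rng.random() < 0.1)

theta = td0_linear(env_step, phi=lambda s: np.eye(2)[s], num_features=2)
```

Q-learning with the same linear architecture would additionally maintain per-action weights and maximize over actions in the update target, which is one source of its higher computational cost relative to TD(0).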