Paper information - A neuro-dynamic programming approach to retailer inventory management

A neuro-dynamic programming approach to retailer inventory management

We discuss an application of neuro-dynamic programming techniques to the optimization of retailer inventory systems. We describe a specific case study involving a model with thirty-three state variables. The enormity of this state space renders classical algorithms of dynamic programming inapplicable. We compare the performance of solutions generated by neuro-dynamic programming algorithms to that delivered by optimized s-type ("order-up-to") policies. We are able to generate control strategies that are substantially superior to these policies, reducing inventory costs by approximately ten percent.

[1] Gerald Tesauro, et al. Practical Issues in Temporal Difference Learning, 1992, Mach. Learn.

[2] S. Nahmias, et al. Mathematical Models of Retailer Inventory Systems: A Review, 1993.

[3] Hau L. Lee, et al. Material Management in Decentralized Supply Chains, 1993, Oper. Res.

[4] Steven Nahmias, et al. Optimizing inventory levels in a two-echelon retailer system with partial lost sales, 1994.

[5] Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.

[6] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.

[7] Wei Zhang, et al. A Reinforcement Learning Approach to job-shop Scheduling, 1995, IJCAI.

[8] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.

[9] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS.