Neuro-dynamic programming for cooperative inventory control

In multi-retailer inventory control, the possibility of sharing setup costs motivates communication and coordination among the retailers. We address the problem of finding suboptimal distributed reordering policies that minimize the setup, ordering, storage, and shortage costs incurred by the retailers over a finite horizon. Neuro-dynamic programming (NDP) reduces the computational complexity of the solution algorithm from exponential to polynomial in the number of retailers.
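To make the cost structure concrete, the following is a minimal sketch of a one-period cost with a shared setup cost. It is an illustration only, not the paper's model: the equal split of the setup cost among the retailers that reorder in the same period, the linear ordering, storage, and shortage terms, the function name `stage_costs`, and all default parameter values are assumptions introduced here.

```python
from typing import List, Sequence


def stage_costs(
    inventory: Sequence[float],
    orders: Sequence[float],
    demand: Sequence[float],
    setup_cost: float = 10.0,    # hypothetical fixed cost per joint order
    unit_cost: float = 1.0,      # hypothetical ordering cost per unit
    holding_cost: float = 0.5,   # hypothetical storage cost per unit left over
    shortage_cost: float = 2.0,  # hypothetical penalty per unit of unmet demand
) -> List[float]:
    """Return each retailer's cost for a single period.

    Assumption: the setup cost is split equally among the retailers
    that place an order in this period.
    """
    # Number of retailers ordering now; they share one setup cost.
    active = sum(1 for u in orders if u > 0)
    costs = []
    for x, u, w in zip(inventory, orders, demand):
        share = setup_cost / active if u > 0 else 0.0
        end_level = x + u - w  # inventory after demand is served
        costs.append(
            share
            + unit_cost * u
            + holding_cost * max(end_level, 0)
            + shortage_cost * max(-end_level, 0)
        )
    return costs


# Two retailers ordering together each pay half of the setup cost.
print(stage_costs([0, 0], [5, 5], [3, 6]))  # [11.0, 12.0]
# A retailer ordering alone pays the full setup cost.
print(stage_costs([0], [5], [3]))           # [16.0]
```

Under these assumptions, the first retailer pays 11.0 when ordering jointly and 16.0 when ordering alone, which is the kind of saving that motivates coordination. Because each retailer's cost depends on how many others order, an exact dynamic program must work over the joint state and decisions of all retailers, which is the source of the exponential growth mentioned above.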
