Approximate dynamic programming: Lessons from the field

Approximate dynamic programming is emerging as a powerful tool for certain classes of multistage stochastic, dynamic problems that arise in operations research. It has been applied to a wide range of problems, spanning financial management, dynamic routing and scheduling, machine scheduling, energy management, health resource management, and very large-scale fleet management. It offers an extremely flexible modeling framework, making it possible to combine the strengths of simulation with the intelligence of optimization. Yet it remains a sometimes frustrating algorithmic strategy that requires considerable intuition into the structure of a problem, and a number of algorithmic choices have to be made in the design of a complete ADP algorithm. This tutorial describes the author's experiences with many of these choices in the course of solving a wide range of problems.
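
To make the flavor of those algorithmic choices concrete, the following is a minimal sketch of a generic forward-pass ADP loop on a toy single-product inventory problem. It uses a lookup-table value function approximation around the post-decision state and a harmonic stepsize rule; every name and parameter here (MAX_INV, PRICE, the uniform demand, the stepsize constant) is an illustrative assumption for this sketch, not a model or algorithm taken from the tutorial itself.

```python
import random

# Toy problem parameters (all illustrative assumptions).
MAX_INV = 20                     # inventory capacity
PRICE, ORDER_COST, HOLD = 4.0, 2.0, 0.1
GAMMA = 0.9                      # discount factor
N_ITERS = 5000
random.seed(0)

# Lookup-table value function approximation (VFA), indexed by the
# post-decision inventory level. The choice of approximation architecture
# is itself one of the key design decisions in ADP.
V = [0.0] * (MAX_INV + 1)

def stepsize(n):
    # Harmonic stepsize a/(a+n); the stepsize rule is another tunable choice.
    return 10.0 / (10.0 + n)

inv, prev_post = 0, None
for n in range(1, N_ITERS + 1):
    demand = random.randint(0, 10)           # sample exogenous information
    sales = min(inv, demand)
    on_hand = inv - sales

    # Solve the one-period problem around the current approximation:
    # pick the order quantity maximizing contribution plus discounted VFA.
    best_val = float("-inf")
    for x in range(MAX_INV - on_hand + 1):   # feasible order quantities
        post = on_hand + x                   # post-decision state
        contrib = PRICE * sales - ORDER_COST * x - HOLD * post
        val = contrib + GAMMA * V[post]
        if val > best_val:
            best_val, best_post = val, post

    # Smooth the sampled value into the VFA at the *previous*
    # post-decision state (the classic forward-pass update).
    if prev_post is not None:
        a = stepsize(n)
        V[prev_post] = (1 - a) * V[prev_post] + a * best_val

    prev_post = best_post
    inv = best_post                          # carry inventory forward

print("approximate values of low inventory levels:",
      [round(v, 2) for v in V[:5]])
```

Even in this small sketch, several of the choices the tutorial discusses are visible: the approximation architecture (here a lookup table), the decision to build the approximation around the post-decision state, the stepsize rule, and the exploration that comes only from following the greedy trajectory.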
