A unified framework for stochastic and dynamic programming

Stochastic programming and approximate dynamic programming have evolved as competing frameworks for solving sequential stochastic optimization problems, with proponents touting the strengths of their favorite approaches. Less visible in this particular debate are communities working under names such as reinforcement learning, stochastic control, stochastic search and simulation-optimization, to name just a few. Put it all together and you get what I have come to call the jungle of stochastic optimization. The competing communities working in stochastic optimization reflect the diversity of applications that arise in different problem settings, which has produced parallel concepts, terminology and notation. Problem classes are distinguished by the nature of the decisions (discrete/continuous, scalar/vector), the underlying stochastic process, the transition function (known/unknown) and the objective function (convex? continuous?). Communities have evolved methods that are well suited to the problem classes that interest them. In the process, differences in vocabulary have hidden parallel developments (two communities doing the same thing with different terminology), and these differences have in turn hidden important contributions that might help other communities. Computer scientists have ignored the power of convexity to solve problems with vector-valued actions. At the same time, the stochastic programming community has ignored the power of machine learning to approximate high-dimensional functions [8]. Years ago, I found that combining these two central ideas made it possible to solve a stochastic dynamic program with a decision vector with 50,000 dimensions and a state variable with 10^20 dimensions [15]. In another problem, the same methods solved a stochastic, dynamic program with 175,000 time periods [11].
Stochastic programming, dynamic programming, and stochastic search can all be viewed in a unified framework if presented using common terminology and notation. One of the biggest challenges is the lack of a widely accepted modeling framework of the type that has defined the field of deterministic math programming. Misconceptions about the meaning of terms such as “state variable” and “policy” have limited dynamic programming to a relatively narrow problem class. For this reason, I begin with a proposal for a common modeling framework that is designed to duplicate the elegance of “min cx subject to Ax = b, x ≥ 0” that is so familiar to the operations research community. I then turn to the issue of defining what is meant by the word “policy.” This article draws heavily on the ideas in [10].
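A natural candidate for such a canonical form, sketched here in the notation commonly used in Powell's framework (state S_t, policy X^π, exogenous information W_{t+1}, transition function S^M; the specific symbols are that convention's, not quoted from this article), is:

```latex
\min_{\pi} \; \mathbb{E}\left\{ \sum_{t=0}^{T} C\bigl(S_t, X^{\pi}(S_t)\bigr) \right\}
\quad \text{subject to} \quad
S_{t+1} = S^{M}\bigl(S_t, X^{\pi}(S_t), W_{t+1}\bigr).
```

The search here is over policies π rather than over vectors x, which is precisely what distinguishes the stochastic canonical form from its deterministic counterpart “min cx subject to Ax = b, x ≥ 0.”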
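Since much of the confusion centers on the word “policy,” a toy illustration may help: in this view a policy is simply any function mapping a state to a feasible decision, and its quality is estimated by simulating the accumulated cost. The inventory problem, cost coefficients, and base-stock rule below are hypothetical, chosen only to make the idea concrete; they are not taken from the article.

```python
import random

# A "policy" in this sense: any function mapping a state to a feasible decision.
# Here, a hypothetical base-stock rule for a toy inventory problem.
def base_stock_policy(state, target=5):
    return max(target - state, 0)  # order up to the target level

def simulate(policy, horizon=50, seed=0):
    """Estimate the total cost sum_t C(S_t, X^pi(S_t)) for one sample path."""
    rng = random.Random(seed)
    state, total_cost = 0, 0.0
    for _ in range(horizon):
        decision = policy(state)                    # x_t = X^pi(S_t)
        demand = rng.randint(0, 4)                  # exogenous information W_{t+1}
        total_cost += 2.0 * decision + 1.0 * state  # illustrative order + holding cost
        state = max(state + decision - demand, 0)   # transition S_{t+1} = S^M(S_t, x_t, W_{t+1})
    return total_cost
```

Averaging `simulate` over many seeds gives a Monte Carlo estimate of the expected cost of the policy; comparing policies then reduces to comparing these estimates, regardless of whether the policy came from a lookup table, a scenario tree, or a machine-learned approximation.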

[1] Warren B. Powell et al. An Approximate Dynamic Programming Algorithm for Large-Scale Fleet Management: A Case Application, 2009, Transp. Sci.

[2] Julia L. Higle et al. Stochastic Decomposition: A Statistical Method for Large Scale Stochastic Linear Programming, 1996.

[3] Robert Tibshirani et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition, 2001, Springer Series in Statistics.

[4] Warren B. Powell et al. Approximate dynamic programming in transportation and logistics: a unified framework, 2012, EURO J. Transp. Logist.

[5] Panos M. Pardalos et al. Approximate dynamic programming: solving the curses of dimensionality, 2009, Optim. Methods Softw.

[6] Warren B. Powell et al. Approximate Dynamic Programming - Solving the Curses of Dimensionality, 2007.

[7] Warren B. Powell et al. Feature Article - Merging AI and OR to Solve High-Dimensional Stochastic Optimization Problems Using Approximate Dynamic Programming, 2010, INFORMS J. Comput.

[8] Warren B. Powell et al. The optimizing-simulator: An illustration using the military airlift problem, 2009, TOMC.

[9] Warren B. Powell et al. Dynamic-Programming Approximations for Stochastic Time-Staged Integer Multicommodity-Flow Problems, 2006, INFORMS J. Comput.

[10] Jery R. Stedinger et al. SOCRATES: A system for scheduling hydroelectric generation under uncertainty, 1995, Ann. Oper. Res.

[11] John M. Wilson et al. Introduction to Stochastic Programming, 1998, J. Oper. Res. Soc.

[12] Louis Wehenkel et al. Scenario Trees and Policy Selection for Multistage Stochastic Programming Using Machine Learning, 2011, INFORMS J. Comput.

[13] J. Dupačová et al. Comparison of multistage stochastic programs with recourse and stochastic dynamic programs with discrete time, 2002.

[14] Warren B. Powell et al. SMART: A Stochastic Multiscale Model for the Analysis of Energy Resources, Technology, and Policy, 2012, INFORMS J. Comput.

[15] Alan J. King et al. Modeling with Stochastic Programming, 2012.

[16] Alexander Shapiro et al. Lectures on Stochastic Programming: Modeling and Theory, 2009.