Approximate Dynamic Programming for Large-Scale Resource Allocation Problems

We present modeling and solution strategies for large-scale resource allocation problems that take place over multiple time periods under uncertainty. In general, the strategies we present formulate the problem as a dynamic program and replace the value functions with tractable approximations. The approximations of the value functions are obtained by using simulated trajectories of the system and iteratively improving on (possibly naive) initial approximations; we propose several improvement algorithms for this purpose. As a result, the resource allocation problem decomposes into time-staged subproblems, where the impact of the current decisions on the future evolution of the system is assessed through value function approximations. Computational experiments indicate that the strategies we present yield high-quality solutions. We also present comparisons with conventional stochastic programming methods.
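The general idea in the abstract (simulate trajectories, solve a small subproblem in each time period against the current value function approximation, then improve that approximation from the sampled outcome) can be illustrated with a minimal sketch. This is not one of the paper's algorithms. It assumes a hypothetical single-resource ordering problem with a lookup-table approximation over post-decision states, a constant step size, and random exploration. The function `train_vfa`, its parameters, and all problem data are invented for illustration.

```python
import random


def train_vfa(T=5, cap=5, price=4.0, cost=1.0, hold=0.5,
              iters=2000, step=0.1, explore=0.2, seed=0):
    """Illustrative lookup-table approximate value iteration (hypothetical toy problem)."""
    rng = random.Random(seed)
    # V[t][y]: estimated value of holding y units right after the period-t decision.
    # All entries start at zero, i.e. a naive initial approximation.
    V = [[0.0] * (cap + 1) for _ in range(T)]

    def greedy(t, r):
        # Time-staged subproblem: choose the order quantity x that maximizes
        # the immediate ordering cost plus the approximate future value.
        return max(range(cap - r + 1), key=lambda x: -cost * x + V[t][r + x])

    for _ in range(iters):
        r = 0  # every simulated trajectory starts with no resources on hand
        for t in range(T):
            # Assumption: occasional random decisions so unvisited states get sampled.
            if rng.random() < explore:
                x = rng.randint(0, cap - r)
            else:
                x = greedy(t, r)
            y = r + x                    # post-decision resource level
            d = rng.randint(0, cap)      # sampled demand (assumed uniform)
            sold = min(y, d)
            r_next = y - sold
            reward = price * sold - hold * r_next
            if t + 1 < T:
                x_next = greedy(t + 1, r_next)
                future = -cost * x_next + V[t + 1][r_next + x_next]
            else:
                future = 0.0             # nothing has value after the horizon
            v_hat = reward + future      # sampled value of post-decision state y
            # Smooth the new sample into the current approximation.
            V[t][y] = (1 - step) * V[t][y] + step * v_hat
            r = r_next
    return V, greedy


if __name__ == "__main__":
    V, greedy = train_vfa()
    print("order quantity chosen in period 0 with empty stock:", greedy(0, 0))
```

A lookup table only works for tiny state spaces like this one. The large-scale setting described in the abstract is exactly where such a table becomes impractical and structured approximations are needed.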

Warren B. Powell | Huseyin Topaloglu
