Paper information - Revisiting Approximate Linear Programming Using a Saddle Point Based Reformulation and Root Finding Solution Approach

Approximate linear programs (ALPs) are well-known models for computing value function approximations (VFAs) for high dimensional Markov decision processes (MDPs) arising in business applications. VFAs from ALPs have desirable theoretical properties, define an operating policy, and provide a lower bound on the optimal policy cost, which can be used to assess the suboptimality of heuristic policies. However, solving ALPs near optimally remains challenging, for instance, in applications where the MDP includes cost functions or transition dynamics that are nonlinear or when rich basis functions are required to obtain a good VFA. We address this tension between ALP theory and solvability by (i) proposing a saddle point based reformulation of an ALP that endogenizes a state-action density function as a dual decision variable to avoid non-convexities, and (ii) developing a solution approach, ALP-Secant, that combines root finding and saddle point methods to solve this reformulation. We establish that ALP-Secant returns a near optimal ALP solution and a lower bound on the optimal policy cost with high probability in a finite number of iterations. We numerically compare ALP-Secant with the commonly used constraint sampling approach to solve ALP and a look-ahead heuristic on inventory control and energy storage applications, where using row generation is not a viable option. We find that ALP-Secant is more effective than constraint sampling for solving ALPs and delivers high quality policies and lower bounds, with its policies outperforming those from the other two heuristics. Our ALP reformulation and solution approach broaden the applicability of approximate linear programming.
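The abstract says ALP-Secant combines root finding with saddle point methods but does not spell out the iteration. As a generic illustration of the root-finding ingredient only, here is a minimal sketch of the textbook secant method for a scalar equation. It is not the authors' algorithm. The names `secant_root` and `toy_gap` are hypothetical, and `toy_gap` is a stand-in for whatever scalar function ALP-Secant actually drives to zero.

```python
from typing import Callable


def secant_root(
    f: Callable[[float], float],
    x0: float,
    x1: float,
    tol: float = 1e-10,
    max_iter: int = 100,
) -> float:
    """Textbook secant iteration for f(x) = 0 (illustrative, not ALP-Secant)."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) <= tol:
            return x1
        denom = f1 - f0
        if denom == 0.0:
            raise ZeroDivisionError("secant step undefined: f(x0) == f(x1)")
        # Intersect the line through (x0, f0) and (x1, f1) with the x-axis.
        x2 = x1 - f1 * (x1 - x0) / denom
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    if abs(f1) <= tol:
        return x1
    raise RuntimeError("secant iteration did not converge")


def toy_gap(level: float) -> float:
    """Hypothetical scalar function; assumed here purely for demonstration."""
    return level ** 2 - 2.0


root = secant_root(toy_gap, 1.0, 2.0)
print(root)
```

The secant step needs only function values, not derivatives. That matters in settings where each evaluation is itself the output of an inner numerical routine, and by the abstract's description that inner routine in ALP-Secant would be a saddle point method.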

Qihang Lin | Negar Soheili | Selvaprabu Nadarajah
