Information Relaxations and Duality in Stochastic Dynamic Programs

We describe a general technique for determining upper bounds on maximal values (or lower bounds on minimal costs) in stochastic dynamic programs. In this approach, we relax the nonanticipativity constraints that require decisions to depend only on the information available at the time a decision is made and impose a “penalty” that punishes violations of nonanticipativity. In applications, the hope is that this relaxed version of the problem will be simpler to solve than the original dynamic program. The upper bounds provided by this dual approach complement lower bounds on values that may be found by simulating with heuristic policies. We describe the theory underlying this dual approach and establish weak duality, strong duality, and complementary slackness results that are analogous to the duality results of linear programming. We also study properties of good penalties. Finally, we demonstrate the use of this dual approach in an adaptive inventory control problem with an unknown and changing demand distribution and in valuing options with stochastic volatilities and interest rates. These are complex problems of significant practical interest that are quite difficult to solve to optimality. In these examples, our dual approach requires relatively little additional computation and leads to tight bounds on the optimal values.
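The bounding idea described above can be illustrated on a toy problem. The following is a minimal sketch, not the paper's implementation: the optimal stopping problem (a symmetric random walk with a discounted payoff), the horizon `T`, the discount `BETA`, and all function names are assumptions chosen for illustration. It computes a lower bound by evaluating a nonanticipative heuristic policy, an upper bound from the perfect information relaxation with zero penalty, and an upper bound using a penalty built from the martingale increments of the exact value function, which is one standard construction in the stopping-problem duality literature.

```python
from itertools import product

T = 3        # decision periods 0..T; stopping is forced at T (assumed toy horizon)
BETA = 0.9   # per-period discount factor (assumed, for illustration only)

def reward(t, x):
    """Discounted payoff from stopping at time t in state x."""
    return BETA ** t * max(x, 0.0)

def expected_next_value(t, x):
    """E[V(t+1, x') | x] for a symmetric +/-1 random walk."""
    return 0.5 * value(t + 1, x + 1) + 0.5 * value(t + 1, x - 1)

def value(t, x):
    """Optimal value by backward induction (the primal dynamic program)."""
    if t == T:
        return reward(t, x)
    return max(reward(t, x), expected_next_value(t, x))

def paths():
    """All equally likely state paths (x_0, ..., x_T) starting from x_0 = 0."""
    for steps in product((1, -1), repeat=T):
        xs = [0]
        for s in steps:
            xs.append(xs[-1] + s)
        yield xs

def heuristic_reward(xs):
    """Nonanticipative heuristic: stop the first time the state reaches 1."""
    for t, x in enumerate(xs):
        if x >= 1 or t == T:
            return reward(t, x)

def relaxed_value(xs, use_penalty):
    """Best stopping time chosen with the whole path known, net of the penalty."""
    best, penalty = float("-inf"), 0.0
    for t, x in enumerate(xs):
        if t > 0 and use_penalty:
            # Martingale increment: realized value minus its conditional mean.
            penalty += value(t, x) - expected_next_value(t - 1, xs[t - 1])
        best = max(best, reward(t, x) - penalty)
    return best

def mean(values):
    values = list(values)
    return sum(values) / len(values)

optimal = value(0, 0)
lower = mean(heuristic_reward(xs) for xs in paths())
upper_zero = mean(relaxed_value(xs, False) for xs in paths())
upper_ideal = mean(relaxed_value(xs, True) for xs in paths())
print(lower, optimal, upper_zero, upper_ideal)
```

In this toy instance the heuristic value sits below the optimal value, the zero-penalty relaxation sits above it, and the penalty derived from the exact value function closes the gap entirely, mirroring the weak and strong duality results. In realistic problems the exact value function is unavailable, so a penalty would instead be built from an approximation, giving a bound that is valid but not necessarily tight.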
