Information Relaxation and Dual Formulation of Controlled Markov Diffusions

Information relaxation and duality in Markov decision processes have recently been studied by several researchers with the goal of deriving dual bounds on the value function. In this paper we extend this dual formulation to controlled Markov diffusions: in a similar way, we relax the constraint that decisions must be based only on the information currently available, and impose a penalty that punishes access to future information. We establish weak duality, strong duality, and complementary slackness results that parallel those for Markov decision processes. We further explore the structure of the optimal penalties and expose the connection between the optimal penalties for Markov decision processes and those for controlled Markov diffusions. We demonstrate the use of this dual representation in a classic dynamic portfolio choice problem through a new class of penalties, which require little extra computation and produce a small duality gap on the optimal value.
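
The idea of relaxing the information constraint and penalizing foresight can be illustrated on a one-step discrete toy problem. The sketch below is not the diffusion setting of the paper: the coin-guessing problem, the function names, and the particular penalty are all assumptions chosen for illustration. The decision maker picks an action before a fair coin is revealed and earns 1 for a correct guess. With no penalty, the perfect-information relaxation gives a loose upper bound; with a penalty equal to the reward minus its expectation under the chosen action (which has zero mean for any non-anticipative policy), the bound coincides with the primal value in this example.

```python
# Toy illustration of information relaxation (hypothetical example,
# not the controlled-diffusion model of the paper).

OUTCOMES = (0, 1)            # possible coin outcomes
ACTIONS = (0, 1)             # possible guesses
PROB = {0: 0.5, 1: 0.5}      # fair coin


def reward(a, w):
    """Reward 1 for a correct guess, 0 otherwise."""
    return 1.0 if a == w else 0.0


def expected_reward(a):
    """Expected reward of committing to action a before seeing the coin."""
    return sum(PROB[w] * reward(a, w) for w in OUTCOMES)


def primal_value():
    """Best value achievable without access to future information."""
    return max(expected_reward(a) for a in ACTIONS)


def dual_bound(penalty):
    """Perfect-information relaxation: optimize per outcome, net of a penalty."""
    return sum(
        PROB[w] * max(reward(a, w) - penalty(a, w) for a in ACTIONS)
        for w in OUTCOMES
    )


def zero_penalty(a, w):
    """No punishment for using the future outcome."""
    return 0.0


def ideal_penalty(a, w):
    """Reward minus its expectation; zero mean for any fixed action."""
    return reward(a, w) - expected_reward(a)


if __name__ == "__main__":
    print("primal value:            ", primal_value())
    print("dual bound, zero penalty:", dual_bound(zero_penalty))
    print("dual bound, ideal penalty:", dual_bound(ideal_penalty))
```

In this toy case the zero-penalty bound exceeds the primal value (weak duality), while the ideal penalty closes the gap (strong duality); any penalty with zero mean under non-anticipative actions would yield a valid, possibly looser, bound.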
