Chance-constrained formulation of MDPs under total reward criteria: an application to advertisement model

We consider a constrained Markov decision process (CMDP) with the total reward criterion, in which the reward and cost parameters are random and the transition probabilities are known. The CMDP problem is formulated as a joint chance-constrained Markov decision process (JCCMDP), which captures the situation where the decision maker is interested in the payoff that can be obtained with a given confidence level while the random constraints are jointly satisfied with a given probability level. When the reward and cost vectors follow elliptically symmetric distributions and the dependence among the constraints is driven by a Gumbel-Hougaard copula, we show that upper and lower bounds on the optimal value of the JCCMDP problem are given by the optimal values of two second-order cone programming problems. As an application, we consider budget optimization for advertising platforms and perform numerical experiments on randomly generated instances.
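
For readers unfamiliar with the dependence model named above, the Gumbel-Hougaard copula has the standard closed form C_theta(u_1, ..., u_K) = exp(-(sum_k (-ln u_k)^theta)^(1/theta)) with theta >= 1. The following is a minimal Python sketch of that formula only; the function name `gumbel_hougaard` and its argument layout are chosen here for illustration and are not taken from the paper, and the sketch does not reproduce the paper's second-order cone programs.

```python
import math


def gumbel_hougaard(u, theta):
    """Evaluate the Gumbel-Hougaard copula at the point u.

    u     : sequence of marginal probability levels, each in (0, 1]
    theta : dependence parameter, theta >= 1 (theta = 1 means independence)

    Hypothetical helper for illustration; not the paper's code.
    """
    if theta < 1:
        raise ValueError("theta must be at least 1")
    if any(not (0.0 < x <= 1.0) for x in u):
        raise ValueError("each component of u must lie in (0, 1]")
    # Sum of the generator values (-ln u_k)^theta over all components.
    s = sum((-math.log(x)) ** theta for x in u)
    # Apply the inverse generator to obtain the joint probability level.
    return math.exp(-(s ** (1.0 / theta)))


if __name__ == "__main__":
    levels = [0.95, 0.95, 0.95]
    for theta in (1.0, 2.0, 5.0):
        print(theta, gumbel_hougaard(levels, theta))
```

With theta = 1 the copula reduces to the product of the marginal levels (independent constraints). Larger values of theta model stronger positive dependence, and the joint level moves toward the smallest marginal level.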
