A Linearly Relaxed Approximate Linear Program for Markov Decision Processes

Approximate linear programming (ALP) and its variants have been widely applied to Markov decision processes (MDPs) with a large number of states. A serious limitation of ALP is that it has an intractable number of constraints, which motivates approximating the constraint set. In this paper, we define a linearly relaxed approximate linear program (LRALP) that has a tractable number of constraints, each obtained as a positive linear combination of the original ALP constraints. The main contribution is a novel performance bound for LRALP.
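
To make the relaxation idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes the standard reward-based ALP constraints, one per state-action pair, of the form (Φr)(s) ≥ g_a(s) + γ Σ_s' P_a(s, s') (Φr)(s'), and it forms relaxed constraints by combining them with a nonnegative weight matrix. The toy MDP, the features `PHI`, the weights `W`, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical 3-state, 2-action discounted MDP (all numbers are illustrative).
GAMMA = 0.9
# P[a][s][s2]: transition probabilities; G[a][s]: one-step rewards.
P = [
    [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]],
]
G = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0]]
# PHI[s][j]: value of feature j at state s (a constant and a linear feature).
PHI = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]


def alp_constraints(P, G, PHI, gamma):
    """Return (A, b) such that A r >= b encodes the ALP constraints.

    There is one row per (action, state) pair, ordered by action, then state.
    """
    n_states, k = len(PHI), len(PHI[0])
    A, b = [], []
    for a in range(len(P)):
        for s in range(n_states):
            row = []
            for j in range(k):
                expected_next = sum(P[a][s][s2] * PHI[s2][j] for s2 in range(n_states))
                row.append(PHI[s][j] - gamma * expected_next)
            A.append(row)
            b.append(G[a][s])
    return A, b


def relax(A, b, W):
    """Combine the rows of (A, b) using nonnegative weights.

    Each row of W yields one relaxed constraint, so the relaxed program has
    len(W) constraints instead of len(A).
    """
    if any(w_i < 0 for w in W for w_i in w):
        raise ValueError("weights must be nonnegative")
    n_rows, k = len(A), len(A[0])
    A_rel = [[sum(w[i] * A[i][j] for i in range(n_rows)) for j in range(k)] for w in W]
    b_rel = [sum(w[i] * b[i] for i in range(n_rows)) for w in W]
    return A_rel, b_rel


def is_feasible(A, b, r, tol=1e-9):
    """Check A r >= b up to a small numerical tolerance."""
    return all(
        sum(a_ij * r_j for a_ij, r_j in zip(row, r)) >= b_i - tol
        for row, b_i in zip(A, b)
    )


A, b = alp_constraints(P, G, PHI, GAMMA)      # 6 constraints (3 states x 2 actions)
# Hypothetical weights: sum the constraints of each action into a single row.
W = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
A_rel, b_rel = relax(A, b, W)                 # 2 relaxed constraints
```

Because the weights are nonnegative, any weight vector `r` that satisfies all original constraints also satisfies every relaxed one, so the relaxed feasible set contains the original one. The converse can fail: in this toy instance `r = [0.0, 12.0]` passes `is_feasible(A_rel, b_rel, r)` but not `is_feasible(A, b, r)`, which is why a performance bound for the relaxed program is needed.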
