Paper Information - Robustness in Markov Decision Problems with Uncertain Transition Matrices

Robustness in Markov Decision Problems with Uncertain Transition Matrices

Optimal solutions to Markov Decision Problems (MDPs) are very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of those probabilities is far from accurate. Hence, estimation errors are limiting factors in applying MDPs to real-world problems. We propose an algorithm for solving finite-state and finite-action MDPs, where the solution is guaranteed to be robust with respect to estimation errors on the state transition probabilities. Our algorithm involves a statistically accurate yet numerically efficient representation of uncertainty, via Kullback-Leibler divergence bounds. The worst-case complexity of the robust algorithm is the same as the original Bellman recursion. Hence, robustness can be added at practically no extra computing cost.
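
The robust recursion summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a finite-horizon, cost-minimizing MDP with zero terminal cost, a single Kullback-Leibler radius `beta` shared by every state-action pair, and the standard one-dimensional dual of the KL-constrained worst-case expectation. The names `worst_case_expectation`, `robust_value_iteration`, `cost`, `Q`, and `beta` are hypothetical and chosen only for illustration.

```python
import numpy as np

def worst_case_expectation(v, q, beta, iters=60):
    """Approximate max_p p @ v subject to KL(p || q) <= beta.

    Uses the one-dimensional dual
        min_{mu > 0} mu*beta + mu*log(sum_j q_j * exp(v_j / mu)),
    located by bisection on KL(p_mu || q) = beta, where p_mu is q
    tilted by exp(v / mu). (Assumed formulation, see the text above.)
    """
    v = np.asarray(v, dtype=float)
    q = np.asarray(q, dtype=float)
    support = q > 0                      # p must vanish wherever q does
    v, q = v[support], q[support]
    nominal = float(q @ v)
    vmax = float(v.max())
    if beta <= 0 or vmax - nominal <= 0:
        return nominal                   # no uncertainty, or v constant on the support

    def tilted_kl(mu):
        w = q * np.exp((v - vmax) / mu)  # shift by vmax for numerical stability
        p = w / w.sum()
        nz = p > 0
        return float(np.sum(p[nz] * np.log(p[nz] / q[nz])))

    # KL(p_mu || q) decreases in mu and is at most beta at the upper end.
    lo, hi = 0.0, (vmax - nominal) / beta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tilted_kl(mid) > beta:
            lo = mid
        else:
            hi = mid
    # By weak duality, any mu > 0 yields an upper bound on the worst case.
    return hi * beta + vmax + hi * float(np.log(np.sum(q * np.exp((v - vmax) / hi))))

def robust_value_iteration(cost, Q, beta, horizon):
    """Backward robust Bellman recursion.

    cost: (S, A) array of stage costs.
    Q:    (S, A, S) array of estimated (nominal) transition probabilities.
    Returns the first-stage value vector and a (horizon, S) policy table.
    """
    S, A = cost.shape
    v = np.zeros(S)                      # zero terminal cost (assumption)
    policy = np.zeros((horizon, S), dtype=int)
    for t in reversed(range(horizon)):
        qvals = np.array([[cost[i, a] + worst_case_expectation(v, Q[i, a], beta)
                           for a in range(A)] for i in range(S)])
        policy[t] = qvals.argmin(axis=1)
        v = qvals.min(axis=1)
    return v, policy
```

Setting `beta = 0` recovers the ordinary Bellman recursion on the nominal estimates, so the same routine can be used to compare nominal and robust values. Each inner problem here is a scalar bisection, which adds a per-update cost that does not grow with the number of policies considered.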

Laurent El Ghaoui | Arnab Nilim
