Paper information - Influence and variance of a Markov chain: application to adaptive discretization in optimal control

This paper addresses the difficult problem of deciding where to refine the resolution of adaptive discretizations for solving continuous time-and-space deterministic optimal control problems. We introduce two measures, the influence and the variance of a Markov chain. Influence measures the extent to which changes at one state affect the value function at other states. Variance measures the heterogeneity of the future cumulated active rewards (whose mean is the value function). We combine these two measures to derive an efficient nonlocal splitting criterion that takes into account the impact of a state on other states when deciding whether to split. We illustrate this method on the non-linear, two-dimensional "Car on the Hill" problem and on the 4D "space-shuttle" and "airplane-meeting" control problems.
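The two measures named in the abstract can be illustrated on a toy chain. The following is a minimal sketch, not the authors' implementation: it assumes a finite Markov chain with a single uniform discount factor `gamma`, reads influence as the sensitivity of the value at one state to the reward at another, and reads variance as the variance of the discounted cumulated reward. The function names, the four-state chain, and the fixed-point solver are all hypothetical choices made for illustration.

```python
# Toy illustration of "influence" and "variance" on a finite Markov chain.
# Assumptions (not from the source): uniform discount gamma < 1, a fixed
# transition matrix P whose rows sum to 1, and one reward R[s] per state.

def solve_fixed_point(P, b, discount, iters=2000):
    """Iterate x <- b + discount * P x; converges because discount < 1."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        x = [b[s] + discount * sum(P[s][t] * x[t] for t in range(n))
             for s in range(n)]
    return x

def value_function(P, R, gamma):
    """V(s) = R(s) + gamma * sum_t P[s][t] * V(t)."""
    return solve_fixed_point(P, R, gamma)

def influence(P, gamma, i):
    """Return I(i | s) for every s, read here as dV(s)/dR(i)."""
    unit = [1.0 if s == i else 0.0 for s in range(len(P))]
    return solve_fixed_point(P, unit, gamma)

def variance(P, R, gamma):
    """Variance of the discounted cumulated reward starting from each state."""
    n = len(R)
    V = value_function(P, R, gamma)
    # One-step spread of the backed-up value around V(s).
    e = [sum(P[s][t] * (gamma * V[t] - V[s] + R[s]) ** 2 for t in range(n))
         for s in range(n)]
    # The variance obeys the same recursion as V, discounted by gamma squared.
    return solve_fixed_point(P, e, gamma ** 2)

# Hypothetical chain: state 0 branches to 1 or 2 with equal probability,
# both of which lead to the absorbing, zero-reward state 3.
P = [[0.0, 0.5, 0.5, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 1.0]]
R = [0.0, 1.0, -1.0, 0.0]
gamma = 0.9

V = value_function(P, R, gamma)
infl_of_1 = influence(P, gamma, 1)   # infl_of_1[0] is about 0.45 (= gamma * 0.5)
var = variance(P, R, gamma)          # var[0] is about 0.81 (= gamma ** 2)
```

In this example state 0 has value 0 but a large variance, because its two successors carry opposite rewards. Under the reading sketched here, that is the kind of state whose heterogeneity a splitting criterion would want to account for.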

R. Munos | A. Moore

[1] J. Ross Quinlan, et al. Learning Efficient Classification Procedures and Their Application to Chess End Games, 1983.

[2] W. Fleming, et al. Controlled Markov processes and viscosity solutions, 1992.

[3] Leslie Pack Kaelbling, et al. Learning in embedded systems, 1993.

[4] A. Michael, et al. A Linear Programming Approach to Solving Stochastic Dynamic Programs, 1993.

[5] Andrew W. Moore, et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces, 2004, Machine Learning.

[6] Michael A. Trick, et al. A Linear Programming Approach to Solving Stochastic Dynamic Programming, 1993.

[7] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[8] Tyrone E. Duncan, et al. Numerical Methods for Stochastic Control Problems in Continuous Time (Harold J. Kushner and Paul G. Dupuis), 1994, SIAM Rev.

[9] J. Quadrat. Numerical methods for stochastic control problems in continuous time, 1994.

[10] Andrew W. Moore, et al. Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems, 1999, IJCAI.

[11] Geoffrey J. Gordon, et al. Approximate solutions to Markov decision processes, 1999.