Exponential Family PCA for Belief Compression in POMDPs

Standard value-function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are intractable for large models. This intractability stems largely from computing an optimal policy over the entire belief space. In real POMDP problems, however, most belief states are unlikely: the plausible beliefs lie on a structured, low-dimensional manifold embedded in the high-dimensional belief space. We introduce a new method for solving large-scale POMDPs that exploits this belief space sparsity. We reduce the dimensionality of the belief space with exponential family Principal Components Analysis (E-PCA) [5], which turns the sparse, high-dimensional belief space into a compact, low-dimensional representation in terms of learned features of the belief state. We then plan directly on the low-dimensional belief features. By planning in a low-dimensional space, we can find policies for POMDPs that are orders of magnitude larger than those conventional techniques can handle. We demonstrate the algorithm on a synthetic problem and on a mobile robot navigation task.
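
The compression step lends itself to a compact illustration. Below is a minimal sketch of exponential family PCA with a Poisson loss, fit by plain alternating gradient descent; the function name epca_poisson, the step size, the iteration count, and the toy Gaussian-bump beliefs are all illustrative assumptions, not the paper's actual setup (which uses a more sophisticated Newton-style solver; see [9]). The key property the sketch shares with the method is the link function: reconstructed beliefs exp(U Vᵀ) are automatically nonnegative, which is what makes E-PCA a good fit for sparse belief vectors. Planning would then operate on the k-dimensional rows of U rather than on the full belief simplex.

```python
import numpy as np

def epca_poisson(X, k, iters=3000, lr=0.01, seed=0):
    """Sketch of E-PCA with a Poisson link: find low-rank natural parameters
    Theta = U @ V.T minimizing the Bregman loss sum(exp(Theta) - X * Theta).
    Rows of X are sampled belief vectors; rows of U are their k-dim features."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = 0.01 * rng.standard_normal((n, k))   # belief features (one row per belief)
    V = 0.01 * rng.standard_normal((d, k))   # learned basis over states
    for _ in range(iters):
        G = np.exp(U @ V.T) - X              # gradient of the loss w.r.t. Theta
        U, V = U - lr * (G @ V), V - lr * (G.T @ U)  # alternating gradient steps
    return U, V

# Toy demo: sparse beliefs, each a narrow Gaussian bump on a 50-state grid.
rng = np.random.default_rng(1)
states = np.arange(50)
bumps = [np.exp(-0.5 * ((states - rng.uniform(0, 50)) / 2.0) ** 2)
         for _ in range(200)]
X = np.array([b / b.sum() for b in bumps])

U, V = epca_poisson(X, k=3)                  # 50-dim beliefs -> 3 features each
B_hat = np.exp(U @ V.T)
B_hat /= B_hat.sum(axis=1, keepdims=True)    # renormalize reconstructions
print("mean L1 reconstruction error:", np.abs(B_hat - X).sum(axis=1).mean())
```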

[1] Sebastian Thrun, et al. Coastal Navigation with Mobile Robots, 1999, NIPS.

[2] Wolfram Burgard, et al. Probabilistic Algorithms and the Interactive Museum Tour-Guide Robot Minerva, 2000, Int. J. Robotics Res.

[3] Milos Hauskrecht. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.

[4] Geoffrey E. Hinton, et al. Global Coordination of Local Linear Models, 2001, NIPS.

[5] Sanjoy Dasgupta, et al. A Generalization of Principal Components Analysis to the Exponential Family, 2001, NIPS.

[6] I. Jolliffe. Principal Component Analysis, 2002.

[7] Leslie Pack Kaelbling, et al. Acting under uncertainty: discrete Bayesian models for mobile-robot navigation, 1996, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '96).

[8] Sebastian Thrun. Monte Carlo POMDPs, 1999, NIPS.

[9] Geoffrey J. Gordon. Generalized² Linear² Models, 2002, NIPS.

[10] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.

[11] Peter L. Bartlett, et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent, 2000, ICML.

[12] S. T. Roweis, et al. Nonlinear dimensionality reduction by locally linear embedding, 2000, Science.

[13] Jeff G. Schneider, et al. Autonomous helicopter control using reinforcement learning policy search methods, 2001, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA).

[14] Andrew Y. Ng, et al. Policy Search via Density Estimation, 1999, NIPS.

[15] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.

[16] H. Sebastian Seung, et al. Learning the parts of objects by non-negative matrix factorization, 1999, Nature.

[17] J. Tenenbaum, et al. A global geometric framework for nonlinear dimensionality reduction, 2000, Science.