An online algorithm for constrained POMDPs

This work addresses the problem of planning under uncertainty in the presence of constraints. Such problems arise in many settings, including the motivating application for this work: planning for a team of first responders (both humans and robots) operating in an urban environment. The problem is framed as a constrained Partially Observable Markov Decision Process (POMDP), and it is shown that, even in a relatively simple planning problem, modeling constraints as large penalties does not lead to good solutions. The main contribution of the work is a new online algorithm that explicitly ensures constraint feasibility while remaining computationally tractable. Its performance is demonstrated on an example problem, where it generates policies comparable to those of an offline constrained POMDP algorithm.
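For reference, a standard constrained POMDP formulation (the abstract does not state the paper's exact model, so the reward R, cost functions C_i, budgets c_i, and discount factor \gamma below are generic placeholders) maximizes expected discounted reward subject to bounds on expected discounted costs:

% Generic constrained POMDP objective; symbols are illustrative, not taken from the paper.
\begin{aligned}
\max_{\pi}\quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, R(s_t, a_t)\right] \\
\text{s.t.}\quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, C_i(s_t, a_t)\right] \le c_i, \qquad i = 1, \dots, k,
\end{aligned}

where \pi is a policy mapping belief states to actions. Folding the costs C_i into the reward as large penalties removes the explicit constraints, which is the approximation the abstract argues yields poor solutions even on simple problems.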