Paper information - An Online POMDP Algorithm Used by the PoliceForce Agents in the RoboCupRescue Simulation

In the RoboCupRescue simulation, the PoliceForce agents must decide which roads to clear so that other agents can navigate the city. In this article, we show how we modelled their environment as a partially observable Markov decision process (POMDP) and, more importantly, we present a new online POMDP algorithm that lets them make good decisions in real time during the simulation. The algorithm uses a look-ahead search to find the best action to execute at each cycle, which avoids the overwhelming complexity of computing a policy for every possible situation. To show the efficiency of the algorithm, we present results on standard POMDPs and in the RoboCupRescue simulation environment.
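The look-ahead idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it is a plain depth-limited expectimax search over belief states, run on a hypothetical single-road toy model. The state, action and observation names, all probabilities, the rewards and the discount factor `GAMMA` are assumptions made up for illustration.

```python
from typing import Dict, Tuple

Belief = Dict[str, float]

# Hypothetical toy model (not from the paper): one road, blocked or clear.
STATES = ["blocked", "clear"]
ACTIONS = ["clear_road", "wait"]
OBSERVATIONS = ["see_blocked", "see_clear"]
GAMMA = 0.95  # assumed discount factor


def transition(state: str, action: str) -> Dict[str, float]:
    """P(next_state | state, action); clearing succeeds with probability 0.9."""
    if action == "clear_road" and state == "blocked":
        return {"blocked": 0.1, "clear": 0.9}
    if state == "blocked":
        return {"blocked": 1.0, "clear": 0.0}
    return {"blocked": 0.0, "clear": 1.0}


def observation(next_state: str) -> Dict[str, float]:
    """P(observation | next_state); the assumed sensor is right 80% of the time."""
    if next_state == "blocked":
        return {"see_blocked": 0.8, "see_clear": 0.2}
    return {"see_blocked": 0.2, "see_clear": 0.8}


def reward(state: str, action: str) -> float:
    """Assumed costs: a blocked road costs 5 per step, clearing costs 1."""
    penalty = 5.0 if state == "blocked" else 0.0
    cost = 1.0 if action == "clear_road" else 0.0
    return -(penalty + cost)


def update_belief(belief: Belief, action: str, obs: str) -> Tuple[float, Belief]:
    """Bayes filter step: returns P(obs | belief, action) and the posterior."""
    unnormalised = {}
    for s2 in STATES:
        predicted = sum(belief[s] * transition(s, action)[s2] for s in STATES)
        unnormalised[s2] = observation(s2)[obs] * predicted
    p_obs = sum(unnormalised.values())
    if p_obs == 0.0:
        return 0.0, dict(belief)  # unreachable observation; caller skips it
    return p_obs, {s: p / p_obs for s, p in unnormalised.items()}


def q_value(belief: Belief, action: str, depth: int) -> float:
    """Expected return of `action` now, then acting greedily for depth-1 steps."""
    value = sum(belief[s] * reward(s, action) for s in STATES)
    if depth <= 1:
        return value
    for obs in OBSERVATIONS:
        p_obs, next_belief = update_belief(belief, action, obs)
        if p_obs > 0.0:
            best_next = max(q_value(next_belief, a, depth - 1) for a in ACTIONS)
            value += GAMMA * p_obs * best_next
    return value


def best_action(belief: Belief, depth: int = 3) -> str:
    """Search `depth` steps ahead from the current belief and pick one action."""
    return max(ACTIONS, key=lambda a: q_value(belief, a, depth))


if __name__ == "__main__":
    print(best_action({"blocked": 0.9, "clear": 0.1}))
```

The search is rerun from the current belief at every cycle, so only beliefs reachable within `depth` steps are ever evaluated, and no policy over the whole belief space is computed. The cost grows exponentially with `depth`, so a real-time agent would have to bound the depth or prune the tree.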
