Solving Decentralized Continuous Markov Decision Problems with Structured Reward

We present an approximation method that solves a class of Decentralized hybrid Markov Decision Processes (DEC-HMDPs). These DEC-HMDPs have both discrete and continuous state variables and model individual agents whose state includes continuous, measurable quantities such as resources. On top of the inherent complexity of decentralized problems, continuous state variables cause a blowup in the number of potential decision points. Representing value functions as Rectangular Piecewise Constant (RPWC) functions, we formalize and detail an extension of the Coverage Set Algorithm (CSA) [1] that solves transition-independent DEC-HMDPs with controlled error. We apply our algorithm to a range of multi-robot exploration problems with continuous resource constraints.
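
To make the RPWC representation concrete, the following is a minimal sketch of a value function that is constant on axis-aligned rectangles of a continuous state space. It is an illustration only, not the paper's implementation: the names `Piece` and `RPWCFunction`, the half-open rectangle convention, the default value outside all pieces, and the two example axes (remaining time, remaining energy) are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Piece:
    """One axis-aligned rectangle carrying a constant value (hypothetical type)."""
    lower: Tuple[float, ...]  # inclusive lower corner
    upper: Tuple[float, ...]  # exclusive upper corner
    value: float

    def contains(self, x: Sequence[float]) -> bool:
        # A point lies in the piece if every coordinate is in [lower, upper).
        return all(lo <= xi < hi for lo, xi, hi in zip(self.lower, x, self.upper))


class RPWCFunction:
    """Rectangular piecewise constant function over a continuous state space."""

    def __init__(self, pieces: List[Piece], default: float = 0.0):
        self.pieces = pieces
        self.default = default  # assumed value outside every rectangle

    def __call__(self, x: Sequence[float]) -> float:
        # Linear scan over the pieces; the first rectangle containing x wins.
        for piece in self.pieces:
            if piece.contains(x):
                return piece.value
        return self.default


# Assumed example: axes are (remaining time, remaining energy).
value_fn = RPWCFunction([
    Piece((0.0, 0.0), (5.0, 10.0), 0.0),   # too little time left: no reward
    Piece((5.0, 0.0), (10.0, 10.0), 3.0),  # enough time left: reward 3
])
```

The linear scan keeps the sketch short; with many pieces, a spatial index such as a tree over the rectangles would be a natural replacement for the lookup.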

[1] Shlomo Zilberstein, et al. Dynamic Programming for Partially Observable Stochastic Games, 2004, AAAI.

[2] François Charpillet, et al. MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs, 2005, UAI.

[3] Jon Louis Bentley, et al. An Algorithm for Finding Best Matches in Logarithmic Expected Time, 1977, TOMS.

[4] Subbarao Kambhampati, et al. Effective Approaches for Partial Satisfaction (Over-Subscription) Planning, 2004, AAAI.

[5] Ronen I. Brafman, et al. Planning with Continuous Resources in Stochastic Domains, 2005, IJCAI.

[6] David E. Smith. Choosing Objectives in Over-Subscription Planning, 2004, ICAPS.

[7] Zhengzhu Feng, et al. Dynamic Programming for Structured Continuous Markov Decision Problems, 2004, UAI.

[8] David E. Smith, et al. Planning Under Continuous Time and Resource Uncertainty: A Challenge for AI, 2002, AIPS Workshop on Planning for Temporal Domains.

[9] Lihong Li, et al. Lazy Approximation for Solving Continuous Finite-Horizon MDPs, 2005, AAAI.

[10] François Charpillet, et al. Point-based Dynamic Programming for DEC-POMDPs, 2006, AAAI.

[11] Milind Tambe, et al. A Fast Analytical Algorithm for Solving Markov Decision Processes with Real-Valued Resources, 2007, IJCAI.

[12] John Amanatides, et al. Merging BSP Trees Yields Polyhedral Set Operations, 1990, SIGGRAPH.

[13] Austin Tate, et al. Proceedings of the Nineteenth National Conference on Artificial Intelligence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, July 25-29, 2004, San Jose, California, USA, 2004, AAAI 2004.

[14] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.

[15] Claudia V. Goldman, et al. Solving Transition Independent Decentralized Markov Decision Processes, 2004, J. Artif. Intell. Res.