A Bilinear Programming Approach for Multiagent Planning

Multiagent planning and coordination problems are common and known to be computationally hard. We show that a wide range of two-agent problems can be formulated as bilinear programs. We present a successive approximation algorithm that significantly outperforms the coverage set algorithm, the state-of-the-art method for this class of multiagent problems. Because the algorithm is formulated for bilinear programs, it is more general and simpler to implement. The new algorithm can be terminated at any time and, unlike the coverage set algorithm, it facilitates the derivation of a useful online performance bound. It is also much more efficient, reducing the time needed to compute the optimal solution by about four orders of magnitude on average. Finally, we introduce an automatic dimensionality reduction method that improves the effectiveness of the algorithm, extending its applicability to new domains and providing a new way to analyze a subclass of bilinear programs.
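
For orientation, a bilinear program with separable constraints is commonly written in the form below. This is a sketch of the standard form only; the symbols (x, y, r_1, r_2, C, A_1, A_2, b_1, b_2) are notational assumptions and do not appear in the abstract.

```latex
\begin{align*}
\max_{x,\,y} \quad & r_1^{\top} x + x^{\top} C y + r_2^{\top} y \\
\text{s.t.} \quad & A_1 x = b_1, \quad x \ge 0, \\
                  & A_2 y = b_2, \quad y \ge 0 .
\end{align*}
```

Under this reading, x and y could stand for the decision variables of the two agents, with the matrix C carrying their interaction. For any fixed x the problem is a linear program in y, and vice versa; the difficulty comes from the product term x^T C y.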
