Linear support for multi-objective coordination graphs

Many real-world decision problems require making trade-offs among multiple objectives. However, in some cases, the relative importance of these objectives is not known when the problem is solved, precluding the use of single-objective methods. Instead, multi-objective methods, which compute the set of all potentially useful solutions, are required. This paper proposes variable elimination linear support (VELS), a new multi-objective algorithm for multi-agent coordination that exploits loose couplings to compute the convex coverage set (CCS): the set of optimal solutions for all possible weights for linearly weighted objectives. Unlike existing methods, VELS exploits insights from POMDP solution methods to build the CCS incrementally. We prove the correctness of VELS and show that for moderate numbers of objectives its complexity is better than that of previous methods. Furthermore, we present empirical results showing that VELS can tackle both random and realistic problems with many more agents than was previously feasible. The incremental nature of VELS also makes it an anytime algorithm, i.e., its intermediate results constitute ε-optimal approximations of the CCS, with ε decreasing the longer it runs. Our empirical results show that, by allowing even very small ε, VELS can enable large additional speedups.
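
To make the notion of a CCS and of building it incrementally more concrete, the following is a minimal sketch for the two-objective case. It is not the VELS algorithm from the paper: the single-objective solver here is a plain maximization over an explicit list of payoff vectors, whereas VELS would call variable elimination on a coordination graph. All names (`scalarize`, `make_solver`, `linear_support_2d`) and the example payoff vectors are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

Vector = Tuple[float, float]


def scalarize(w1: float, v: Vector) -> float:
    """Linearly weighted value of v under weights (w1, 1 - w1)."""
    return w1 * v[0] + (1.0 - w1) * v[1]


def make_solver(candidates: List[Vector]) -> Callable[[float], Vector]:
    """Stand-in single-objective solver (assumption: explicit candidate list)."""
    def solve(w1: float) -> Vector:
        # Ties in scalarized value are broken by the sum of the objectives,
        # so the extreme weights do not return weakly dominated vectors.
        return max(candidates, key=lambda v: (scalarize(w1, v), v[0] + v[1]))
    return solve


def linear_support_2d(solve: Callable[[float], Vector],
                      tol: float = 1e-9) -> List[Vector]:
    """Build a CCS incrementally by solving only at corner weights."""
    left, right = solve(0.0), solve(1.0)
    ccs = [left] if left == right else [left, right]
    stack = [(left, right)]
    while stack:
        u, v = stack.pop()
        if u == v:
            continue
        # Corner weight: the w1 at which u and v have equal scalarized value.
        denom = (u[0] - u[1]) - (v[0] - v[1])
        if abs(denom) < tol:
            continue
        w1 = (v[1] - u[1]) / denom
        x = solve(w1)
        # A strictly better vector at the corner extends the partial CCS.
        if x not in ccs and scalarize(w1, x) > scalarize(w1, u) + tol:
            ccs.append(x)
            stack.append((u, x))
            stack.append((x, v))
    return sorted(ccs)


# Hypothetical payoff vectors; (1, 3.5) is Pareto-optimal but lies below the
# line between (0, 4) and (3, 3), so no linear weighting makes it optimal.
payoffs = [(4, 0), (0, 4), (3, 3), (1, 1), (1, 3.5)]
result = linear_support_2d(make_solver(payoffs))
print(result)
```

Each call to `solve` corresponds to one single-objective problem, and the partial set held in `ccs` at any point in the loop is what an anytime variant would return early. Bounding the ε of such a partial result is not attempted in this sketch.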
