Real-Time Solving of Quantified CSPs Based on Monte-Carlo Game Tree Search

We develop a real-time algorithm based on a Monte-Carlo game tree search for solving a quantified constraint satisfaction problem (QCSP), which is a CSP where some variables are universally quantified. A universally quantified variable represents a choice of nature or an adversary. The goal of a QCSP is to make a robust plan against an adversary. However, obtaining a complete plan off-line is intractable when the size of the problem becomes large. Thus, we need to develop a real-time algorithm that sequentially selects a promising value at each deadline. Such a problem has been considered in the field of game tree search. In a standard game tree search algorithm, developing a good static evaluation function is crucial. However, developing a good static evaluation function for a QCSP is very difficult since it must estimate the possibility that a partially assigned QCSP is solvable. Thus, we apply a Monte-Carlo game tree search technique called UCT. However, the simple application of the UCT algorithm does not work since the player and the adversary are asymmetric, i.e., game sequences in which the player wins are very rare. We overcome this difficulty by introducing constraint propagation techniques. We experimentally compare the winning probability of our UCT-based algorithm and the state-of-the-art alpha-beta search algorithm. Our results show that our algorithm outperforms the state-of-the-art algorithm in large-scale problems.
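As a rough illustration of the value-selection step that UCT performs at each tree node, the following minimal sketch applies the standard UCB1 rule to per-value statistics. It is not the authors' implementation: the names `ucb1` and `select_value`, the `stats` layout, and the exploration constant are assumptions made for this example, and the constraint propagation described above is not modelled.

```python
import math


def ucb1(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited values are always tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def select_value(stats, existential=True, c=math.sqrt(2)):
    """Pick the value with the highest UCB1 score at one node.

    stats maps value -> (player_wins, visits)   (hypothetical layout).
    At an existential node the player maximizes its own win rate; at a
    universal node the adversary maximizes the player's loss rate.
    """
    parent_visits = sum(visits for _, visits in stats.values())

    def score(item):
        _, (wins, visits) = item
        if visits == 0:
            return math.inf
        mean = wins / visits
        if not existential:
            mean = 1.0 - mean  # reward from the adversary's point of view
        return ucb1(mean, visits, parent_visits, c)

    return max(stats.items(), key=score)[0]


if __name__ == "__main__":
    stats = {0: (9, 10), 1: (1, 10)}
    print(select_value(stats, existential=True))
    print(select_value(stats, existential=False))
```

Flipping the reward at universal nodes is one simple way to represent the adversary in this sketch; it does not by itself address the asymmetry noted above, where winning sequences for the player are rare.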
