Paper information - Real-Time Solving of Quantified CSPs Based on Monte-Carlo Game Tree Search

We develop a real-time algorithm, based on Monte-Carlo game tree search, for solving quantified constraint satisfaction problems (QCSPs). A QCSP is a CSP in which some variables are universally quantified; a universally quantified variable represents a choice made by nature or by an adversary. The goal in a QCSP is to find a plan that is robust against the adversary. However, computing a complete plan off-line becomes intractable as the problem grows. We therefore need a real-time algorithm that sequentially selects a promising value before each deadline. Problems of this kind have been studied in game tree search, where standard algorithms depend on a good static evaluation function. Such a function is very difficult to design for a QCSP, since it must estimate how likely a partially assigned QCSP is to be solvable. We therefore apply a Monte-Carlo game tree search technique called UCT. A naive application of UCT does not work, because the player and the adversary are asymmetric: game sequences in which the player wins are very rare. We overcome this difficulty by introducing constraint propagation techniques. We experimentally compare the winning probability of our UCT-based algorithm with that of the state-of-the-art alpha-beta search algorithm. The results show that our algorithm outperforms the alpha-beta algorithm on large-scale problems.
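
The two ideas the abstract combines can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy problem, the function names (`player_wins`, `ucb1`, `select_value`) and the exploration constant `c` are all assumptions made for illustration, and the constraint propagation step is omitted. The first function treats a QCSP as a two-player game and evaluates it exhaustively, which is exactly the off-line approach that becomes intractable for large problems. The second part shows the UCB1 selection rule on which UCT is built, which chooses the next value to sample from win/visit statistics instead of a static evaluation function.

```python
import math


def player_wins(quantifiers, domain, constraint, assignment=None, depth=0):
    """Exhaustively decide a toy QCSP viewed as a game.

    quantifiers: list of (name, kind) with kind "exists" (player move)
                 or "forall" (adversary move), in prefix order.
    constraint:  function from a full assignment dict to bool.
    """
    assignment = assignment or {}
    if depth == len(quantifiers):
        return constraint(assignment)
    name, kind = quantifiers[depth]
    outcomes = (
        player_wins(quantifiers, domain, constraint,
                    {**assignment, name: value}, depth + 1)
        for value in domain
    )
    # The player needs one good value; the adversary needs one bad value.
    return any(outcomes) if kind == "exists" else all(outcomes)


def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus.

    c is an assumed exploration constant; unvisited values score infinity
    so that each value is tried at least once.
    """
    if visits == 0:
        return math.inf
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)


def select_value(stats, c=math.sqrt(2)):
    """Pick the value with the highest UCB1 score.

    stats maps each candidate value to a (wins, visits) pair.
    """
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda value: ucb1(*stats[value], parent_visits, c))


# Hypothetical toy problem: exists x, forall y, exists z such that
# z differs from both x and y.
PREFIX = [("x", "exists"), ("y", "forall"), ("z", "exists")]


def differs(a):
    return a["z"] != a["x"] and a["z"] != a["y"]
```

With three values per variable the player can always pick a `z` that differs from `x` and `y`, so `player_wins(PREFIX, (0, 1, 2), differs)` is true; with only two values the adversary can choose `y` different from `x` and the player loses. In a real-time setting the exhaustive call would be replaced by repeated playouts whose results update the `stats` table passed to `select_value`.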
