An UCT Approach for Anytime Agent-Based Planning

In this paper, we introduce MHSP, a new heuristic search algorithm for anytime planning based on mean values. It combines the principles of UCT, a bandit-based algorithm that has produced very good results in computer games, especially in Computer Go, with heuristic search. The result is an anytime planner that provides partial plans before it finds a solution plan and, given further time, an optimal plan. The algorithm is evaluated on several classical planning problems and compared with some major planning algorithms. Our results highlight the capacity of MHSP to return partial plans that tend toward an optimal plan over time.
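
As background on the bandit principle that UCT builds on, the following is a minimal sketch of the generic UCB1 selection rule, which picks the child with the highest mean value plus an exploration bonus. It is not the MHSP algorithm itself; the function name `ucb1_select`, its parameters, and the default exploration constant are assumptions made for illustration.

```python
import math


def ucb1_select(mean_values, visit_counts, exploration=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    Hypothetical helper: mean_values[i] is the average reward observed
    for child i, and visit_counts[i] is how often child i was tried.
    """
    total_visits = sum(visit_counts)
    best_index, best_score = None, -math.inf
    for i, (mean, visits) in enumerate(zip(mean_values, visit_counts)):
        if visits == 0:
            # Try every child once before comparing scores.
            return i
        # Mean value plus a bonus that shrinks as the child is visited more.
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        score = mean + bonus
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With equal visit counts the rule reduces to choosing the best mean value, while a rarely visited child receives a large bonus and is revisited. A planner that minimizes plan cost rather than maximizing reward would need the rule adapted accordingly.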
