The hierarchical task network planning method based on Monte Carlo Tree Search

Abstract Hierarchical task network (HTN) planning depends on domain knowledge of the problem, so the planning result is sensitive to the order in which the decomposition methods are written. Moreover, the plan it returns is usually only a feasible solution, which leaves HTN planners weak at finding the optimal one. To reduce the dependence of HTN planning on domain knowledge and obtain better plans, Pyhop-m, an HTN planning algorithm based on Monte Carlo Tree Search (MCTS), is proposed. During planning, MCTS builds a planning tree that guides the HTN planner to choose the best decomposition method. Experiments show that, in both static and dynamic environments, Pyhop-m outperforms the existing Pyhop and the heuristic-based Pyhop-h in plan length, planning success rate, and optimal solution rate. At the 95% confidence level, the confidence intervals for the planning success rate and the optimal solution rate of Pyhop-m in the dynamic environment are [75.82%, 89.18%] and [88.67%, 93.95%], respectively, which are significantly higher than those of Pyhop-h, [58.19%, 77.81%] and [69.91%, 80.69%]. Pyhop-m can also solve planning problems with uncertain action execution by repeatedly simulating and evaluating the leaf nodes of the planning tree. These results indicate that Pyhop-m makes the planning result independent of the writing order of the decomposition methods and can also find the globally optimal solution.
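The idea of letting simulated rollouts, rather than writing order, decide which decomposition method to apply can be illustrated with a minimal sketch. It is not the Pyhop-m implementation: it shows only a flat UCB1 selection over the methods of a single task, whereas the abstract describes a full planning tree. The toy domain (`PLAN_LENGTH`, `FAIL_PROB`), the function names, the reward definition, and the exploration constant `c` are all assumptions made for illustration.

```python
import math
import random


def ucb1(total_reward, visits, parent_visits, c=1.4):
    """UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # try every method at least once
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)


def choose_method(methods, simulate, iterations=2000, c=1.4, rng=None):
    """Pick a decomposition method by repeated simulation (flat UCB1 bandit)."""
    rng = rng or random.Random(0)
    stats = {name: [0.0, 0] for name in methods}  # name -> [reward sum, visits]
    for i in range(1, iterations + 1):
        name = max(methods, key=lambda m: ucb1(stats[m][0], stats[m][1], i, c))
        stats[name][0] += simulate(name, rng)
        stats[name][1] += 1
    # The most visited method is the recommendation.
    return max(methods, key=lambda m: stats[m][1])


# Hypothetical toy domain: three methods for one "travel" task.
PLAN_LENGTH = {"walk": 6, "taxi": 3, "bus": 4}        # actions in the resulting plan
FAIL_PROB = {"walk": 0.0, "taxi": 0.1, "bus": 0.5}    # uncertain action execution
SHORTEST = min(PLAN_LENGTH.values())


def simulate(method, rng):
    """One rollout: reward 0 on failure, otherwise shorter plans score higher."""
    if rng.random() < FAIL_PROB[method]:
        return 0.0
    return SHORTEST / PLAN_LENGTH[method]


# "walk" is listed first, but the choice does not depend on that order.
best = choose_method(["walk", "taxi", "bus"], simulate)
print(best)
```

In this sketch the list order only breaks ties on the very first visits; the recommendation is driven by the simulated rewards, which also account for the chance that an action fails.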
