Learning Hierarchical Task Networks for Nondeterministic Planning Domains

This paper describes how to learn Hierarchical Task Networks (HTNs) in nondeterministic planning domains, where actions may have multiple possible outcomes. We discuss several desirable properties that guarantee the resulting HTNs will correctly handle the nondeterminism in the domain. We developed a new learning algorithm, called HTN-MAKERND, that exploits these properties, and implemented it on top of the recently proposed HTN-MAKER system, a goal-regression-based HTN learning approach. Our theoretical study shows that HTN-MAKERND soundly produces HTN planning knowledge in low-order polynomial time, despite the nondeterminism. In our experiments with two nondeterministic planning domains, ND-SHOP2, a well-known HTN planning algorithm for nondeterministic domains, used the learned HTNs to outperform the well-known planner MBP, in some cases by about three orders of magnitude.
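
Two ingredients mentioned in the abstract are worth making concrete: actions whose single application may produce one of several outcomes, and goal regression over one chosen outcome, which is the core step in goal-regression-based HTN learning. The sketch below is a minimal Python illustration, not the authors' implementation; every name in it (NDAction, Method, regress_goal, the move example) is hypothetical.

```python
# Minimal sketch of the data structures discussed in the abstract:
# a nondeterministic action with several alternative outcomes, a learned
# HTN method, and a single goal-regression step. Hypothetical names only;
# this is not HTN-MAKERND itself.
from dataclasses import dataclass

State = frozenset  # a state is a set of ground atoms, e.g. {"at(r1,l1)"}

@dataclass(frozen=True)
class NDAction:
    """An action with one precondition set and several alternative outcomes."""
    name: str
    preconds: frozenset
    outcomes: tuple  # each outcome is an (add_set, del_set) pair

    def applicable(self, state: State) -> bool:
        return self.preconds <= state

    def successors(self, state: State):
        """All states the action may nondeterministically produce."""
        if not self.applicable(state):
            return []
        return [State((state - dels) | adds) for adds, dels in self.outcomes]

@dataclass
class Method:
    """A learned HTN method: decompose `task` into `subtasks` when `preconds` hold."""
    task: str
    preconds: frozenset
    subtasks: list

def regress_goal(goal: frozenset, adds: frozenset, dels: frozenset,
                 preconds: frozenset) -> frozenset:
    """One goal-regression step: what must hold before the action so that
    `goal` holds after an outcome with effects (adds, dels)?"""
    assert not (goal & dels), "a deleted atom cannot survive this outcome"
    return frozenset((goal - adds) | preconds)

# Tiny usage example: a 'move' action whose wheels may slip.
move = NDAction(
    name="move(r1,l1,l2)",
    preconds=frozenset({"at(r1,l1)"}),
    outcomes=(
        (frozenset({"at(r1,l2)"}), frozenset({"at(r1,l1)"})),  # intended outcome
        (frozenset(), frozenset()),                            # slip: state unchanged
    ),
)
s0 = State({"at(r1,l1)"})
print(move.successors(s0))  # two possible successor states
print(regress_goal(frozenset({"at(r1,l2)"}), *move.outcomes[0], move.preconds))
m = Method(task="deliver(r1,l2)", preconds=frozenset({"at(r1,l1)"}),
           subtasks=["move(r1,l1,l2)"])
```

A full learner would of course have to account for every outcome of each action, not just one, to obtain the correctness guarantees the paper discusses; the sketch only shows the single-outcome regression step.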
