Learning High-Level Planning from Text

Comprehending action preconditions and effects is an essential step in modeling the dynamics of the world. In this paper, we express the semantics of precondition relations extracted from text in terms of planning operations. The challenge in modeling this connection is to ground language at the level of relations. This type of grounding enables us to create high-level plans based on language abstractions. Our model jointly learns to predict precondition relations from text and to perform high-level planning guided by those relations. We implement this idea in the reinforcement learning framework, using feedback obtained automatically from plan execution attempts. When applied to a complex virtual world and text describing that world, our relation extraction technique performs on par with a supervised baseline, yielding an F-measure of 66% compared to the baseline's 65%. Additionally, we show that a high-level planner utilizing these extracted relations significantly outperforms a strong, text-unaware baseline, successfully completing 80% of planning tasks compared to 69% for the baseline.
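The idea of learning relation predictions from plan-execution feedback can be illustrated with a toy sketch. This is not the paper's model: it assumes a REINFORCE-style policy-gradient update on a Bernoulli log-linear predictor, one-hot features for two made-up candidate pairs, and a hypothetical `execute_plan` function that returns +1 when acting on the prediction succeeds and -1 otherwise. All names and the reward scheme are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_prob(weights, features):
    """Probability that a candidate pair is a precondition relation."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def reinforce_step(weights, features, action, reward, lr=0.1):
    """One policy-gradient update.

    For a Bernoulli log-linear policy, the gradient of log p(action)
    with respect to the weights is (action - p) * features.
    """
    p = predict_prob(weights, features)
    return [w + lr * reward * (action - p) * x
            for w, x in zip(weights, features)]


def train(candidates, execute_plan, episodes=200, seed=0):
    """Sample predictions, observe execution feedback, update weights."""
    rng = random.Random(seed)
    weights = [0.0] * len(candidates[0][0])
    for _ in range(episodes):
        features, pair = rng.choice(candidates)
        p = predict_prob(weights, features)
        action = 1 if rng.random() < p else 0   # sampled relation prediction
        reward = execute_plan(pair, action)     # feedback from the "planner"
        weights = reinforce_step(weights, features, action, reward)
    return weights


# Hypothetical toy world: only the first pair is a true precondition.
TRUE_RELATIONS = {("get_wood", "build_tool")}
CANDIDATES = [
    ([1.0, 0.0], ("get_wood", "build_tool")),
    ([0.0, 1.0], ("build_tool", "get_wood")),
]


def execute_plan(pair, action):
    """Stand-in for plan execution: +1 if the prediction helped, else -1."""
    correct = (pair in TRUE_RELATIONS) == (action == 1)
    return 1.0 if correct else -1.0


if __name__ == "__main__":
    learned = train(CANDIDATES, execute_plan)
    for features, pair in CANDIDATES:
        print(pair, round(predict_prob(learned, features), 3))
```

In this sketch the only supervision is the scalar reward from the simulated execution attempt; no relation labels are given to the learner, which mirrors the feedback setting described above in a much simplified form.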
