A Relational Approach to Tool-Use Learning in Robots

We present a robot agent that learns to exploit objects in its environment as tools, allowing it to solve problems that would otherwise be beyond its abilities. The agent learns by watching a single demonstration of tool use by a teacher and then experimenting in the world with a variety of available tools. A variation of explanation-based learning (EBL) first identifies the most important sub-goals the teacher achieved using the tool. The action model constructed from this explanation is then refined by trial-and-error learning with a novel Inductive Logic Programming (ILP) algorithm that generates informative experiments while constraining the search to a practical number of trials. Relational learning generalises across objects and tasks, capturing the spatial and structural constraints that describe useful tools and how they should be employed. The system is evaluated in a simulated robot environment.
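
To make this cycle concrete, the following Python sketch caricatures the demonstrate-then-experiment loop: a single demonstration yields an over-specific relational model, and successful experiments with other tools strip away its incidental details. Everything here (the predicate names, the candidate tools, the `run_experiment` stub, and the set-intersection generalisation) is an assumption made for illustration, not the paper's actual EBL/ILP machinery.

```python
# Illustrative sketch only: the real system builds its initial model by
# explanation-based learning over a plan and refines it with a dedicated
# ILP algorithm, not the crude intersection-based generalisation below.

# A tool-use hypothesis: a conjunction of relational literals that a
# usable tool (and its placement) must satisfy.
Hypothesis = frozenset

# Over-specific model from the single teacher demonstration; it still
# contains incidental properties of the particular tool the teacher used.
demo_model: Hypothesis = frozenset({
    "attached(side, end)",    # the tool has a hook at one end
    "angle(side, end, 90)",   # ... at roughly a right angle
    "length(handle, 40)",     # incidental detail of the demo tool
    "colour(handle, red)",    # incidental detail of the demo tool
})


def run_experiment(tool: Hypothesis) -> bool:
    """Stand-in for executing one trial in the simulator.  Here a tool
    succeeds iff it has a hooked end at a right angle."""
    return {"attached(side, end)", "angle(side, end, 90)"} <= tool


def generalise(model: Hypothesis, success: Hypothesis) -> Hypothesis:
    """Keep only the literals shared with a successful experiment: a
    crude stand-in for relational generalisation.  A fuller treatment
    would also specialise the model against failed experiments."""
    return model & success


# Candidate tools available for experimentation, in the same language.
candidates = [
    frozenset({"attached(side, end)", "angle(side, end, 90)",
               "length(handle, 60)", "colour(handle, blue)"}),
    frozenset({"angle(side, end, 90)",           # no hooked end, so
               "length(handle, 40)",             # this tool fails
               "colour(handle, red)"}),
]

model = demo_model
for tool in candidates:
    if run_experiment(tool):
        model = generalise(model, tool)

# Incidental details drop out, leaving the structural constraints:
# frozenset({'attached(side, end)', 'angle(side, end, 90)'})
print(model)
```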
