Learning action models for the improved execution of navigation plans

Abstract. Most state-of-the-art navigation systems for autonomous service robots decompose navigation into global navigation planning and local reactive navigation. While the methods for navigation planning and local navigation are themselves well understood, the plan execution problem, that is, how to generate and parameterize local navigation tasks from a given navigation plan, remains largely unsolved. This paper describes how a robot can autonomously learn to execute navigation plans. We formalize the problem as a Markov Decision Process (MDP) and derive a decision-theoretic action selection function from it. The action selection function employs models of the robot's navigation actions, which are acquired autonomously from experience using neural network or regression tree learning algorithms. We show, both in simulation and on an RWI B21 mobile robot, that the learned models, together with the derived action selection function, achieve competent navigation behavior.
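To make the decision-theoretic selection step concrete: given a learned action model that predicts the utility of executing a parameterized local navigation action in a given state, the robot picks roughly a* = argmax_a E[U | s, a]. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the state and action features, the toy utility signal, and the use of scikit-learn's DecisionTreeRegressor (standing in for the regression tree learner the abstract mentions) are all hypothetical choices for illustration.

```python
# Minimal sketch: greedy expected-utility action selection with a learned
# action model. Feature layout and utility are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Hypothetical experience data; columns:
# [distance_to_goal, clearance, target_speed v, turn_rate w]
X = rng.uniform(size=(500, 4))
# Toy utility: shorter estimated travel time (distance / speed) is better.
y = -X[:, 0] / (X[:, 2] + 0.1)

# Learned action model: predicts utility from (state, action) features.
model = DecisionTreeRegressor(max_depth=5).fit(X, y)

def select_action(state, candidates):
    """Return the candidate action (v, w) with the highest predicted utility."""
    feats = np.array([np.concatenate([state, a]) for a in candidates])
    return candidates[int(np.argmax(model.predict(feats)))]

state = np.array([2.0, 0.5])  # [distance_to_goal, clearance]
candidates = [np.array([v, w]) for v in (0.2, 0.5, 0.8) for w in (-0.3, 0.0, 0.3)]
print("chosen (v, w):", select_action(state, candidates))
```

In the paper's setting, the utility would derive from the MDP's expected cost of the local navigation behavior, and the model would be trained on the robot's own execution experience rather than on synthetic data as here.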
