Reinforcement learning for sensing strategies

Since sensors have limited range and coverage, mobile robots often have to decide where to point their sensors. A good sensing strategy allows a robot to collect information that is useful for its tasks. Most existing solutions to this active sensing problem choose the direction that maximally reduces the uncertainty in a single state variable. In more complex problem domains, however, uncertainties exist in multiple state variables, and they affect the performance of the robot in different ways. The robot thus needs more sophisticated sensing strategies to decide which uncertainties to reduce and to make the correct trade-offs. In this work, we apply a least-squares reinforcement learning method to this problem. We implemented and tested the learning approach in the RoboCup domain, where the robot attempts to reach a ball and accurately kick it into the goal. We present experimental results suggesting that our approach learns highly effective sensing strategies.
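The core of a least-squares reinforcement learning method is fitting value-function weights in closed form from sampled transitions, rather than by incremental gradient updates. The sketch below illustrates this with LSTD on a toy two-state chain under a fixed policy; the toy MDP, one-hot features, and all names here are illustrative assumptions, not the paper's actual sensing domain or implementation.

```python
import numpy as np

# Toy fixed-policy chain (assumed example): state 0 -> state 1 (reward 0),
# state 1 -> state 0 (reward 1); one-hot features, discount gamma = 0.9.
gamma = 0.9
samples = [
    (np.array([1.0, 0.0]), 0.0, np.array([0.0, 1.0])),  # (phi(s), r, phi(s'))
    (np.array([0.0, 1.0]), 1.0, np.array([1.0, 0.0])),
]

# LSTD accumulates A = sum phi (phi - gamma * phi')^T and b = sum phi * r,
# then solves A w = b once, in closed form, for the value weights.
d = 2
A = np.zeros((d, d))
b = np.zeros(d)
for phi, r, phi_next in samples:
    A += np.outer(phi, phi - gamma * phi_next)
    b += phi * r

w = np.linalg.solve(A, b)
print(w)  # w approximates [V(0), V(1)] under the fixed policy
```

With one-hot features this recovers the exact values V(0) = 0.9/0.19 and V(1) = 1/0.19 of the chain. A policy-iteration variant (as in LSPI) would apply the same closed-form solve to state-action features and then improve the policy greedily.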
