Reinforcement Learning-SLAM for finding a minimum-cost path and mapping

In this work, we propose integrating two of the most widely used approaches for implementing autonomous navigation systems: reinforcement learning for path finding, and SLAM (Simultaneous Localization and Mapping) algorithms for localization and mapping of the environment. The two approaches are combined to address the problem of how a robot should explore an unknown, dynamic environment while collecting perception features both to localize itself and to obtain cues about the traversability cost of an area. Thus, while the robot explores and maps with a SLAM algorithm, it also learns to associate perception features with costs and actions in order to find optimal paths from a starting point to a goal point in dynamic environments.
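To make the coupling concrete, the following Python sketch shows one way a tabular Q-learning update could be driven by a map that is revealed incrementally during exploration. This is not the authors' implementation: it assumes a grid world, a toy occupancy map standing in for a full SLAM back end, and epsilon-greedy Q-learning; names such as OccupancyMap, traversal_cost, and q_learning are hypothetical.

# Minimal sketch: coupling Q-learning with an incrementally revealed occupancy map.
# Illustrative only; OccupancyMap and traversal_cost are hypothetical stand-ins
# for the map and cost estimates a SLAM algorithm would provide.
import random
from collections import defaultdict

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

class OccupancyMap:
    """Toy stand-in for a SLAM map: cells become known as the robot observes them."""
    def __init__(self, width, height, obstacles):
        self.width, self.height = width, height
        self.obstacles = set(obstacles)
        self.known = set()  # cells observed so far

    def observe(self, cell):
        # Reveal the cell and its 4-neighbourhood, as a crude sensor model.
        x, y = cell
        for dx, dy in [(0, 0)] + ACTIONS:
            self.known.add((x + dx, y + dy))

    def traversal_cost(self, cell):
        # Unknown cells get a pessimistic cost; obstacles are impassable.
        if cell in self.obstacles:
            return float("inf")
        return 1.0 if cell in self.known else 5.0

def q_learning(env_map, start, goal, episodes=500, alpha=0.1, gamma=0.95, eps=0.2):
    Q = defaultdict(float)
    for _ in range(episodes):
        s = start
        for _ in range(200):                      # step limit per episode
            env_map.observe(s)                    # mapping reveals nearby cells
            if random.random() < eps:             # epsilon-greedy exploration
                a = random.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[(s, i)])
            nxt = (s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1])
            # Keep the agent on the map and off obstacles.
            if not (0 <= nxt[0] < env_map.width and 0 <= nxt[1] < env_map.height) \
                    or env_map.traversal_cost(nxt) == float("inf"):
                nxt, r = s, -10.0                 # penalise blocked moves
            elif nxt == goal:
                r = 100.0
            else:
                r = -env_map.traversal_cost(nxt)  # map cost shapes the reward
            best_next = max(Q[(nxt, i)] for i in range(len(ACTIONS)))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = nxt
            if s == goal:
                break
    return Q

if __name__ == "__main__":
    m = OccupancyMap(10, 10, obstacles=[(3, y) for y in range(8)])
    Q = q_learning(m, start=(0, 0), goal=(9, 9))
    print("Learned value at start:", max(Q[((0, 0), i)] for i in range(4)))

In this sketch the reward is shaped by the map's traversal cost, so cells the mapping component has already confirmed as free are preferred over unexplored ones; a real system would replace OccupancyMap with the state estimate and feature costs produced by the SLAM filter.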
