Reinforcement learning-based mobile robot navigation

In recent decades, reinforcement learning (RL) has been widely used in research fields ranging from psychology to computer science. The infeasibility of sampling every possibility in continuous-state problems and the absence of an explicit teacher make RL algorithms preferable to supervised learning in machine learning, and the optimal control problem has consequently become a popular research subject. In this study, a system is proposed to solve mobile robot navigation using the two most popular RL algorithms, Sarsa($\lambda$) and Q($\lambda$). The proposed system, developed in MATLAB, uses state and action sets defined in a novel way to increase performance. The system can guide the mobile robot to a desired goal while avoiding obstacles, with a high success rate in both simulated and real environments. It also makes it possible to observe the effects of the RL methods' initial parameters, e.g., $\lambda$, on learning, and to compare the performance of the Sarsa($\lambda$) and Q($\lambda$) algorithms.
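For readers unfamiliar with the two algorithms named above, the following is a minimal Python sketch of a tabular Sarsa($\lambda$) episode with replacing eligibility traces, the on-policy counterpart of Q($\lambda$). It is only an illustration of the general technique: the paper's MATLAB implementation, its novel state and action definitions, and its parameter values are not reproduced here, and the `env.reset()` / `env.step()` interface and helper names are assumptions made for this example.

```python
import numpy as np


def epsilon_greedy(Q, s, epsilon):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if np.random.rand() < epsilon:
        return np.random.randint(Q.shape[1])
    return int(np.argmax(Q[s]))


def sarsa_lambda_episode(env, Q, alpha=0.1, gamma=0.95, lam=0.9, epsilon=0.1):
    """Run one episode of tabular Sarsa(lambda) with replacing traces.

    Q is an (n_states, n_actions) array. `env` is a hypothetical environment
    exposing reset() -> state and step(action) -> (next_state, reward, done);
    it stands in for a discretized robot-navigation environment.
    """
    E = np.zeros_like(Q)                      # eligibility traces
    s = env.reset()
    a = epsilon_greedy(Q, s, epsilon)
    done = False
    while not done:
        s_next, r, done = env.step(a)
        a_next = epsilon_greedy(Q, s_next, epsilon)
        # TD error uses the action actually selected next (on-policy update)
        delta = r + gamma * Q[s_next, a_next] * (not done) - Q[s, a]
        E[s, a] = 1.0                         # replacing trace for the visited pair
        Q += alpha * delta * E                # credit all recently visited pairs
        E *= gamma * lam                      # decay traces by gamma * lambda
        s, a = s_next, a_next
    return Q
```

In a Q($\lambda$) variant, the TD target would instead use the greedy action at `s_next`, and the traces would typically be cut whenever an exploratory action is taken; the trace-decay parameter `lam` is the $\lambda$ whose effect on learning the abstract refers to.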
