Reinforcement Learning Algorithms in Global Path Planning for Mobile Robot

This paper investigates two approaches to global path planning for mobile robots based on the Q-Learning and Sarsa reinforcement learning algorithms. The study examines different parameter adjustments of the two algorithms that allow them to learn faster. Implementing the two algorithms revealed differences in learning time and in how each builds a path that avoids obstacles and reaches the destination point. Analysis of the obtained results made it possible to select optimal parameters of the considered algorithms for the tested environments. Experiments were performed in virtual environments in which the algorithms learned which actions to choose in order to maximize the payoff and reach the goal while avoiding obstacles.
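To make the comparison concrete, the sketch below contrasts the two tabular update rules on a toy grid world. The grid layout, reward values, hyperparameters (ALPHA, GAMMA, EPSILON), and helper names (step, epsilon_greedy, train) are illustrative assumptions, not taken from the paper; the fixed point is the update rule itself, with Sarsa bootstrapping from the action the policy actually takes next (on-policy) and Q-Learning bootstrapping from the greedy action (off-policy).

```python
import random
from collections import defaultdict

# Toy grid world (hypothetical layout, not from the paper): 0 = free cell, 1 = obstacle.
GRID = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
GOAL = (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# Assumed hyperparameters for illustration only.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1


def step(state, action):
    """Apply a move; hitting a wall or obstacle keeps the agent in place with a penalty."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(GRID) and 0 <= c < len(GRID[0])) or GRID[r][c] == 1:
        return state, -5.0, False
    if (r, c) == GOAL:
        return (r, c), 100.0, True   # payoff for reaching the destination
    return (r, c), -1.0, False       # small step cost encourages short paths


def epsilon_greedy(q, state):
    """Explore with probability EPSILON, otherwise pick the currently best action."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q[(state, a)])


def train(use_sarsa, episodes=500, start=(0, 0)):
    """Tabular Sarsa (on-policy) or Q-Learning (off-policy), selected by use_sarsa."""
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = start, False
        action = epsilon_greedy(q, state)
        while not done:
            next_state, reward, done = step(state, ACTIONS[action])
            next_action = epsilon_greedy(q, next_state)
            if done:
                target = reward
            elif use_sarsa:
                # Sarsa: bootstrap from the action the policy will actually take next.
                target = reward + GAMMA * q[(next_state, next_action)]
            else:
                # Q-Learning: bootstrap from the greedy action, regardless of the policy.
                target = reward + GAMMA * max(q[(next_state, a)] for a in range(len(ACTIONS)))
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state, action = next_state, next_action
    return q


q_table = train(use_sarsa=False)   # Q-Learning; pass use_sarsa=True for Sarsa
```

The two learners differ only in the bootstrap target, which is what gives Sarsa its on-policy character and Q-Learning its off-policy character.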
