Reinforcement based mobile robot path planning with improved dynamic window approach in unknown environment

Mobile robot path planning in an unknown environment is a fundamental and challenging problem in robotics. The dynamic window approach (DWA) is an effective method for local path planning; however, some of its evaluation functions are inadequate and it lacks an algorithm for choosing the weights of these functions, which makes it highly dependent on a global reference path and prone to failure in unknown environments. In this paper, an improved DWA based on Q-learning is proposed. First, the original evaluation functions are modified and extended with two new evaluation functions to enhance global navigation performance. Then, balancing effectiveness against speed, we define the state space, action space, and reward function of the adopted Q-learning algorithm for robot motion planning. After that, the parameters of the proposed DWA are learned adaptively by Q-learning, yielding a trained agent that adapts to the unknown environment. Finally, a series of comparative simulations shows that the proposed method achieves higher navigation efficiency and a higher success rate in complex unknown environments. The proposed method is also validated in experiments on an XQ-4 Pro robot to verify its navigation capability in both static and dynamic environments.
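
To illustrate how Q-learning can drive the adaptive choice of DWA evaluation-function weights described above, the following minimal sketch shows a tabular Q-learning agent whose actions are candidate weight vectors. The state discretization, the candidate weight sets, the reward shaping, and all identifiers (choose_action, q_update, etc.) are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

# Hypothetical sketch: tabular Q-learning selects a weight vector for the
# DWA evaluation function. The concrete numbers below are assumptions.

N_STATES = 64          # e.g. discretized (goal distance, goal heading, nearest-obstacle distance)
ACTIONS = [            # candidate weight vectors for the DWA evaluation terms
    (0.8, 0.1, 0.1, 0.0, 0.0),
    (0.4, 0.4, 0.1, 0.05, 0.05),
    (0.2, 0.2, 0.2, 0.2, 0.2),
]
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate

Q = np.zeros((N_STATES, len(ACTIONS)))

def choose_action(state, rng):
    """Epsilon-greedy selection of a DWA weight vector for the current state."""
    if rng.random() < EPS:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(Q[state]))

def q_update(s, a, r, s_next):
    """One-step Q-learning update; r could reward progress toward the goal
    and penalize collisions (reward design assumed here)."""
    td_target = r + GAMMA * np.max(Q[s_next])
    Q[s, a] += ALPHA * (td_target - Q[s, a])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = 0
    a = choose_action(s, rng)
    print("chosen weight vector:", ACTIONS[a])
```

In this reading, the trained Q-table plays the role of the agent obtained in the paper: at each control step the robot observes its discretized state and looks up the weight vector that the agent has learned to prefer, instead of using fixed hand-tuned weights.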
