An Advanced Q-Learning (AQL) Approach for Path Planning and Obstacle Avoidance of a Mobile Robot

This paper aims to improve the performance of the well-known Q-learning algorithm, a robust machine-learning technique, to facilitate path planning in an environment. Existing variants such as the Classical Q-learning (CQL) and Improved Q-learning (IQL) algorithms deal with obstacle-free environments, whereas in a real environment an agent encounters obstacles frequently. This paper therefore considers an environment containing a number of obstacles and introduces a new parameter, the 'immediate penalty' incurred by colliding with an obstacle. The proposed technique further replaces the scalar 'immediate reward' function with an 'effective immediate reward' function composed of two fuzzy parameters, 'immediate reward' and 'immediate penalty'. Fuzzifying these two parameters not only improves the learning process but also strikes a balance between exploration and exploitation, one of the most challenging problems in reinforcement learning. The proposed algorithm stores the Q-value of the best possible action at each state and saves significant path-planning time by suggesting the best action to take at each state to move to the next. As a result, the agent becomes more intelligent, planning a collision-free path that avoids obstacles from a distance. The algorithm is validated through computer simulation in a maze-like environment and in real time on the Khepera II platform. Analysis reveals that the Q-table obtained by the proposed Advanced Q-learning (AQL) algorithm, when used for the path-planning application of mobile robots, outperforms classical and improved Q-learning.
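Since the abstract only outlines the method, the sketch below illustrates how such an 'effective immediate reward' could drive a Q-learning update. The membership functions `fuzzy_reward` and `fuzzy_penalty`, the subtraction-based combination of reward and penalty, and the distance parameters are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of an AQL-style update, assuming a simple fuzzification
# of reward and penalty; the paper's exact membership functions and
# combination rule may differ.

def fuzzy_reward(dist_to_goal: float, max_dist: float) -> float:
    """Assumed membership: reward rises as the agent approaches the goal."""
    return max(0.0, 1.0 - dist_to_goal / max_dist)

def fuzzy_penalty(dist_to_obstacle: float, safe_dist: float) -> float:
    """Assumed membership: penalty rises as the agent nears an obstacle."""
    return max(0.0, 1.0 - dist_to_obstacle / safe_dist)

def effective_reward(dist_to_goal: float, dist_to_obstacle: float,
                     max_dist: float, safe_dist: float) -> float:
    """Assumed combination: effective immediate reward = reward - penalty."""
    return (fuzzy_reward(dist_to_goal, max_dist)
            - fuzzy_penalty(dist_to_obstacle, safe_dist))

def aql_update(Q, state, action, next_state, r_eff,
               alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Standard Q-learning update, driven by the effective reward."""
    best_next = max(Q[next_state].values())
    Q[state][action] += alpha * (r_eff + gamma * best_next - Q[state][action])

# Tiny usage example on a two-state Q-table.
Q = {"s0": {"up": 0.0, "right": 0.0}, "s1": {"up": 0.0, "right": 0.0}}
r_eff = effective_reward(dist_to_goal=3.0, dist_to_obstacle=1.0,
                         max_dist=10.0, safe_dist=2.0)
aql_update(Q, "s0", "right", "s1", r_eff)
```

Under these assumptions, states near the goal and far from obstacles yield a high effective reward, so the learned Q-table steers the agent away from obstacles well before a collision, consistent with the behaviour the abstract describes.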
