Conditional Q-learning algorithm for path-planning of a mobile robot

In classical Q-learning, the Q-table is updated after every state transition of the agent, which is not always economical. This paper provides an alternative approach to Q-learning in which the Q-value of a grid cell is updated only until a Boolean variable Lock associated with that cell is set; once the Lock is set, further updates to the cell are skipped. The proposed algorithm thus avoids unnecessary updates to the Q-table. Complexity analysis reveals a significant saving in both the time- and space-complexity of the proposed algorithm with respect to classical Q-learning.
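The conditional update described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the 1-D grid world, the per-cell locking criterion (freeze a cell once its value stops changing), and all parameter values are assumptions introduced here for demonstration.

```python
import random

random.seed(0)

# Illustrative 1-D grid world (the paper addresses a 2-D grid, but the
# conditional-update idea is the same). Goal is the rightmost cell.
N_STATES = 6
ACTIONS = (-1, +1)            # move left / move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
lock = [False] * N_STATES     # one Lock flag per cell (assumed representation)

def step(s, a):
    """Deterministic transition; reward 100 on reaching the goal cell."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (100.0 if s2 == N_STATES - 1 else 0.0)

def choose(s):
    """Epsilon-greedy action selection."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

for _ in range(200):
    s = 0
    while s != N_STATES - 1:
        a = choose(s)
        s2, r = step(s, a)
        if not lock[s]:
            # Classical Q-update, performed only while the cell is unlocked.
            target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])
            # Assumed locking rule: freeze the cell once its value has
            # effectively stopped changing.
            if Q[(s, a)] > 0 and abs(target - Q[(s, a)]) < 1e-6:
                lock[s] = True
        s = s2
```

Once `lock[s]` is True, transitions through cell `s` incur no table update at all, which is the source of the time savings the abstract refers to.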
