Human-like gradual learning of a Q-learning based Light exploring robot

Machine learning has been an important research topic for many years. Reinforcement learning is a form of learning without explicit supervision that uses state-action combinations and rewards to interact with the environment. Q-learning, a subdivision of reinforcement learning, is now a well-accepted algorithm for robot (machine) learning. Human beings, however, learn in different ways. One such way is gradual learning, which is mostly continuous in nature. This paper combines gradual learning with Q-learning for light exploration. The first Q-table is randomly generated, but subsequent Q-tables are interdependent and gradually refined. The initial learning time may be high, but the final learning time is lower, which indicates the efficiency of this learning technique. In addition, the convergence of the Q-learning is established.
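
The abstract does not specify the environment, the reward scheme, or the parameter values, so the following is only a minimal sketch of the idea under stated assumptions: a hypothetical one-dimensional corridor with the light in the last cell, the standard tabular Q-learning update, and illustrative values for the learning rate, discount factor, and exploration rate. It shows a randomly generated first Q-table being carried over and refined across successive learning stages, rather than being reset each time. All names (`step`, `run_stage`, `gradual_learning`) are invented for this illustration and are not taken from the paper.

```python
import random

# Hypothetical setup (not from the paper): a corridor of cells with the
# light in the last cell; the robot starts each episode in cell 0.
N_STATES = 6
ACTIONS = (-1, +1)                       # move left / move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1    # illustrative values only


def step(state, action_idx):
    """Move along the corridor; reward 1 only when the light cell is reached."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def random_q_table(rng):
    """First Q-table: one random value per (state, action) pair."""
    return [[rng.random() for _ in ACTIONS] for _ in range(N_STATES)]


def run_stage(q, rng, episodes=50, max_steps=200):
    """Refine q in place with epsilon-greedy Q-learning; return steps used."""
    total_steps = 0
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward, done = step(state, a)
            # Standard Q-learning target; no bootstrapping from the goal cell.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
            total_steps += 1
            if done:
                break
    return total_steps


def gradual_learning(stages=3, seed=0):
    """Start from a random Q-table and reuse it from one stage to the next."""
    rng = random.Random(seed)
    q = random_q_table(rng)
    history = []
    for _ in range(stages):
        # The table learned in the previous stage is the starting point here.
        history.append(run_stage(q, rng))
    return q, history


if __name__ == "__main__":
    q, history = gradual_learning()
    print("steps used per stage:", history)
```

In this sketch, `history` records how many steps each stage consumed, which is one simple way to compare the learning effort of the first stage with that of later stages that start from an already refined table.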
