Self-generating method of behavioral evaluation for reinforcement learning among multiple coordinated robots

In this paper, we present a novel algorithm by which each robot generates its own behavioral evaluation, which it uses to evaluate the behaviors it selects in a reinforcement learning system. The behavioral evaluation is composed of rewards and self-evaluated standards. Rewards are given by the operator as one means for the robots to understand the purpose of the tasks, and self-evaluated standards are obtained from the results of executing behaviors. Using the proposed method, each robot generates an evaluation suited to its own situation, so the robots can create cooperative behaviors even when the number of robots or tasks changes dynamically. We performed simulation experiments to study the effectiveness of the proposed method. The results confirm that each robot can generate evaluations that lead to cooperative behaviors, without any change to the algorithm during the experiments.
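
The abstract does not give the form of the evaluation, so the following is only a minimal sketch of how an evaluation built from an operator reward and a self-evaluated standard might drive a learning update. Everything in it is an assumption made for illustration: tabular Q-learning stands in for the reinforcement learning system, the self-evaluated standard is modeled as a running average of past execution outcomes, and the class name `SelfEvaluatingAgent`, the parameter `beta`, and the additive combination in `evaluate` are hypothetical rather than taken from the paper.

```python
import random
from collections import defaultdict


class SelfEvaluatingAgent:
    """Tabular Q-learning agent with a self-generated evaluation signal.

    Hypothetical formulation: the evaluation is the operator reward plus the
    amount by which the latest outcome exceeds the agent's own standard.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, beta=0.2, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.beta = beta        # smoothing rate of the standard (assumed)
        self.q = defaultdict(float)
        self.standard = 0.0     # running average of past outcomes (assumed)
        self.rng = random.Random(seed)

    def select_action(self, state):
        # Epsilon-greedy choice over the agent's own action set.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def evaluate(self, reward, outcome):
        # Combine the operator reward with the self-evaluated standard,
        # then move the standard toward the latest outcome.
        evaluation = reward + (outcome - self.standard)
        self.standard += self.beta * (outcome - self.standard)
        return evaluation

    def update(self, state, action, reward, outcome, next_state):
        # Standard Q-learning update, with the evaluation in place of the raw reward.
        evaluation = self.evaluate(reward, outcome)
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        key = (state, action)
        self.q[key] += self.alpha * (evaluation + self.gamma * best_next - self.q[key])
        return evaluation


agent = SelfEvaluatingAgent(["wait", "go"])
first_evaluation = agent.update("s0", "go", reward=1.0, outcome=0.5, next_state="s1")
```

Because each agent in this sketch keeps its own standard, adding or removing agents would not require changing the update rule, which is the property the abstract attributes to the proposed method.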
