Emergent Robot Differentiation for Distributed Multi-Robot Task Allocation

We present a distributed mechanism for automatically allocating tasks to robots in a manner sensitive to each robot’s performance level, without hand-coding these levels in advance. This mechanism is an important part of improving multi-robot task allocation (MRTA) in systems where communication is restricted or where the complexity of the group dynamics makes it necessary to make allocation decisions locally. The general mechanism is demonstrated as an improvement on our previously published task allocation through vacancy chains (TAVC) algorithm for distributed MRTA. The TAVC algorithm uses individual reinforcement learning of task utilities and relies on the specializing abilities of the members of the group to produce dedicated optimal allocations. Through experiments in a realistic simulator, we evaluate the improved algorithm by comparing it to random allocation. We conclude that using softmax action selection functions on task utility values makes algorithms responsive to different performance levels in a group of heterogeneous robots.
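As an illustration of the softmax action selection mentioned above, the following is a minimal sketch rather than the algorithm as implemented in the paper. It assumes each robot holds a list of learned task utility values and picks a task with probability proportional to the exponential of its utility. The function names, the `temperature` parameter, and the example utilities are assumptions introduced here for illustration.

```python
import math
import random


def softmax_probabilities(utilities, temperature=1.0):
    """Map task utility estimates to selection probabilities.

    `temperature` is an assumed tuning parameter: high values make the
    choice close to uniform, low values make it close to greedy.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [u / temperature for u in utilities]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def select_task(utilities, temperature=1.0, rng=random):
    """Sample a task index according to the softmax probabilities."""
    probs = softmax_probabilities(utilities, temperature)
    return rng.choices(range(len(utilities)), weights=probs, k=1)[0]


# Hypothetical utilities for two tasks, as estimated by a single robot.
task_utilities = [0.4, 1.2]
chosen = select_task(task_utilities, temperature=0.5)
```

Because selection is probabilistic rather than greedy, a robot whose utility estimates for two tasks are close keeps sampling both, while a robot with a clear difference commits mostly to one task. This is one plausible way for robots with different performance levels to end up with different allocations.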
