Task Allocation through Vacancy Chains: Action Selection in Multi-Robot Learning

We present an adaptive multi-robot task allocation algorithm based on vacancy chains, a resource distribution process common in animal and human societies. The algorithm uses individual reinforcement learning of task utilities and relies on the ability of group members to specialize in order to promote dedicated, optimal allocation patterns. We demonstrate, through experiments in simulation, the difference between the allocation patterns that emerge when robots use greedy and softmax action selection functions. We conclude that using softmax functions makes the vacancy chain algorithm sensitive to different levels of ability in a group of heterogeneous robots, as well as to the effects of underlying group dynamics such as interference and synergy.
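The contrast between the two action selection functions can be illustrated with a minimal sketch. It assumes each robot holds a list of learned task utility estimates. The abstract does not give the exact functional forms, so the function names, the Boltzmann form of softmax, and the `temperature` parameter below are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def greedy_choice(utilities):
    """Return the index of the task with the highest estimated utility."""
    return max(range(len(utilities)), key=lambda i: utilities[i])


def softmax_probabilities(utilities, temperature=1.0):
    """Map utility estimates to selection probabilities (Boltzmann form).

    `temperature` is an assumed parameter: higher values flatten the
    distribution, lower values approach greedy selection.
    """
    peak = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp((u - peak) / temperature) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_choice(utilities, temperature=1.0, rng=random):
    """Sample a task index in proportion to its softmax probability."""
    probs = softmax_probabilities(utilities, temperature)
    return rng.choices(range(len(utilities)), weights=probs, k=1)[0]
```

Under greedy selection, only the ranking of the utility estimates affects which task is chosen. Under softmax, the selection probabilities also depend on how far apart the estimates are, which is one plausible reading of why softmax selection responds to differences in ability and to group dynamics.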