A Machine Learning Method for Improving Task Allocation in Distributed Multi-Robot Transportation

Machine learning is a means of automatically generating solutions that can outperform those hand-coded by human programmers. We present a general behavior-based algorithm that uses reinforcement learning to improve the spatio-temporal organization of a homogeneous group of robots. In this algorithm, each robot applies learning at the level of individual behavior selection. We demonstrate how interactions within the group affect individual learning in a way that produces group-level effects, such as lane formation and specialization, and improves the group's performance. We also present a model of multi-robot task allocation as resource distribution through vacancy chains, a distribution method common in human and animal societies, and an algorithm for multi-robot task allocation based on that model. The model explains and predicts the task allocation achieved by our algorithm and highlights its limitations. We present experimental results that validate our model and show that our algorithm outperforms pre-programmed solutions. Finally, we present an extension of our algorithm that makes it applicable to heterogeneous groups of robots.
