Paper Information - Effects of Response Threshold Distribution on Dynamic Division of Labor in Decentralized Swarms

Effects of Response Threshold Distribution on Dynamic Division of Labor in Decentralized Swarms

In this paper, we investigate how the distribution of response threshold values affects the ability of decentralized swarms to dynamically achieve appropriate division of labor in response to changing task demands. Inter-agent variation of response thresholds is a common method for de-synchronizing decentralized agents, which can result in more effective division of labor. We present a systematic study of three different distributions that are relevant to natural and artificial swarms. We use each of these distributions to generate the agent response thresholds in a swarm and examine the accuracy and stability of the swarm’s performance on a collective control problem.

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

We investigate how the distribution of response threshold values affects the ability of decentralized swarms to dynamically achieve appropriate division of labor in response to changing task demands. We focus on problems in which task stimuli are globally available to all agents in the swarm. The response threshold method for task allocation is a commonly used approach for generating division of labor in decentralized swarms. It is a reactive method in which agents dynamically decide which task to respond to based on the task stimuli sensed at any given time.

Decentralized swarms are robust because the lack of a central controller means that there is no single point of failure. The lack of a central controller, however, also makes it more difficult to coordinate agents such that the group as a whole responds intelligently and efficiently to multiple task demands. For a multi-agent system (MAS) to be able to respond intelligently to different situations or states, the individual agents that make up the system must be able to respond differently to the same input state (Ashby 1958).

For problems in which task stimuli are sensed locally, such as deciding what resource to retrieve based on locally observed distributions (Jones and Mataric 2003; Lee and Kim 2017; Lerman et al. 2006), it is easier for decentralized agents to distribute themselves among different tasks because agents are likely to sense different stimuli at any given time and, thus, respond differently. For problems in which task stimuli are somewhat or completely global, such as deciding whether or not to forage based on the level of a common food store (Castello et al. 2013; Krieger and Billeter 2000), ensuring that all agents do not behave identically can be a challenge because all agents sense the same task stimuli. In such problems, variable agent response is generally accomplished in one of two ways: agents respond probabilistically to task stimuli (Bonabeau, Theraulaz, and Deneubourg 1996; Kalra and Martinoli 2006; Price and Tino 2004) or agents are assigned different thresholds for the same task (Campbell, Riggs, and Wu 2011; Krieger and Billeter 2000; Riggs and Wu 2012). Either of these approaches may be extended such that agent thresholds dynamically adapt over time (Castello et al. 2013; Theraulaz, Bonabeau, and Deneubourg 1998).

We are interested in the second approach, where agents are assigned different thresholds for the same task, for three reasons. First, variable thresholds make some agents more responsive to certain stimuli than other agents. If there is a cost for agents to switch tasks, variable thresholds can reduce such costs because the same (most responsive) agents for a given task are the first to respond and the most likely to remain on a task. While probabilistic response can produce variable agent behavior, qualitatively, all agents are still responding to any stimuli in the same way: with the same probabilistic response. Thus, probabilistic response alone provides no mechanism for specialization or for reducing task switching.
Second, understanding which threshold distributions are best for which situations in static threshold swarms will allow us to more effectively evaluate how well a swarm with dynamically adapted thresholds is performing. Third, while dynamically adapted thresholds theoretically allow a swarm to adjust its threshold distribution to whatever distribution is appropriate for the task demands at any given moment, previous work finds that once a dynamic system has adapted to a set of task demands, it can be difficult for the system to re-adapt to new demands (Kazakova and Wu 2018; Theraulaz, Bonabeau, and Deneubourg 1998). Dynamically adapted thresholds typically use a positive feedback loop that commonly takes a system to states that are difficult to subsequently escape. As a result, systems that use dynamically adapted thresholds may not be as adaptable as expected, and they may be biased by their first task. Meyer et al. (2015) provide an example problem where static variable thresholds are more effective than dynamically adapted thresholds.

Studies on both natural (Jones et al. 2004; Weidenmüller 2004) and artificial (Krieger and Billeter 2000; Riggs and Wu 2012) decentralized swarms have established that inter-agent variation of response thresholds can result in more stable and effective division of labor. To our knowledge, however, there is little work studying how different threshold distributions affect swarm behavior (Campbell, Riggs, and Wu 2011). The distribution of threshold values among agents determines the rate at which agents enter and leave the workforce, which can potentially impact the responsiveness and stability of the swarm. As a result, the choice of distribution to use in an artificial swarm may have a significant effect on how the swarm responds to changes in task demand.

We perform a comparative study on how three distributions of response thresholds affect swarm behavior on problems with both gradually and abruptly changing task demands.
We perform this study on a collective tracking problem which allows us to systematically define problems with gradually changing and abruptly changing task demands. Specific questions that we investigate are:

• Does the distribution of agent thresholds affect how well the system responds to task needs?
• Are different distributions better for different situations? Situations of interest include gradually changing versus abruptly changing task demand.

Our model

The testbed that we use is a collective tracking problem. The problem consists of a target that moves along various prescribed paths in a 2D space and a tracker that is controlled collectively by a swarm of decentralized non-communicating agents. In each timestep, the target moves a fixed distance in a direction that is specified by the selected path, and the agents in the swarm attempt to collectively move the tracker to match the target’s movement. Agents can choose one of four possible tasks (PUSH NORTH, PUSH EAST, PUSH SOUTH, PUSH WEST) or remain IDLE, and the actions of all agents in the swarm are aggregated to determine the tracker movement in that timestep. As a result, the swarm’s goal is to ensure that an appropriate number of agents are allocated to each task at any given time.

Each run consists of a fixed number of timesteps or moves. Over the course of a run, we measure the average and maximum distance between the target and tracker, the lengths of the paths each travels, and the number of times agents switch tasks.

We recognize that there are more efficient and effective ways to achieve tracking. We use this problem as our testbed because it defines a clear set of tasks on which agents must self-organize, and the target paths provide a systematic way to define and study dynamically changing task demands.

Target movement creates dynamic task demands

In each run, the target moves at a fixed speed of S_target units per timestep along a selected path.
The movement of the target and the resulting relative positions of the target and tracker determine the task demands perceived by the swarm in each timestep. For example, if the target is due east of the tracker, the swarm will detect a nonzero task demand for pushing east and zero demand for all other directions. If the target is northwest of the tracker, the swarm will detect nonzero task demands for pushing north and west and zero task demand for south and east. The distance between the target and tracker determines the magnitude of the task demands. A separate task demand value is calculated for each of the four directions by subtracting the tracker position from the target position. In the north and south directions, this is the difference between the target’s and tracker’s y-coordinate values; in the east and west directions, the x-coordinate values. Negative differences are set to zero.

In order to examine different task demand scenarios, we test on target paths with both gradually and abruptly changing task demands. The former are paths in which the target heading changes in small increments from one timestep to the next. These paths include four serpentine paths with increasing period and amplitude (s-curve10, s-curve20, s-curve30, s-curve40) and one path in which the target makes small random changes in its heading in each timestep (random). The latter are paths in which the target heading may change by a large amount. These paths include four sawtooth paths with increasing period and amplitude (zigzag10, zigzag20, zigzag30, zigzag40) and one path in which the target probabilistically changes its heading to a random new heading in each timestep (sharp).

Tracker movement

The maximum distance that the tracker can move in one timestep is S_max = Ratio × S_target, where Ratio ≥ 1.0. Thus, the tracker’s maximum speed is at least as fast as the target speed. Each agent has a unique threshold for each of the four tasks that it can choose.
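The task demand calculation and the tracker speed limit described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `Point`, `task_demands`, and `max_tracker_step` are hypothetical, and the code assumes that north is the positive y direction and east is the positive x direction, which the text does not state explicitly. It also assumes that the demands for the south and west tasks are the same coordinate differences taken with the opposite sign.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Point:
    """A position in the 2D space (hypothetical helper type)."""
    x: float
    y: float


def task_demands(target: Point, tracker: Point) -> Dict[str, float]:
    """Return one demand value per push direction.

    Assumption: +y is north and +x is east. Each demand is a coordinate
    difference between target and tracker, with negative values set to zero.
    """
    dx = target.x - tracker.x  # east-west difference
    dy = target.y - tracker.y  # north-south difference
    return {
        "PUSH NORTH": max(dy, 0.0),
        "PUSH SOUTH": max(-dy, 0.0),
        "PUSH EAST": max(dx, 0.0),
        "PUSH WEST": max(-dx, 0.0),
    }


def max_tracker_step(s_target: float, ratio: float) -> float:
    """Maximum tracker distance per timestep: S_max = Ratio x S_target."""
    if ratio < 1.0:
        raise ValueError("Ratio must be at least 1.0")
    return ratio * s_target


if __name__ == "__main__":
    # Target northwest of the tracker: only north and west demands are nonzero.
    print(task_demands(Point(-3.0, 4.0), Point(0.0, 0.0)))
    print(max_tracker_step(s_target=2.0, ratio=1.5))
```

Under these assumptions, at most two of the four demands are nonzero in any timestep, one along each axis, which matches the due-east and northwest examples in the text.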
Thresholds are assigned at the start of each run and remain fixed throughout the run. When the task demand exceeds an agent’s threshold for that task, the agent will consider acting on t

Annie S. Wu | H. David Mathias | Anthony Hevia | Joseph P. Giordano

[1] S. Frank. The common patterns of nature, 2009, Journal of Evolutionary Biology.

[2] W. Ashby, et al. Requisite Variety and Its Implications for the Control of Complex Systems, 1991.

[3] G. Theraulaz, et al. Response threshold reinforcements and division of labour in insect societies, 1998, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[4] Michael J. B. Krieger, et al. The call of duty: Self-organised task allocation in a population of up to twelve mobile robots, 2000, Robotics Auton. Syst.

[5] Peter Tiño, et al. Evaluation of Adaptive Nature Inspired Task Allocation Against Alternate Decentralised Multiagent Strategies, 2004, PPSN.

[6] Maja J. Mataric, et al. Adaptive division of labor in large-scale minimalist multi-robot systems, 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).

[7] Nidhi Kalra, et al. Comparative Study of Market-Based and Threshold-Based Task Allocation, 2006, DARS.

[8] DaeEun Kim, et al. History-Based Response Threshold Model for Division of Labor in Multi-Agent Systems, 2017, Sensors.

[9] S. Graham, et al. Honey Bee Nest Thermoregulation: Diversity Promotes Stability, 2004, Science.

[10] G. Robinson. Regulation of division of labor in insect societies, 1992, Annual Review of Entomology.

[11] Annie S. Wu, et al. Variation as an element in multi-agent control for target tracking, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[12] Annie S. Wu, et al. On the Impact of Variation on Self-Organizing Systems, 2011, 2011 IEEE Fifth International Conference on Self-Adaptive and Self-Organizing Systems.

[13] Annie S. Wu, et al. Specialization versus Re-Specialization: Effects of Hebbian Learning in a Dynamic Environment, 2018, FLAIRS.

[14] Rui Chen, et al. Collective Homeostasis and Time-resolved Models of Self-organised Task Allocation, 2016, BICT.

[15] Yutaka Nakamura, et al. Task Allocation for a robotic swarm based on an Adaptive Response Threshold Model, 2013, 2013 13th International Conference on Control, Automation and Systems (ICCAS 2013).

[16] Anja Weidenmüller, et al. The control of nest climate in bumblebee (Bombus terrestris) colonies: interindividual variability and self reinforcement in fanning response, 2004.

[17] Kristina Lerman, et al. Analysis of Dynamic Task Allocation in Multi-Robot Systems, 2006, Int. J. Robotics Res.