Paper information - Learning and relearning of target decision strategies in continuous coordinated cleaning tasks with shallow coordination

Learning and relearning of target decision strategies in continuous coordinated cleaning tasks with shallow coordination

We propose a method by which agents autonomously learn target decision strategies for coordination in the continuous cleaning domain. With ongoing advances in computer and sensor technologies, we can expect robot applications that cover large areas and often require coordinated or cooperative activity by multiple robots. In this paper, we focus on cleaning tasks performed by multiple robots, or by agents, which are the programs that control the robots. We assume situations in which agents do not directly exchange deep and complicated internal information and reasoning results, such as plans, strategies, and long-term targets, as sophisticated coordination would require. Instead, they exchange only superficial information, such as the locations of other agents (obtained using the deployed equipment), for shallow coordination, and each agent individually learns appropriate strategies by observing how much dirt or dust has been vacuumed up in the multi-agent environment. We first discuss a preliminary method that improves coordinated activities by having agents autonomously learn to select the cleaning strategies that determine which targets to move to and clean. Although this method improved cleaning efficiency, we observed that performance degraded if agents continued to learn strategies, because autonomous learning led too many agents to select the same strategy (over-selection). In addition, the preliminary method assumed that information about which regions of the environment easily became dirty was given in advance. We therefore propose a method that extends the preliminary method with (1) environmental learning to identify which places are likely to be dirty and (2) autonomous relearning, in which agents self-monitor the amount of vacuumed dirt, to prevent strategies from being over-selected. We experimentally evaluated the proposed method by comparing its performance with that of agents using a single strategy and with that of the preliminary method.
The experimental results revealed that the proposed method enabled agents to select target decision strategies and, if necessary, to abandon their current strategies from their own perspectives, resulting in appropriate combinations of multiple strategies. We also found that the agents effectively learned where dirt accumulated through environmental learning.
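
The abstract does not give the update rules, so the following Python sketch is only an illustration of the three ideas it names: selecting a strategy from the observed amount of vacuumed dirt, relearning when self-monitored performance drops, and estimating how quickly a place gets dirty. Everything specific here is an assumption rather than the authors' method: the class `StrategyLearner`, the function `update_dirt_rate`, the epsilon-greedy selection rule, the running-average updates, and the parameters `epsilon`, `alpha`, `drop_ratio`, and `beta`.

```python
import random


class StrategyLearner:
    """Hypothetical per-agent learner over target decision strategies."""

    def __init__(self, strategies, epsilon=0.1, alpha=0.2, drop_ratio=0.5, seed=0):
        self.strategies = list(strategies)
        self.epsilon = epsilon        # exploration probability (assumed)
        self.alpha = alpha            # learning rate for the running average (assumed)
        self.drop_ratio = drop_ratio  # relearning threshold (assumed)
        self.rng = random.Random(seed)
        self.relearn()

    def relearn(self):
        """Discard learned values so that strategies are evaluated afresh."""
        self.values = {s: 0.0 for s in self.strategies}
        self.best_seen = 0.0

    def choose(self):
        """Epsilon-greedy choice; ties go to the earliest listed strategy."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.strategies)
        return max(self.strategies, key=lambda s: self.values[s])

    def update(self, strategy, vacuumed):
        """Move the strategy's value toward the dirt vacuumed in the last period."""
        old = self.values[strategy]
        self.values[strategy] = old + self.alpha * (vacuumed - old)
        self.best_seen = max(self.best_seen, self.values[strategy])

    def needs_relearning(self, strategy):
        """Self-monitoring: has this strategy fallen well below the best value seen?"""
        return (self.best_seen > 0.0
                and self.values[strategy] < self.drop_ratio * self.best_seen)


def update_dirt_rate(rate, vacuumed, elapsed, beta=0.1):
    """Running estimate of dirt accumulated per time step at one location.

    `vacuumed` is the dirt found on this visit and `elapsed` is the number of
    time steps since the previous visit (both hypothetical inputs).
    """
    if elapsed <= 0:
        return rate
    return (1.0 - beta) * rate + beta * (vacuumed / elapsed)
```

In this sketch each agent keeps its own learner and never exchanges learned values with other agents, in keeping with the shallow coordination assumed in the abstract. An agent would call `choose()` at the start of a period, `update()` with the dirt it vacuumed, and `relearn()` whenever `needs_relearning()` reports a sustained drop.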
