Learning Scheduling Policies for Multi-Robot Coordination With Graph Attention Networks

Growing interest in integrating advanced robotics into manufacturing has renewed attention to real-time scheduling solutions for coordinating human-robot collaboration in this environment. Traditionally, the problem of scheduling agents to complete tasks under temporal and spatial constraints has been approached either with exact algorithms, which are computationally intractable for large-scale, dynamic coordination, or with approximate methods, which require domain experts to craft heuristics for each application. We seek to overcome the limitations of these conventional methods by developing a novel graph attention network-based scheduler that automatically learns features of scheduling problems in order to generate high-quality solutions. To learn effective policies for combinatorial optimization problems, we combine imitation learning, which makes use of expert demonstrations on small problems, with graph neural networks in a non-parametric framework. This allows fast, near-optimal scheduling of robot teams of various sizes while generalizing to large, unseen problems. Experimental results show that our network-based policy finds high-quality solutions for $\sim$90% of the test problems involving 2–5 robots and up to 100 tasks, significantly outperforming prior state-of-the-art approximate methods. These results are achieved at affordable computational cost and with up to 100× less computation time than exact solvers.
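To make the central building block concrete, the following is a minimal sketch of one single-head graph attention layer in the generic form introduced by Veličković et al. It is not the scheduler architecture itself: the toy graph, the feature values, the weight matrix `W`, the attention vector `a`, and all function names are illustrative assumptions, and in practice the weights would be learned (for example by imitating expert schedules) rather than fixed by hand.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W (rows of weights) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def attention_weights(h: Matrix, neighbors: List[List[int]],
                      W: Matrix, a: Vector, i: int) -> Vector:
    """Softmax-normalised attention of node i over its neighbours.

    Assumes every node has at least one neighbour (e.g. a self-loop).
    """
    zi = matvec(W, h[i])
    scores = []
    for j in neighbors[i]:
        zj = matvec(W, h[j])
        # e_ij = LeakyReLU(a^T [W h_i || W h_j]); `zi + zj` concatenates lists
        scores.append(leaky_relu(sum(ak * v for ak, v in zip(a, zi + zj))))
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_layer(h: Matrix, neighbors: List[List[int]],
              W: Matrix, a: Vector) -> Matrix:
    """New embedding of each node: attention-weighted sum of W h_j."""
    out = []
    for i in range(len(h)):
        alpha = attention_weights(h, neighbors, W, a, i)
        z = [matvec(W, h[j]) for j in neighbors[i]]
        out.append([sum(al * zj[k] for al, zj in zip(alpha, z))
                    for k in range(len(W))])
    return out


# Hypothetical toy graph: 3 nodes with 2 features each, self-loops included.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1], [1, 0, 2], [2, 1]]
W = [[0.5, -0.2], [0.1, 0.3]]   # illustrative 2x2 projection (not learned)
a = [0.3, -0.1, 0.2, 0.4]       # illustrative attention vector, length 2 * 2

out = gat_layer(features, neighbors, W, a)
```

Because the attention weights are computed per edge from shared parameters `W` and `a`, the same layer applies to graphs of any size. This property is what makes attention-based graph networks a natural fit for scheduling problems whose numbers of robots and tasks vary.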
