Automatic selection of loop scheduling algorithms using reinforcement learning

This paper presents the design and implementation of a reinforcement learning agent that automatically selects appropriate loop scheduling algorithms for parallel loops embedded in time-stepping scientific applications executing on clusters. An application may contain a number of such loops, each with different load balancing requirements, and the characteristics of a loop may change as the application progresses. Following a model-free learning approach, the learning agent assigned to a loop selects the best scheduling algorithm for that loop from a library, and continues to do so throughout the lifetime of the application. The utility of the learning agent is demonstrated by its successful integration into the simulation of wave packets, an application arising from quantum mechanics. A statistical analysis using pairwise comparison of the mean running times of the simulation with and without the learning agent validates the effectiveness of the agent in improving the parallel performance of the simulation.
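The selection loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a stateless, epsilon-greedy variant of Q-learning in which the reward is the negative of the measured loop time. The class name `LoopSchedulerAgent`, its parameters, and the algorithm labels are all hypothetical.

```python
import random


class LoopSchedulerAgent:
    """Hypothetical per-loop agent (illustrative, not the paper's design).

    Keeps one value estimate per scheduling algorithm and updates it
    from the observed loop running time at each time step.
    """

    def __init__(self, algorithms, alpha=0.5, epsilon=0.1, seed=None):
        self.algorithms = list(algorithms)   # labels of library algorithms
        self.alpha = alpha                   # learning rate (assumed value)
        self.epsilon = epsilon               # exploration rate (assumed value)
        self.q = {}                          # value estimate per algorithm
        self.rng = random.Random(seed)

    def select(self):
        """Pick the algorithm to use for the next execution of the loop."""
        # Try every algorithm once before trusting any estimate.
        for name in self.algorithms:
            if name not in self.q:
                return name
        # Occasionally explore, since loop behaviour may drift over time.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.algorithms)
        # Otherwise exploit the algorithm with the best estimate so far.
        return max(self.algorithms, key=lambda name: self.q[name])

    def update(self, algorithm, loop_time):
        """Fold the measured loop time into the estimate for `algorithm`."""
        reward = -loop_time                  # shorter loops earn more reward
        if algorithm not in self.q:
            self.q[algorithm] = reward       # first observation
        else:
            self.q[algorithm] += self.alpha * (reward - self.q[algorithm])
```

In a time-stepping application, each parallel loop would hold its own agent, call `select()` before the loop runs, time the loop, and pass the result to `update()`. A constant learning rate weights recent measurements more heavily, which is one simple way to track loop characteristics that change as the application progresses.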
