Evolving Cut-Off Mechanisms and Other Work-Stealing Parameters for Parallel Programs

Optimizing parallel programs is a complex task because many different parameters interfere with one another. Work-stealing runtimes, which dynamically balance load across processor cores, are no exception. This work explores the automatic configuration of the following runtime parameters: the dynamic granularity control (cut-off) algorithm, the granularity control cache, the work-stealing algorithm, the lazy binary splitting parameter, the maximum queue size, and the unparking interval. Program performance is highly sensitive to the granularity control algorithm, which can itself be a combination of several granularity algorithms. We address two search-based problems: finding a globally efficient work-stealing configuration, and finding the best configuration for an individual program. For both problems, we propose the use of a Genetic Algorithm (GA). The GA genotype can represent combinations of up to three cut-off algorithms, as well as the other work-stealing parameters.
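As a rough illustration of the search described above, the following Python sketch encodes a genotype with one to three cut-off algorithms plus the remaining runtime parameters, and evolves it with tournament selection, uniform crossover, mutation, and elitism. It is a minimal sketch under stated assumptions, not the encoding or operators used in this work: the option names, value ranges, operator choices, and the stand-in fitness function are all hypothetical. In a real setting the fitness would be the measured execution time of one benchmark (per-program search) or an aggregate over a benchmark suite (global search).

```python
import random
from dataclasses import dataclass, fields, replace

# Hypothetical option names and ranges, chosen only for illustration.
CUTOFFS = ["max_tasks", "max_level", "load_based", "surplus", "stack_size"]
STEALERS = ["random_victim", "sequential_victim"]


@dataclass(frozen=True)
class Genotype:
    cutoffs: tuple            # one to three cut-off algorithm names
    use_cache: bool           # granularity control cache on/off
    stealer: str              # work-stealing algorithm
    lbs_threshold: int        # lazy binary splitting parameter
    max_queue_size: int
    unpark_interval_ms: int


def random_genotype(rng):
    k = rng.randint(1, 3)
    return Genotype(
        cutoffs=tuple(rng.sample(CUTOFFS, k)),
        use_cache=rng.random() < 0.5,
        stealer=rng.choice(STEALERS),
        lbs_threshold=rng.randint(1, 64),
        max_queue_size=rng.randint(1, 128),
        unpark_interval_ms=rng.randint(1, 200),
    )


def mutate(g, rng):
    # Resample a single gene from a freshly drawn random genotype.
    name = rng.choice([f.name for f in fields(g)])
    return replace(g, **{name: getattr(random_genotype(rng), name)})


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one of the two parents.
    return Genotype(**{f.name: getattr(rng.choice((a, b)), f.name)
                       for f in fields(a)})


def tournament(pop, fitness, rng, k=3):
    # Lower fitness is better (it stands in for execution time).
    return min(rng.sample(pop, k), key=fitness)


def evolve(fitness, rng, pop_size=20, generations=30, mutation_rate=0.2):
    pop = [random_genotype(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        elite = min(pop, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, fitness, rng),
                              tournament(pop, fitness, rng), rng)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            children.append(child)
        pop = children
    best = min(pop, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(g):
    # Stand-in for a measured run time; NOT a model of any real runtime.
    return (abs(g.max_queue_size - 16)
            + abs(g.unpark_interval_ms - 50) / 10
            + 2 * len(g.cutoffs)
            + (0 if g.use_cache else 1))
```

Calling `evolve(toy_fitness, random.Random(0))` returns the best genotype found and the per-generation best fitness. Because the elite individual is always carried over, that history never gets worse from one generation to the next. With real measurements the fitness is noisy, so repeated runs per individual would likely be needed.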
