A Review of Machine Learning and Meta-heuristic Methods for Scheduling Parallel Computing Systems

Optimizing software execution on parallel computing systems requires considering many parameters at run-time. Determining the optimal parameter set for a given execution context is a complex task, and researchers have proposed various approaches based on heuristic search or machine learning to address it. In this paper, we undertake a systematic literature review to aggregate, analyze, and classify existing software optimization methods for parallel computing systems, focusing on approaches that use machine learning or meta-heuristics for scheduling. We also discuss challenges and future research directions. The results of this study may help readers better understand state-of-the-art techniques that use machine learning and meta-heuristics to deal with the complexity of scheduling parallel computing systems, identify the limitations of existing approaches, and pinpoint areas for improvement.
