List-Scheduling versus Cluster-Scheduling

In scheduling theory and parallel computing practice, programs are often represented as directed acyclic graphs. Finding a makespan-minimising schedule for such a graph on a given number of homogeneous processors ($P|prec,c_{ij}|C_{\max}$) is an NP-hard optimisation problem. Among the many proposed heuristics, the two dominant approaches are list-scheduling and cluster-scheduling (based on clustering), where clustering at its core targets an unlimited number of processors. Because these approaches are heuristic, many experimental comparisons exist. However, the overwhelming majority of them compare algorithms within, but not across, the two categories. Hence it is not clear how cluster-scheduling, for a limited number of processors, performs relative to list-scheduling, or how list-scheduling, for an unlimited number of processors, performs against clustering. This study addresses these open questions by comparing a large set of representative algorithms from the two approaches in an extensive experimental evaluation. The algorithms are discussed and studied in a modular manner, breaking them down into components. Some of the included algorithms are previously unpublished combinations of these techniques. This approach also makes it possible to study the separate merit of techniques such as task insertion or lookahead. The results show that simple low-complexity algorithms are surprisingly competitive and that more sophisticated algorithms only exhibit their strengths under certain conditions.
