Memory-Aware List Scheduling for Hybrid Platforms

This paper provides memory-aware heuristics to schedule task graphs onto heterogeneous resources, such as a dual-memory cluster equipped with multicores and a dedicated accelerator (FPGA or GPU). Each task has a different processing time on each resource type. The optimization objective is to schedule the graph so as to minimize execution time, subject to the memory available on each resource type. In addition to ordering the tasks, we must also decide on which resource to execute each of them, given its computation requirement and the memory currently available on each resource. The major contributions of this paper are twofold: (i) the derivation of an intricate integer linear program formulation for this scheduling problem, and (ii) the design of memory-aware heuristics, which outperform the reference heuristics HEFT and MinMin on a wide variety of problem instances. The absolute performance of these heuristics is assessed on small graphs, with up to 30 tasks, by comparison with the solutions of the linear program.
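
To make the two decisions described above concrete (which ready task to start, and on which resource), the following is a minimal sketch of a greedy memory-aware list scheduler. It is not the heuristic proposed in the paper. It assumes a deliberately simplified model that the abstract does not specify: each task holds a fixed amount of memory on its resource only while it runs, there are no transfer costs between the two memories, and ready tasks are considered in name order. The function name, the dictionary layout, and the sample numbers are all hypothetical.

```python
import heapq

def memory_aware_list_schedule(tasks, preds, procs, mem_cap):
    """Greedy list scheduling under per-resource memory limits (illustrative only).

    tasks:   {name: {"time": {resource: duration}, "mem": units}}
    preds:   {name: [predecessor names]}
    procs:   {resource: number of processors}
    mem_cap: {resource: memory capacity}
    Returns  {name: (resource, start, finish)}.
    """
    succs = {t: [] for t in tasks}
    missing = {t: len(preds.get(t, [])) for t in tasks}
    for t, ps in preds.items():
        for p in ps:
            succs[p].append(t)

    ready = sorted(t for t in tasks if missing[t] == 0)
    free_procs, free_mem = dict(procs), dict(mem_cap)
    running = []   # min-heap of (finish time, task, resource)
    schedule = {}
    now = 0

    while ready or running:
        # Start every ready task that fits on some resource right now.
        for t in list(ready):
            fits = [r for r in procs
                    if free_procs[r] > 0 and free_mem[r] >= tasks[t]["mem"]]
            if not fits:
                continue  # no idle processor with enough free memory yet
            # Among feasible resources, pick the one that runs the task fastest.
            r = min(fits, key=lambda res: tasks[t]["time"][res])
            finish = now + tasks[t]["time"][r]
            free_procs[r] -= 1
            free_mem[r] -= tasks[t]["mem"]
            heapq.heappush(running, (finish, t, r))
            schedule[t] = (r, now, finish)
            ready.remove(t)

        if not running:
            raise ValueError("a ready task does not fit in any memory")

        # Advance time to the next completion and release its resources.
        now = running[0][0]
        while running and running[0][0] == now:
            _, t, r = heapq.heappop(running)
            free_procs[r] += 1
            free_mem[r] += tasks[t]["mem"]
            for s in succs[t]:
                missing[s] -= 1
                if missing[s] == 0:
                    ready.append(s)
    return schedule

# Hypothetical three-task graph: C depends on A and B.
tasks = {
    "A": {"time": {"cpu": 4, "gpu": 2}, "mem": 3},
    "B": {"time": {"cpu": 3, "gpu": 1}, "mem": 3},
    "C": {"time": {"cpu": 2, "gpu": 1}, "mem": 2},
}
preds = {"C": ["A", "B"]}
one_each = {"cpu": 1, "gpu": 1}

roomy = memory_aware_list_schedule(tasks, preds, one_each, {"cpu": 4, "gpu": 4})
tight = memory_aware_list_schedule(tasks, preds, one_each, {"cpu": 4, "gpu": 2})
```

With 4 memory units on each side, A runs on the accelerator while B runs on the multicore, and the graph completes at time 4. When the accelerator memory shrinks to 2 units, neither A nor B fits there, both are serialized on the multicore, and the graph completes at time 8. This illustrates why the mapping decision cannot be made from processing times alone.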
