Scheduling Tree-Shaped Task Graphs to Minimize Memory and Makespan

This paper investigates the execution of tree-shaped task graphs on multiple processors. Each edge of such a tree represents a large I/O file. A task can be executed only if all of its input and output files fit into memory, and a file can be removed from memory only after it has been consumed. Such trees arise, for instance, in the multifrontal method of sparse matrix factorization. The maximum amount of memory needed depends on the execution order of the tasks. With one processor, the objective of the tree traversal is to minimize the required memory. This problem has been well studied, and optimal polynomial-time algorithms have been proposed. Here, we extend the problem to multiple processors, which is of obvious interest in the application area of matrix factorization. Multiple processors bring the additional objective of minimizing the time needed to traverse the tree, i.e., the makespan. Not surprisingly, this problem proves to be much harder than the sequential one. We study its computational complexity and provide an inapproximability result, even for unit-weight trees. Several heuristics are proposed, each with a different optimization focus, and they are analyzed in an extensive experimental evaluation using realistic trees.
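To illustrate why the required memory depends on the execution order, the following Python sketch simulates a sequential traversal under a simplified model. It is not the paper's algorithm. It assumes that each task writes a single output file, read later by its parent, that a task's input files are freed as soon as the task finishes, and that no other data occupies memory. The function `peak_memory` and the names `PARENT`, `OUT_SIZE`, `ORDER_A`, and `ORDER_B` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def peak_memory(parent, out_size, order):
    """Return the peak memory of a sequential traversal of an in-tree.

    parent:   maps each task to its parent task (None for the root).
    out_size: maps each task to the size of the file it sends to its parent.
    order:    sequence of tasks; every task must appear after its children.
    """
    children = defaultdict(list)
    for task, par in parent.items():
        if par is not None:
            children[par].append(task)

    done = set()
    resident = 0  # total size of files currently held in memory
    peak = 0
    for task in order:
        if any(child not in done for child in children[task]):
            raise ValueError(f"task {task} scheduled before one of its children")
        inputs = sum(out_size[child] for child in children[task])
        # While the task runs, its inputs (already resident) and its
        # output file must all be in memory at the same time.
        during = resident + out_size[task]
        peak = max(peak, during)
        # After the task completes, its input files can be discarded.
        resident = during - inputs
        done.add(task)
    return peak


# Root 0 has children 1 and 2; task 3 feeds task 1, task 4 feeds task 2.
PARENT = {0: None, 1: 0, 2: 0, 3: 1, 4: 2}
OUT_SIZE = {0: 0, 1: 1, 2: 4, 3: 5, 4: 2}

ORDER_A = [3, 1, 4, 2, 0]  # finish the subtree of task 1 first
ORDER_B = [4, 2, 3, 1, 0]  # finish the subtree of task 2 first

print(peak_memory(PARENT, OUT_SIZE, ORDER_A))  # prints 7
print(peak_memory(PARENT, OUT_SIZE, ORDER_B))  # prints 10
```

Both orders respect the precedence constraints, yet the second keeps the size-4 output of task 2 in memory while the size-5 file of task 3 is produced and consumed, which raises the peak from 7 to 10 in this toy instance.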

[1]  Rizos Sakellariou,et al.  Scheduling Data-IntensiveWorkflows onto Storage-Constrained Distributed Resources , 2007, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07).

[2]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[3]  Joseph W. H. Liu,et al.  On the storage requirement in the out-of-core multifrontal method for sparse factorization , 1986, TOMS.

[4]  Robert E. Tarjan,et al.  The pebbling problem is complete in polynomial space , 1979, SIAM J. Comput..

[5]  Thomas Rauber,et al.  Memory-optimal evaluation of expression trees involving large objects , 1999, Comput. Lang. Syst. Struct..

[6]  Jeffrey D. Ullman,et al.  The Generation of Optimal Code for Arithmetic Expressions , 1970, JACM.

[7]  Jan Karel Lenstra,et al.  Complexity of machine scheduling problems , 1975 .

[8]  Gary L. Miller,et al.  Geometric mesh partitioning: implementation and experiments , 1995, Proceedings of 9th International Parallel Processing Symposium.

[9]  Jean-Yves L'Excellent,et al.  Memory-based scheduling for a parallel multifrontal solver , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..

[10]  Ravi Sethi,et al.  Complete register allocation problems , 1973, SIAM J. Comput..

[11]  Frank D. Anger,et al.  Scheduling Precedence Graphs in Systems with Interprocessor Communication Times , 1989, SIAM J. Comput..

[12]  T. C. Hu Parallel Sequencing and Assembly Line Problems , 1961 .

[13]  Ronald L. Graham,et al.  Bounds for certain multiprocessing anomalies , 1966 .

[14]  Yves Robert,et al.  On Optimal Tree Traversals for Sparse Matrix Factorization , 2011, IPDPS.

[15]  Joseph W. H. Liu An application of generalized tree pebbling to sparse matrix factorization , 1987 .

[16]  W. H. Liu,et al.  AN APPLICATION OF GENERALIZED TREE PEBBLING TO SPARSE MATRIX FACTORIZATION , 2022 .