Scheduling a computational dag on a parallel system with communication delays and replication of node execution

The authors consider the problem of optimally scheduling the subtasks of a computational task, modeled by a dag (directed acyclic graph), on a parallel system with identical processors. Execution of the subtasks (nodes) must satisfy precedence constraints, which are met through data exchanges among processors that introduce communication delays. The optimization criterion is minimization of the processing time. The authors assume that there is no restriction on the number of processors available and that a node may be replicated, i.e., executed on more than one processor. They prove that the optimal scheduling problem can be solved in polynomial time when the computational graph is a two-level dag. For a general dag, where the problem is NP-complete, they develop an algorithm that significantly reduces the search space compared with exhaustive search and can run very fast in many cases.
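
To make the scheduling model concrete, the following Python sketch checks a given schedule and returns its processing time (makespan). It is an illustration only, not the authors' algorithm. It assumes per-node execution times and per-edge communication delays that apply only when the producing and consuming copies run on different processors, with zero delay on the same processor. The function name `makespan`, the data layout, and the numbers in the example are hypothetical.

```python
from collections import defaultdict


def makespan(exec_time, comm_delay, schedule):
    """Return the makespan of `schedule`; raise ValueError if it is infeasible.

    exec_time:  {node: execution time}
    comm_delay: {(u, v): delay on edge u -> v when the copies used are on
                 different processors (assumed zero on the same processor)}
    schedule:   list of (node, processor, start_time); a node may appear
                more than once, which models replication
    """
    copies = defaultdict(list)   # node -> [(processor, finish_time)]
    by_proc = defaultdict(list)  # processor -> [(start_time, finish_time)]
    for node, proc, start in schedule:
        finish = start + exec_time[node]
        copies[node].append((proc, finish))
        by_proc[proc].append((start, finish))

    # Every node must be executed at least once.
    missing = set(exec_time) - set(copies)
    if missing:
        raise ValueError(f"nodes never executed: {sorted(missing)}")

    # A processor runs one node at a time.
    for proc, slots in by_proc.items():
        slots.sort()
        for (_, f1), (s2, _) in zip(slots, slots[1:]):
            if s2 < f1:
                raise ValueError(f"overlapping executions on processor {proc}")

    # Precedence: each copy of a node needs the data of every predecessor,
    # taken from whichever copy of that predecessor delivers it earliest.
    preds = defaultdict(list)
    for u, v in comm_delay:
        preds[v].append(u)
    for node, proc, start in schedule:
        for u in preds[node]:
            ready = min(
                f if q == proc else f + comm_delay[(u, node)]
                for q, f in copies[u]
            )
            if ready > start:
                raise ValueError(f"{node} on processor {proc} starts before data from {u} arrives")

    return max(f for slots in by_proc.values() for _, f in slots)


# Hypothetical two-level dag: root r feeds a and b; unit execution times,
# communication delay 2 on each edge.
exec_time = {"r": 1, "a": 1, "b": 1}
comm_delay = {("r", "a"): 2, ("r", "b"): 2}

single = [("r", 0, 0), ("a", 0, 1), ("b", 0, 2)]                  # one processor
split = [("r", 0, 0), ("a", 0, 1), ("b", 1, 3)]                   # b waits for r's data
replicated = [("r", 0, 0), ("r", 1, 0), ("a", 0, 1), ("b", 1, 1)]  # r runs on both

print(makespan(exec_time, comm_delay, single))      # 3
print(makespan(exec_time, comm_delay, split))       # 4
print(makespan(exec_time, comm_delay, replicated))  # 2
```

In this small example, replicating the root on both processors removes the communication delay from the critical path, so the replicated schedule finishes earlier than either schedule without replication.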