An optimal upper bound on the minimal completion time in distributed supercomputing

We first consider an MIMD multiprocessor configuration with <italic>n</italic> processors. A parallel program consisting of <italic>n</italic> processes is executed on this system, with one process per processor. The program terminates when all processes have completed. Due to synchronizations, processes may be blocked waiting for events in other processes. Associated with the program is a parallel profile vector <italic>v̄</italic>; entry <italic>i</italic> (1≤<italic>i</italic>≤<italic>n</italic>) of this vector gives the percentage of the total execution time during which <italic>i</italic> processes are executing. We then consider a distributed MIMD supercomputer with <italic>k</italic> clusters, each containing <italic>u</italic> processors. The same parallel program, consisting of <italic>n</italic> processes, is executed on this system. Each process can only be executed by processors in the same cluster. Finding a schedule with minimal completion time in this case is NP-hard. We are interested in the gain of using <italic>n</italic> processors compared to using <italic>k</italic> clusters of <italic>u</italic> processors each. The gain is defined as the ratio between the minimal completion time using processor clusters and the completion time using a schedule with one process per processor. We present the optimal upper bound for this ratio as an analytical expression in <italic>n, v̄, k</italic> and <italic>u</italic>. We also demonstrate how this result can be used when evaluating heuristic scheduling algorithms.
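To make the parallel profile vector concrete, the following minimal Python sketch computes it from a sampled execution trace. The trace format (a list giving the number of executing processes in each equal-length time slot) and the function name `profile_vector` are assumptions made for illustration only; they are not taken from the paper. The sketch also assumes that at least one process is executing in every slot, and it returns fractions of the total time rather than percentages.

```python
from collections import Counter
from fractions import Fraction


def profile_vector(active_counts, n):
    """Return [v_1, ..., v_n], where v_i is the fraction of time slots
    in which exactly i processes are executing.

    active_counts: hypothetical trace, one entry per equal-length time
    slot, giving the number of executing (non-blocked) processes.
    n: number of processes (and processors) in the program.
    """
    if not active_counts:
        raise ValueError("trace must contain at least one time slot")
    if any(c < 1 or c > n for c in active_counts):
        raise ValueError("each slot must have between 1 and n executing processes")

    tally = Counter(active_counts)   # slots per level of parallelism
    total = len(active_counts)       # total execution time in slots
    return [Fraction(tally[i], total) for i in range(1, n + 1)]


# Illustrative trace for n = 3 processes over 8 time slots.
example = profile_vector([3, 3, 1, 2, 3, 1, 1, 3], n=3)
```

The entries of the returned vector sum to 1, since every time slot is counted under exactly one level of parallelism.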