Scheduling divisible loads in multiple rounds

This dissertation studies the class of Divisible Load (DL) applications, which comprises many important applications in science and engineering. A key issue in achieving high performance for these applications is how to schedule them effectively onto distributed computing platforms. This work focuses on solving the Divisible Load Scheduling (DLS) problem and makes the following contributions. We propose new platform models that are more realistic than those used in previous DLS work. We present the first proof that the DLS problem is NP-complete. We develop novel multi-round DLS algorithms that are more widely applicable than previously proposed algorithms and that, as shown in simulation, are more effective. We address the issue of performance prediction errors, which are expected in real-world settings, and present a novel DLS algorithm that is robust to such errors. Finally, we report on our practical implementation of a DL scheduler, APSTDV, that can be used for scheduling and deploying real applications on real distributed platforms. We demonstrate the ease of use and performance of APSTDV via a number of experiments and case studies.