Preemptable remote execution facilities for loosely-coupled distributed systems (migration, transparency, scheduling)

A loosely-coupled distributed system consisting of a cluster of workstations and server machines represents a large amount of computational power, much of which is frequently idle. Users would like to take advantage of this idle processing power by running one or more jobs in parallel on underutilized workstations. Using underutilized workstations as computation servers not only increases the processing power available to users, but also improves the utilization of the hardware base. However, this use must not compromise a workstation owner's claim to his machine: a user must be able to quickly reclaim his workstation to avoid interference with personal activities, which implies that "guest" programs must be removed within a few seconds. In addition, the use of remote machines as computation servers should not require programs to be written with special provisions for executing remotely. That is, remote execution should be preemptable and transparent. Moreover, it should be possible to migrate a guest program to another available workstation rather than simply terminate it. In this thesis, we study the key design and performance issues that affect preemptable remote execution in a loosely-coupled distributed system. Five major topics are addressed in our work: (1) provision of network-transparent execution environments for programs, (2) structuring of migration facilities such that they interfere minimally with the normal operation of the system, (3) elimination of residual dependencies that occur when a program migrates but leaves state information in machine-relative servers on the original machine, (4) provision of global scheduling facilities for finding idle or lightly loaded machines for remote execution and migration of programs, and (5) provision of fair access to global resources among the programs and users of a system.
In the process of addressing these topics, we delineate when remote execution facilities, with or without migration facilities, are useful and under what conditions they are easy (or difficult) to provide.