CRAK: Linux Checkpoint/Restart As a Kernel Module

Process checkpoint/restart is a useful technology for process migration, load balancing, crash recovery, transaction rollback, job control, and many other purposes. Although process migration has not yet been widely used and is not widely available in commercial systems, the growing shift of computing facilities from supercomputers to networked workstations and distributed systems is increasing the importance of, and demand for, migration technologies. In this paper, we describe the design and implementation of CRAK, a transparent checkpoint/restart package for Linux. CRAK provides transparent migration of networked Linux applications and computing environments without modifying, recompiling, or relinking applications or the operating system. CRAK is the first system for Unix/Linux that provides transparent checkpoint/restart with both of the following properties: (1) it does not require any modifications to existing operating system or application code, and (2) it supports migrating network sockets. Prototype implementations are available for the Linux 2.2 and Linux 2.4 kernels.