A File Checkpointing Approach Based on Virtual File Operations

Checkpointing and rollback recovery of Unix processes is the underlying technique for fault tolerance in distributed systems and parallel environments. Saving and restoring the state and contents of a process's active files is an important aspect of checkpointing and rollback recovery. A new file checkpointing approach called virtual file operations (VFO) is presented. VFO buffers all write operations issued after a checkpoint until the next one, so that the operations between two checkpoints are atomic as a whole. By checkpointing step by step and managing files as blocks, this approach achieves lower overhead than other approaches.
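
The buffering idea can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `VirtualFile`, its methods, and the 4096-byte `BLOCK_SIZE` are all hypothetical, chosen only to show writes being held in per-block buffers between checkpoints, applied to the real file at the next checkpoint, and discarded on rollback.

```python
import os

BLOCK_SIZE = 4096  # hypothetical block size, not taken from the paper


class VirtualFile:
    """Illustrative only: buffers writes per block until the next checkpoint."""

    def __init__(self, path):
        self.path = path
        self.dirty = {}  # block index -> bytearray modified since last checkpoint
        self.size = os.path.getsize(path)  # logical size, including buffered writes
        self.committed_size = self.size    # size as of the last checkpoint

    def _load_block(self, idx):
        # Prefer the buffered copy; otherwise read the block from disk,
        # zero-padded to a full block.
        if idx in self.dirty:
            return self.dirty[idx]
        with open(self.path, "rb") as f:
            f.seek(idx * BLOCK_SIZE)
            data = f.read(BLOCK_SIZE)
        return bytearray(data.ljust(BLOCK_SIZE, b"\0"))

    def write(self, offset, data):
        # Apply the write to in-memory blocks only; the file on disk is untouched.
        pos = 0
        while pos < len(data):
            idx, start = divmod(offset + pos, BLOCK_SIZE)
            n = min(BLOCK_SIZE - start, len(data) - pos)
            block = self._load_block(idx)
            block[start:start + n] = data[pos:pos + n]
            self.dirty[idx] = block
            pos += n
        self.size = max(self.size, offset + len(data))

    def read(self, offset, length):
        # Reads see buffered writes, so the process observes its own updates.
        length = max(0, min(length, self.size - offset))
        out = bytearray()
        pos = 0
        while pos < length:
            idx, start = divmod(offset + pos, BLOCK_SIZE)
            n = min(BLOCK_SIZE - start, length - pos)
            out += self._load_block(idx)[start:start + n]
            pos += n
        return bytes(out)

    def checkpoint(self):
        # Flush only the blocks modified since the previous checkpoint.
        with open(self.path, "r+b") as f:
            for idx in sorted(self.dirty):
                f.seek(idx * BLOCK_SIZE)
                f.write(self.dirty[idx])
            f.truncate(self.size)
        self.dirty.clear()
        self.committed_size = self.size

    def rollback(self):
        # Drop all buffered writes; the file stays as of the last checkpoint.
        self.dirty.clear()
        self.size = self.committed_size
```

In this sketch only the blocks touched since the last checkpoint are kept and flushed, which is one way block-level management can reduce overhead compared with copying whole files. The flush in `checkpoint` is not itself crash-safe; a real system would need additional measures to make that step atomic.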