On real-time quasi-durable checkpointing

This study investigates real-time checkpointing techniques for distributed process control applications, in which checkpointing and recovery operations must meet timing constraints such as process deadlines and plant state validity. We introduce the notion of quasi-durability, which allows tradeoffs between storage device reliability and the timing constraints on process control and recovery. Based on this notion, we study three protocols for real-time quasi-durable checkpointing and recovery. For each protocol, we analyze its recoverability and provide necessary and sufficient conditions for a set of devices to be feasible for checkpointing and recovery.