Recovering from process failures in the time warp mechanism

A recovery procedure for distributed systems using the time warp control mechanism is described. Time warp is an optimistic execution technique in which synchronization is achieved using rollback. The recovery procedure is a protocol that exploits the redundancy the time warp mechanism already maintains in order to implement process rollback. Thus, unlike many other recovery procedures, the recovery protocol incurs little additional bookkeeping overhead. An informal proof of the correctness of the recovery procedure for a single process failure is presented. The protocol is then extended to be resilient to multiple process failures.
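
To make the rollback idea concrete, the following is a minimal sketch of a single time warp process, not the recovery protocol itself. It assumes a toy order-dependent state update and saves a snapshot after every message; the names `Message`, `TimeWarpProcess`, `receive`, and `rollbacks` are hypothetical, and antimessages and output queues are omitted. The saved snapshots and the retained input messages are the kind of redundancy the abstract refers to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    recv_time: int  # virtual time at which the message should be processed
    payload: int


class TimeWarpProcess:
    """Toy logical process (hypothetical): executes optimistically, rolls back on stragglers."""

    def __init__(self):
        self.lvt = 0                 # local virtual time
        self.state = 0               # illustrative application state
        self.processed = []          # input messages handled so far, in timestamp order
        self.snapshots = [(0, 0)]    # saved (virtual time, state) pairs
        self.rollbacks = 0

    def _execute(self, msg):
        # Order-dependent update (an assumption), so wrong ordering gives a wrong state.
        self.state = self.state * 2 + msg.payload
        self.lvt = msg.recv_time
        self.processed.append(msg)
        self.snapshots.append((self.lvt, self.state))

    def receive(self, msg):
        if msg.recv_time >= self.lvt:
            # Optimistic case: the message is not in the local past.
            self._execute(msg)
            return
        # Straggler: its timestamp lies in the local past, so roll back.
        self.rollbacks += 1
        redo = [m for m in self.processed if m.recv_time > msg.recv_time]
        self.processed = [m for m in self.processed if m.recv_time <= msg.recv_time]
        # Discard snapshots taken after the straggler's timestamp.
        while self.snapshots[-1][0] > msg.recv_time:
            self.snapshots.pop()
        self.lvt, self.state = self.snapshots[-1]
        # Re-execute the straggler, then the undone messages, in timestamp order.
        for m in [msg] + redo:
            self._execute(m)
```

For example, delivering messages with timestamps 10, 30, and then 20 triggers one rollback, after which the process holds the same state it would have reached had the messages arrived in timestamp order.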