A New Approach to Proactive Recovery

Recent papers propose asynchronous protocols that can tolerate any number of faults over the lifetime of the system, provided that at most f nodes become faulty during a given window of time. This is achieved through so-called proactive recovery, which consists of periodically rejuvenating the system. Proactive recovery in asynchronous systems, though a major breakthrough, has some limitations, which we identified in recent work. In particular, proactive recovery protocols typically require stronger environment assumptions (e.g., synchrony, security) than the rest of the system. In this paper, we take this into consideration and propose a new approach to proactive recovery that is based on an architecturally hybrid distributed system model. In this context, we present a secure real-time distributed component, the Proactive Recovery Wormhole (PRW), that aims to execute proactive recovery protocols in a more dependable and effective way. We also briefly show how PRW can be used in practice to enhance the dependability of an existing proactive-recovery-based system.
