Proactive resilience through architectural hybridization

In recent work, we have shown that it is not possible to dependably build any type of f fault/intrusion-tolerant distributed system under the asynchronous model. This result follows from the fact that, in an asynchronous environment, one cannot guarantee that the system terminates its execution before more than the assumed number of faults occur.

Some systems have resorted to proactive recovery to address this problem, attempting to ensure that no more than f faults ever occur: nodes are periodically rejuvenated to remove the effects of faults or malicious attacks. However, asynchronous systems with proactive recovery suffer from the same problem. In fact, proactive recovery protocols usually require stronger assumptions (e.g., synchrony, security) than the system that is being proactively recovered.

To resolve this contradiction, we work with a hybrid distributed system model. We propose proactive resilience as a new and more resilient approach to proactive recovery, based on architectural hybridization: proactive recovery functions are encapsulated in architectural devices that meet the required stronger assumptions and have a well-defined interface with the recovered system.

We present the Proactive Resilience Model (PRM) and describe a design methodology under the PRM. This methodology is a way of building systems that are guaranteed not to suffer more than the assumed number of faults, and we use it to derive a distributed intrusion-tolerant secret sharing system.
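To make the idea of periodically rejuvenating a secret sharing system concrete, the following is a minimal sketch of (f+1)-out-of-n threshold secret sharing with a share refresh step. It is an illustration under stated assumptions, not the system derived under the PRM: the function names (`split`, `refresh`, `reconstruct`), the field modulus `PRIME`, and the use of a single trusted party to generate the refresh polynomial are all hypothetical simplifications. In a real proactive scheme the refresh is computed jointly by the nodes and must complete within a bounded period, which is exactly the kind of timing assumption the abstract argues an asynchronous system cannot provide by itself.

```python
import secrets

# Field modulus: the Mersenne prime 2^127 - 1 (an illustrative choice).
PRIME = 2**127 - 1


def _eval_poly(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x, mod PRIME."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % PRIME
    return result


def split(secret, n, f):
    """Split `secret` into n shares so that any f+1 shares reconstruct it."""
    # Random degree-f polynomial whose constant term is the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(f)]
    return {i: _eval_poly(coeffs, i) for i in range(1, n + 1)}


def refresh(shares, f):
    """Re-randomize all shares without changing the shared secret.

    Adds a random degree-f polynomial with zero constant term to every share.
    Simplification: a single party picks the polynomial here.
    """
    zero_coeffs = [0] + [secrets.randbelow(PRIME) for _ in range(f)]
    return {i: (s + _eval_poly(zero_coeffs, i)) % PRIME for i, s in shares.items()}


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, y_i in shares.items():
        num, den = 1, 1
        for j in shares:
            if j != i:
                num = (num * (-j)) % PRIME
                den = (den * (i - j)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + y_i * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    f, n = 2, 5
    old = split(123456789, n, f)
    new = refresh(old, f)
    subset = {i: new[i] for i in (1, 3, 5)}  # any f+1 refreshed shares
    print(reconstruct(subset) == 123456789)
```

After a refresh, shares from the previous period are useless when combined with current ones, so an adversary has to compromise more than f nodes within a single refresh period. The sketch says nothing about how to guarantee that refreshes actually happen on time.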
