An architecture for a resilient cloud computing infrastructure

This paper proposes an architecture for a resilient cloud computing infrastructure that provably maintains cloud functionality against persistent successful corruptions of cloud nodes. The architecture is composed of a self-healing software mechanism for the entire cloud, as well as hardware-assisted regeneration of compromised (or faulty) nodes from a pristine state. Such an architecture aims to secure critical distributed cloud computations well beyond the current state of the art by seamlessly tolerating a continuous rate of successful corruptions up to a certain corruption-rate limit; e.g., up to 30% of all cloud nodes may be corrupted within a tunable window of time. The proposed architecture achieves these properties through a principled separation of distributed task supervision from the computation of user-defined jobs. Task supervision and end-user communication are performed by a new software mechanism called the Control Operations Plane (COP), which builds a trustworthy, resilient, and self-healing cloud computing infrastructure out of the underlying untrustworthy and faulty hosts. The COP leverages provably secure cryptographic protocols that are efficient and robust in the presence of many corrupted participants. Such a cloud unobtrusively refreshes itself by restoring COP nodes from a pristine state at regular intervals.
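The corruption-rate limit and periodic refresh described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's mechanism: the class name `ToyCloud`, its fields, and the choice to restore all nodes at once at the end of each window are hypothetical simplifications, and no cryptographic protocol is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCloud:
    """Toy model (hypothetical): n nodes, each either pristine or corrupted."""
    n: int
    limit_percent: int = 30      # assumed corruption-rate limit (the 30% example)
    window: int = 10             # assumed tunable window length, in ticks
    corrupted: set = field(default_factory=set)
    tick: int = 0

    def corrupt(self, node: int) -> None:
        """The adversary successfully compromises one node."""
        self.corrupted.add(node)

    def within_limit(self) -> bool:
        """True while the corrupted fraction does not exceed the limit.

        Integer arithmetic avoids floating-point issues at the boundary.
        """
        return 100 * len(self.corrupted) <= self.limit_percent * self.n

    def step(self) -> None:
        """Advance one tick; at the end of each window, restore every node.

        Clearing the set stands in for regenerating nodes from a pristine state.
        """
        self.tick += 1
        if self.tick % self.window == 0:
            self.corrupted.clear()


cloud = ToyCloud(n=10)
for node in (0, 1, 2):
    cloud.corrupt(node)
ok_at_limit = cloud.within_limit()      # 3 of 10 nodes: at the limit, still tolerated

cloud.corrupt(3)
ok_over_limit = cloud.within_limit()    # 4 of 10 nodes: limit exceeded

for _ in range(cloud.window):
    cloud.step()
ok_after_refresh = cloud.within_limit() # all nodes restored at the window boundary
```

In this model the guarantee holds only if the adversary corrupts at most 30% of the nodes between two consecutive refreshes, which mirrors the "within a tunable window of time" condition in the abstract.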
