SPARE: Replicas on Hold

Despite numerous improvements in the development and maintenance of software, bugs and security holes exist in today's products, and malicious intrusions happen frequently. While this is a general problem, it applies in particular to web-based services. Byzantine fault-tolerant (BFT) replication and proactive recovery offer a powerful combination to tolerate and overcome these kinds of faults, thereby enabling long-term service provision. BFT replication is commonly associated with the overhead of 3f + 1 replicas to handle f faults. Using a trusted component, some previous systems were able to reduce the resource cost to 2f + 1 replicas. In general, adding support for proactive recovery further increases the resource demand. We believe this enormous resource demand is one of the key reasons why BFT replication is not commonly applied and is considered unsuitable for web-based services. In this paper we present SPARE, a cloud-aware approach that harnesses virtualization to reduce the resource demand of BFT replication and to provide efficient support for proactive recovery. In SPARE, we focus on the main source of software bugs and intrusions; that is, the services and their associated execution environments. This approach enables us to restrict replication and request execution to only f + 1 replicas in the fault-free case, while using virtualization to rapidly activate up to f additional replicas when timing violations or faults occur. To react instantly, we keep spare replicas in a paused state and update them periodically. In the fault-free case, these passive replicas require far fewer resources than active replicas and aid efficient proactive recovery.
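The replica counts named in the abstract can be compared with a small calculation. The following is a minimal sketch, not part of SPARE itself: the names `ReplicaBudget` and `replica_budgets` are hypothetical, and treating SPARE as f + 1 active plus f passive replicas assumes the maximum of "up to f" spare replicas is kept on hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaBudget:
    """Replica counts for one replication scheme (hypothetical helper type)."""
    scheme: str
    active: int   # replicas executing requests in the fault-free case
    passive: int  # paused spare replicas, activated only on demand

    @property
    def total(self) -> int:
        return self.active + self.passive


def replica_budgets(f: int) -> list[ReplicaBudget]:
    """Return the replica counts from the abstract for tolerating f faults."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return [
        ReplicaBudget("traditional BFT", 3 * f + 1, 0),
        ReplicaBudget("BFT with trusted component", 2 * f + 1, 0),
        # Assumption: all f spare replicas are provisioned in a paused state.
        ReplicaBudget("SPARE", f + 1, f),
    ]


if __name__ == "__main__":
    for budget in replica_budgets(1):
        print(f"{budget.scheme}: {budget.active} active, "
              f"{budget.passive} passive, {budget.total} total")
```

Under these assumptions, SPARE provisions as many replicas in total as a trusted-component system, but only f + 1 of them execute requests in the fault-free case.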
