Escape Capsule: Explicit State Is Robust and Scalable

Software is modular, and so is its run-time state. We argue that allowing individual layers of the software stack to store isolated run-time state cripples a system's ability to scale or to respond to failures. Given the strong desire to build elastic and highly available applications for the cloud, we propose Slice, an abstraction that lets applications declare the appropriate granularities of scale-oriented state and lets each layer contribute its layer-specific data to those containers. Slices can be transparently migrated and replicated between application instances, which simplifies the design of elastic and highly available systems while retaining the modularity of modern software.
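As a rough illustration of the idea, the sketch below models a slice as a container keyed by one unit of scale (here, a client session), in which each software layer keeps its own namespaced state. The abstract does not specify an API, so every name here (`Slice`, `Instance`, `layer_state`, `migrate`, `replicate`) is hypothetical, and `pickle` merely stands in for whatever transfer mechanism a real system would use.

```python
import pickle
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Slice:
    """Hypothetical container for all state tied to one unit of scale."""
    key: str
    # Maps a layer name to that layer's private state dictionary.
    layers: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def layer_state(self, layer: str) -> Dict[str, Any]:
        # Each layer writes only into its own namespace, so layers stay modular.
        return self.layers.setdefault(layer, {})

    def serialize(self) -> bytes:
        return pickle.dumps((self.key, self.layers))

    @staticmethod
    def deserialize(blob: bytes) -> "Slice":
        key, layers = pickle.loads(blob)
        return Slice(key, layers)


class Instance:
    """Hypothetical application instance hosting a set of slices."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.slices: Dict[str, Slice] = {}

    def get_slice(self, key: str) -> Slice:
        if key not in self.slices:
            self.slices[key] = Slice(key)
        return self.slices[key]

    def migrate(self, key: str, target: "Instance") -> None:
        # Move the slice: remove it locally, then install a copy at the target.
        blob = self.slices.pop(key).serialize()
        target.slices[key] = Slice.deserialize(blob)

    def replicate(self, key: str, target: "Instance") -> None:
        # Copy the slice: the source keeps its own copy.
        target.slices[key] = Slice.deserialize(self.slices[key].serialize())


# Two layers contribute state to the same slice, which then moves as one unit.
a, b = Instance("a"), Instance("b")
s = a.get_slice("session-42")
s.layer_state("tcp")["seq"] = 1001
s.layer_state("http")["path"] = "/cart"
a.migrate("session-42", b)
```

Because all per-session state from every layer lives in one container, moving or copying the session is a single operation on the slice rather than a coordinated effort across layers. This sketch ignores concurrency and consistency between replicas, which a real design would have to address.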
