Elasticizing Linux via Joint Disaggregation of Memory and Computation

In this paper, we propose a set of operating system primitives that provide a scaling abstraction to cloud applications, transparently enabling them to scale execution across multiple physical nodes as their resource needs exceed what a single machine can provide. These primitives include stretch, to extend the address space of an application onto a new node; push and pull, to move pages between nodes as needed for execution and optimization; and jump, to transfer execution between nodes in a very lightweight manner. This joint disaggregation of memory and computation enables transparent elasticity, improving an application's performance by capitalizing on the underlying dynamic infrastructure without requiring an application rewrite. We have implemented these primitives in a Linux 2.6 kernel, collectively calling the extended operating system ElasticOS. Our evaluation across a variety of algorithms shows up to 10x improvement in performance over standard network swap.
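
Since the primitives live inside the kernel and are invisible to applications, a minimal user-space sketch of their intended semantics may help fix ideas. This is only an illustration under assumptions: the node_t struct and the stretch, push_page, pull_page, and jump functions are hypothetical names invented here, not the paper's actual kernel interfaces, and the "nodes" are in-process structs standing in for remote machines.

```c
/*
 * Hedged, illustrative model of the four primitives described above.
 * All names and signatures are hypothetical; the real mechanisms are
 * kernel-internal and transparent to the application.
 */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096
#define MAX_PAGES 8

/* A "node" holds a slice of the application's address space. */
typedef struct {
    int  id;
    char pages[MAX_PAGES][PAGE_SIZE]; /* backing storage for resident pages */
    int  resident[MAX_PAGES];         /* 1 if the page is resident here     */
} node_t;

/* stretch: extend the address space onto a new node (no pages move yet). */
static void stretch(node_t *target, int id)
{
    target->id = id;
    memset(target->resident, 0, sizeof(target->resident));
    printf("stretch: address space now spans node %d\n", id);
}

/* push: move a page away from the node currently executing the process. */
static void push_page(node_t *src, node_t *dst, int pfn)
{
    memcpy(dst->pages[pfn], src->pages[pfn], PAGE_SIZE);
    src->resident[pfn] = 0;
    dst->resident[pfn] = 1;
    printf("push: page %d moved node %d -> node %d\n", pfn, src->id, dst->id);
}

/* pull: fetch a remote page on demand, e.g. in response to a page fault. */
static void pull_page(node_t *dst, node_t *src, int pfn)
{
    push_page(src, dst, pfn); /* same mechanics, opposite direction */
    printf("pull: page %d now resident on node %d\n", pfn, dst->id);
}

/* jump: transfer execution to the node that already holds the working set,
 * instead of dragging pages to the current execution site. */
static int jump(node_t *from, node_t *to)
{
    printf("jump: execution moves node %d -> node %d\n", from->id, to->id);
    return to->id;
}

int main(void)
{
    node_t a = { .id = 0 }, b;
    int exec_node = a.id;

    a.resident[0] = 1;        /* page 0 starts out on node 0          */
    stretch(&b, 1);           /* grow the address space onto node 1   */
    push_page(&a, &b, 0);     /* offload a cold page                  */
    pull_page(&a, &b, 0);     /* fault it back in on demand           */
    push_page(&a, &b, 0);
    exec_node = jump(&a, &b); /* follow the data rather than the page */
    printf("now executing on node %d\n", exec_node);
    return 0;
}
```

The point the sketch tries to capture is the trade-off the abstract describes: push and pull move data toward the execution site, while jump moves execution toward the data, which is how the joint disaggregation can outperform pure network swap.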
