Davram: Distributed Virtual Memory in User Space

One of the most challenging problems in modern distributed big data systems lies in their memory management: these systems preallocate a fixed amount of memory before applications start. In the best case, where more memory can be acquired, users have to reconfigure the deployment and recompute many intermediate results. If no more memory is available, users are forced to manually partition the job into smaller tasks, incurring both development and performance overhead. This paper presents a user-level utility for managing memory in distributed systems: the Distributed and Autonomous Virtual RAM (Davram). Davram efficiently swaps data between memory and disk in a distributed system without requiring user intervention or application awareness.
