One-sided RDMA-Conscious Extendible Hashing for Disaggregated Memory

Memory disaggregation is a promising technique in datacenters with the benefit of improving resource utilization, failure isolation, and elasticity. Hashing indexes have been widely used to provide fast lookup services in distributed memory systems. However, traditional hashing indexes become inefficient for disaggregated memory since the computing power in the memory pool is too weak to execute complex index requests. To provide efficient indexing services in disaggregated memory scenarios, this paper proposes RACE hashing, a one-sided RDMA-Conscious Extendible hashing index with lock-free remote concurrency control and efficient remote resizing. RACE hashing enables all index operations to be efficiently executed by using only one-sided RDMA verbs without involving any compute resource in the memory pool. To support remote concurrent access with high performance, RACE hashing leverages a lock-free remote concurrency control scheme to enable different clients to concurrently operate the same hashing index in the memory pool in a lock-free manner. To resize the hash table with low overheads, RACE hashing leverages an extendible remote resizing scheme to reduce extra RDMA accesses caused by extendible resizing and allow concurrent request execution during resizing. Extensive experimental results demonstrate that RACE hashing outperforms state-of-the-art distributed in-memory hashing indexes by 1.4−13.7× in YCSB hybrid workloads.

[1]  Shin-Yeh Tsai,et al.  Disaggregating Persistent Memory and Controlling Them Remotely: An Exploration of Passive Disaggregated Key-Value Stores , 2020, USENIX ATC.

[2]  Marcos K. Aguilera,et al.  Designing Far Memory Data Structures: Think Outside the Box , 2019, HotOS.

[3]  Maurice Herlihy,et al.  Hopscotch Hashing , 2008, DISC.

[4]  Bo Ding,et al.  Lock-free Concurrent Level Hashing for Persistent Memory , 2020, USENIX Annual Technical Conference.

[5]  Thomas F. Wenisch,et al.  System-level implications of disaggregated memory , 2012, IEEE International Symposium on High-Performance Comp Architecture.

[6]  Thomas F. Wenisch,et al.  Disaggregated memory for expansion and sharing in blade servers , 2009, ISCA '09.

[7]  Nhan Nguyen,et al.  Lock-Free Cuckoo Hashing , 2014, 2014 IEEE 34th International Conference on Distributed Computing Systems.

[8]  Sam H. Noh,et al.  Write-Optimized Dynamic Hashing for Persistent Memory , 2019, FAST.

[9]  Hakim Weatherspoon,et al.  Shoal: A Network Architecture for Disaggregated Racks , 2019, NSDI.

[10]  Ronald Fagin,et al.  Extendible hashing—a fast access method for dynamic files , 1979, ACM Trans. Database Syst..

[11]  Marcos K. Aguilera,et al.  AIFM: High-Performance, Application-Integrated Far Memory , 2020, OSDI.

[12]  Eddie Kohler,et al.  Cache craftiness for fast multicore key-value storage , 2012, EuroSys '12.

[13]  Rasmus Pagh,et al.  Cuckoo Hashing , 2001, Encyclopedia of Algorithms.

[14]  Pradeep Dubey,et al.  Architecting to achieve a billion requests per second throughput on a single key-value store server platform , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[15]  Per-Åke Larson,et al.  Easy Lock-Free Indexing in Non-Volatile Memory , 2018, 2018 IEEE 34th International Conference on Data Engineering (ICDE).

[16]  Yiying Zhang,et al.  LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation , 2018, OSDI.

[17]  Tony Tung,et al.  Scaling Memcache at Facebook , 2013, NSDI.

[18]  Michael Mitzenmacher,et al.  The Power of Two Choices in Randomized Load Balancing , 2001, IEEE Trans. Parallel Distributed Syst..

[19]  Jinyang Li,et al.  Using One-Sided RDMA Reads to Build a Fast, CPU-Efficient Key-Value Store , 2013, USENIX ATC.

[20]  Xiaozhou Li,et al.  Algorithmic improvements for fast concurrent Cuckoo hashing , 2014, EuroSys '14.

[21]  Yiying Zhang,et al.  LITE Kernel RDMA Support for Datacenter Applications , 2017, SOSP.

[22]  Scott Shenker,et al.  Network Requirements for Resource Disaggregation , 2016, OSDI.

[23]  Miguel Castro,et al.  FaRM: Fast Remote Memory , 2014, NSDI.

[24]  Krste Asanovic,et al.  FireBox: A Hardware Building Block for 2020 Warehouse-Scale Computers , 2014 .

[25]  Michael Stonebraker,et al.  Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores , 2014, Proc. VLDB Endow..

[26]  Miryung Kim,et al.  Semeru: A Memory-Disaggregated Managed Runtime , 2020, OSDI.

[27]  Keith D. Underwood,et al.  Intel® Omni-path Architecture: Enabling Scalable, High Performance Fabrics , 2015, 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects.

[28]  Adam Silberstein,et al.  Benchmarking cloud serving systems with YCSB , 2010, SoCC '10.

[29]  Marcos K. Aguilera,et al.  Remote regions: a simple abstraction for remote memory , 2018, USENIX ATC.

[30]  David G. Andersen,et al.  Design Guidelines for High Performance RDMA Systems , 2016, USENIX ATC.

[31]  Jie Wu,et al.  Write-Optimized and High-Performance Hashing Index Scheme for Persistent Memory , 2018, OSDI.

[32]  Babak Falsafi,et al.  Meet the walkers accelerating index traversals for in-memory databases , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[33]  Bin Fan,et al.  MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing , 2013, NSDI.

[34]  Enhong Chen,et al.  KV-Direct: High-Performance In-Memory Key-Value Store with Programmable NIC , 2017, SOSP.

[35]  Miron Livny,et al.  Transactional client-server cache consistency: alternatives and performance , 1997, TODS.

[36]  Amanda Carbonari,et al.  Tolerating Faults in Disaggregated Datacenters , 2017, HotNets.

[37]  Hitesh Ballani,et al.  R2C2: A Network Stack for Rack-scale Computers , 2015, Comput. Commun. Rev..

[38]  Hyeontaek Lim,et al.  Cicada: Dependably Fast Multi-Core In-Memory Transactions , 2017, SIGMOD Conference.

[39]  Dejan S. Milojicic,et al.  Beyond Processor-centric Operating Systems , 2015, HotOS.