High-Performance Key-Value Store on OpenSHMEM

Recently, there has been growing interest in enabling fast data analytics by leveraging the capabilities of large-scale high-performance computing (HPC) systems. OpenSHMEM is a popular run-time system on HPC systems that has been used for large-scale, compute-intensive scientific applications. In this paper, we propose to leverage OpenSHMEM to design a distributed in-memory key-value store for fast data analytics. Accordingly, we have developed SHMEMCache on top of OpenSHMEM to take advantage of its symmetric global memory, efficient one-sided communication operations, and general portability. We have evaluated SHMEMCache through extensive experimental studies. Our results show that SHMEMCache achieves significant improvements over the original Memcached in both latency and throughput. Our evaluation on the Titan supercomputer also demonstrates that SHMEMCache can scale to 1024 nodes.
