Benchmarking key-value stores on high-performance storage and interconnects for web-scale workloads

A distributed key-value caching layer has proven invaluable for scalable, data-intensive web applications. With the emergence of high-performance storage (e.g., SSDs) and interconnects (e.g., InfiniBand) on modern clusters, several efforts are under way to design high-performance key-value stores that work well with a hybrid 'RAM+SSD' storage architecture. This makes it essential to design micro-benchmarks tailored to evaluating these emerging hybrid designs. In this paper, we study popular web-scale and cloud serving workloads to identify the application-specific aspects that affect the performance of hybrid key-value stores, including commonly occurring data request distributions, update patterns, and environmental factors. Based on these characterization studies, we propose a micro-benchmark suite that can be used to study high-performance hybrid key-value stores on modern clusters from the perspectives of both the application and the key-value store. We demonstrate its ease of use in both database-integrated and stand-alone execution modes. Performance evaluations with different Memcached distributions, such as SSD-Assisted RDMA-Memcached, fatcache, and twemcache, over different networks and protocols show that 'SSD+RDMA' can significantly enhance the performance of Memcached for various read-only and read-heavy workloads that are representative of several common web-scale workloads.
