A Multilevel NOSQL Cache Design Combining In-NIC and In-Kernel Caches

Large-scale in-memory data stores, such as key-value stores (KVS), are an important software platform for data centers, so this paper focuses on FPGA-based custom hardware that further improves KVS efficiency. FPGA-based KVS accelerators have been studied and shown to achieve high performance per watt compared with software-based processing, but their cache capacity is strictly limited by the DRAM mounted on the FPGA board, which in turn limits their application domain. To address this issue, we propose a multilevel NOSQL cache architecture that uses an FPGA-based hardware cache and an in-kernel software cache in a complementary manner. They are referred to as the L1 and L2 NOSQL caches, respectively. The proposed architecture opens up several design options, such as the cache write policy and the inclusion policy between the L1 and L2 NOSQL caches. We implemented a prototype of the proposed multilevel NOSQL cache using a NetFPGA-10G board and the Linux Netfilter framework, and we use this prototype to explore these design options. Simulation results show that our multilevel NOSQL cache design reduces the cache miss ratio and improves throughput compared with the non-hierarchical design.
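To make the design space concrete, the following is a minimal Python sketch of a two-level key-value cache with a selectable inclusion policy and a write-through write policy. It is an illustration under stated assumptions, not the prototype described above: the class names (`LRUCache`, `TwoLevelCache`), the LRU replacement policy, the write-through choice, and the dict used as a backing store are all hypothetical stand-ins for the in-NIC cache, the in-kernel cache, and the user-space KVS.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity LRU map; a stand-in for one cache level (assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        """Insert or update; return the evicted (key, value) pair, if any."""
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            return self.items.popitem(last=False)  # evict least recently used
        return None

    def remove(self, key):
        self.items.pop(key, None)


class TwoLevelCache:
    """L1 (small) in front of L2 (larger) in front of a backing store.

    Assumes None is never stored as a value, since None signals a miss.
    """

    def __init__(self, l1_capacity, l2_capacity, backing_store, inclusive=True):
        self.l1 = LRUCache(l1_capacity)
        self.l2 = LRUCache(l2_capacity)
        self.store = backing_store  # plain dict standing in for the KVS
        self.inclusive = inclusive
        self.hits = {"l1": 0, "l2": 0, "miss": 0}

    def get(self, key):
        value = self.l1.get(key)
        if value is not None:
            self.hits["l1"] += 1
            return value
        value = self.l2.get(key)
        if value is not None:
            self.hits["l2"] += 1
            if not self.inclusive:
                self.l2.remove(key)  # exclusive: the item moves up to L1
            self._fill_l1(key, value)
            return value
        self.hits["miss"] += 1
        value = self.store.get(key)
        if value is not None:
            if self.inclusive:
                self._fill_l2(key, value)
            self._fill_l1(key, value)
        return value

    def set(self, key, value):
        # Write-through: update the store and any cached copies.
        self.store[key] = value
        if key in self.l1.items:
            self.l1.put(key, value)
        if key in self.l2.items:
            self.l2.put(key, value)

    def _fill_l1(self, key, value):
        evicted = self.l1.put(key, value)
        if evicted is not None and not self.inclusive:
            self.l2.put(*evicted)  # exclusive: the L1 victim drops to L2

    def _fill_l2(self, key, value):
        evicted = self.l2.put(key, value)
        if evicted is not None:
            self.l1.remove(evicted[0])  # inclusive: back-invalidate L1
```

With `inclusive=True`, every key held in L1 is also held in L2, so an L2 eviction must invalidate the L1 copy. With `inclusive=False`, a key lives in at most one level, which raises the combined effective capacity at the cost of moving items between levels on each L2 hit and L1 eviction. The `hits` counters give a simple way to compare miss ratios across the two policies for a given request trace.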
