A Hash Table for Line-Rate Data Processing

FPGA-based data processing is becoming increasingly relevant in data centers, as the transformation of existing applications into dataflow architectures can bring significant throughput and power benefits. Furthermore, a tighter integration of computing and network is appealing, as it overcomes traditional bottlenecks between CPUs and network interfaces, and dramatically reduces latency. In this article, we present the design of a novel hash table, a fundamental building block used in many applications, to enable data processing on FPGAs close to the network. We present a fully pipelined design capable of sustaining consistent 10Gbps line-rate processing by deploying a concurrent mechanism to handle hash collisions. We address additional design challenges such as support for a broad range of key sizes without stalling the pipeline through careful matching of lookup time with packet reception time. Finally, the design is based on a scalable architecture that can be easily parameterized to work with different memory types operating at different access speeds and latencies. We have tested the proposed hash table in an FPGA-based memcached appliance implementing a main-memory key-value store in hardware. The hash table is used to index 2 million entries in 24GB of external DDR3 DRAM while sustaining 13 million requests per second, the maximum packet rate that can be achieved with UDP packets on a 10Gbps link for this application.

[1]  Ling Liu,et al.  Achieving 10Gbps Line-rate Key-value Stores with FPGAs , 2013, HotCloud.

[2]  Andrei Z. Broder,et al.  Using multiple hash functions to improve IP lookups , 2001, Proceedings IEEE INFOCOM 2001. Conference on Computer Communications. Twentieth Annual Joint Conference of the IEEE Computer and Communications Society (Cat. No.01CH37213).

[3]  Brad Fitzpatrick,et al.  Distributed caching with memcached , 2004 .

[4]  Andrei Z. Broder,et al.  Multilevel adaptive hashing , 1990, SODA '90.

[5]  Martin Margala,et al.  An FPGA memcached appliance , 2013, FPGA '13.

[6]  Gustavo Alonso,et al.  Complex event detection at wire speed with FPGAs , 2010, Proc. VLDB Endow..

[7]  Vern Paxson,et al.  The shunt: an FPGA-based accelerator for network intrusion prevention , 2007, FPGA '07.

[8]  Masanori Bando,et al.  FlashLook: 100-Gbps hash-tuned route lookup architecture , 2009, 2009 International Conference on High Performance Switching and Routing.

[9]  Gustavo Alonso,et al.  A flexible hash table design for 10GBPS key-value stores on FPGAS , 2013, 2013 23rd International Conference on Field programmable Logic and Applications.

[10]  Ramarathnam Venkatesan,et al.  Orthogonal Security with Cipherbase , 2013, CIDR.

[11]  Bjarne Stroustrup,et al.  C++ Programming Language , 1986, IEEE Softw..

[12]  Song Jiang,et al.  Workload analysis of a large-scale key-value store , 2012, SIGMETRICS '12.

[13]  Jan Korenek,et al.  Fast and scalable packet classification using perfect hash functions , 2009, FPGA '09.

[14]  Rasmus Pagh,et al.  Cuckoo Hashing , 2001, Encyclopedia of Algorithms.

[15]  Gustavo Alonso,et al.  Ibex - An Intelligent Storage Engine with Support for Advanced SQL Off-loading , 2014, Proc. VLDB Endow..

[16]  Stamatis Vassiliadis,et al.  A reconfigurable perfect-hashing scheme for packet inspection , 2005, International Conference on Field Programmable Logic and Applications, 2005..

[17]  Hari Angepat,et al.  An FPGA-based In-Line Accelerator for Memcached , 2014, IEEE Computer Architecture Letters.

[18]  Jürgen Teich,et al.  Acceleration of SQL Restrictions and Aggregations through FPGA-Based Dynamic Partial Reconfiguration , 2013, 2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines.

[19]  Gustavo Alonso,et al.  Streams on Wires - A Query Compiler for FPGAs , 2009, Proc. VLDB Endow..

[20]  Michael Mitzenmacher,et al.  More Robust Hashing: Cuckoo Hashing with a Stash , 2008, ESA.