Advanced hashing schemes for packet forwarding using set associative memory architectures

Building a high-performance IP packet forwarding (PF) engine remains a challenge due to increasingly stringent throughput requirements and the growing size of IP forwarding tables. The router has to match each incoming packet's IP address against the forwarding table. Because this matching must be done at wire speed, PF engines must remain scalable and consume little power. PF engines commonly use hash tables; however, the classic downsides of hashing have to be dealt with (e.g., collisions and worst-case memory access time). While open addressing hash tables generally provide good average-case search performance, their memory utilization and worst-case performance can degrade quickly due to collisions that lead to bucket overflows. Set associative memory can be used for hardware implementations of hash tables, with the property that each bucket of a hash table can be searched in one memory cycle. Hence, PF engine architectures based on associative memory will outperform those based on the conventional Ternary Content Addressable Memory (TCAM) in terms of power and scalability. The two standard solutions to the overflow problem are to use some sort of predefined probing (e.g., linear or quadratic) or to use multiple hash functions. This work presents two new hash schemes that extend these two solutions to tackle the overflow problem efficiently. The first is a hash probing scheme called Content-based HAsh Probing, or CHAP. CHAP bases its probing on the content of the hash table, which avoids the classical side effects of predefined hash probing methods (i.e., primary and secondary clustering) while also reducing overflow. The second scheme, called Progressive Hashing, or PH, is a general multiple hash scheme that also reduces overflow. PH splits the prefixes into groups, assigns each group one hash function, and then reuses some hash functions in a progressive fashion to reduce the overflow. Experiments with real IP lookup tables show that both schemes outperform other hashing schemes.
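The two standard overflow remedies named above, predefined probing and multiple hash functions, can be illustrated with a small software model. The sketch below is not CHAP or PH and does not model any hardware. It assumes a table of fixed-capacity buckets (standing in for set associative rows), and the mixing function `bucket_index`, the key set, and all parameters are hypothetical choices made only for illustration.

```python
import random


def bucket_index(key, seed, num_buckets):
    """Hypothetical integer mixing hash; `seed` selects one of several functions."""
    mixed = ((key ^ (seed * 0x9E3779B9)) * 0x85EBCA6B) & 0xFFFFFFFF
    mixed ^= mixed >> 15
    return mixed % num_buckets


def insert_linear_probe(table, key, capacity, max_probes):
    """Try the home bucket, then up to `max_probes` following buckets."""
    n = len(table)
    home = bucket_index(key, 0, n)
    for step in range(max_probes + 1):
        bucket = table[(home + step) % n]
        if len(bucket) < capacity:
            bucket.append(key)
            return True
    return False  # every probed bucket was full: the key overflows


def insert_multi_hash(table, key, capacity, num_hashes):
    """Hash with `num_hashes` functions and use the least-loaded candidate."""
    n = len(table)
    candidates = [table[bucket_index(key, s, n)] for s in range(num_hashes)]
    best = min(candidates, key=len)
    if len(best) < capacity:
        best.append(key)
        return True
    return False  # all candidate buckets were full: the key overflows


def count_overflows(keys, num_buckets, capacity, insert, param):
    """Insert all keys into a fresh table and count those that overflow."""
    table = [[] for _ in range(num_buckets)]
    return sum(0 if insert(table, k, capacity, param) else 1 for k in keys)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic 32-bit keys, not a real forwarding table.
    keys = [rng.getrandbits(32) for _ in range(3500)]
    for label, insert, param in [
        ("single hash, no probing", insert_linear_probe, 0),
        ("linear probing, 2 extra buckets", insert_linear_probe, 2),
        ("two hash functions", insert_multi_hash, 2),
    ]:
        print(label, count_overflows(keys, 1024, 4, insert, param))
```

In this model each bucket read corresponds to one memory access, so the probe limit and the number of hash functions bound the worst-case lookup cost. The overflow counts it prints come from synthetic keys and say nothing about the results reported for real IP lookup tables.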
