GRIP: a reconfigurable architecture for host-based gigabit-rate packet processing

A fundamental challenge for modern high-performance network interfaces is the processing capability required to handle packets at high speeds. Simply transmitting or receiving data at gigabit speeds fully utilizes the CPU of a standard workstation. Any further processing of the data, whether at the application layer or the network layer, decreases the achievable throughput. This paper presents an architecture for offloading a significant portion of the network processing from the host CPU onto the network interface. A prototype, called the GRIP (Gigabit Rate IPSec) card, has been constructed from an FPGA coupled with a commodity Gigabit Ethernet MAC. Experimental results from the prototype are presented and analyzed. In addition, a second-generation design is presented in light of the lessons learned from the prototype.
