Studying network protocol offload with emulation: approach and preliminary results

To take full advantage of high-speed networks while freeing CPU cycles for application processing, the industry is proposing new techniques that give network interface cards an extended role, such as TCP offload engines and remote direct memory access. This paper presents an experimental study aimed at collecting the performance data needed to assess these techniques. The work is based on emulating an advanced network interface card plugged into the I/O bus. In the experimental setting, one processor of a partitioned SMP machine is dedicated to network processing. Achieving a faithful emulation of a network interface card is one of the main concerns, and it guides the design of the offload engine software. The setting is flexible, so many different offload scenarios can be evaluated. Preliminary throughput results for an emulated TCP offload engine demonstrate a large benefit: the emulated engine yields a 600% to 900% improvement while still relying on memory copies at the kernel boundary.
