Effects of Data Passing Semantics and Operating System Structure on Network I/O Performance

Abstract: Eliminating data and control passing overheads in I/O has been a long-sought goal. Researchers have often proposed changing either the semantics of I/O data passing, so as to make copying unnecessary, or the structure of the operating system, so as to reduce or eliminate data and control passing. However, most such changes are incompatible with existing applications and therefore have not been adopted in conventional systems. My thesis is that, in network I/O, optimizations that preserve data passing semantics and system structure can give end-to-end improvements competitive with those of data and control passing optimizations that change semantics or structure. Moreover, current technological trends tend to reduce differences in such improvements. To demonstrate the thesis, I introduce new models of I/O organization, optimization, and data passing, emphasizing structure and compatibility rather than implementation. I review previous network I/O optimizations and propose many new ones, including emulated copy, which passes data between application and system buffers without copying but with copy semantics, and I/O-oriented IPC, which passes data efficiently to and from user-level server buffers. I examine in detail network adapter requirements for copy avoidance. I describe the implementation of the different optimizations in Genie, a new I/O framework. Using Genie, I experimentally compare the optimizations on a variety of platforms and with different levels of hardware support. The experiments confirm the thesis, showing that: (1) emulated copy performs competitively with data passing schemes with move or share semantics; (2) emulated copy performs competitively with data and control passing optimizations enabled by extensible kernels; and (3) I/O-oriented IPC gives user-level I/O servers performance approaching that of kernel-level ones.
