I/O Is Faster Than the CPU: Let's Partition Resources and Eliminate (Most) OS Abstractions

I/O is getting faster in servers that have fast programmable NICs and non-volatile main memory operating close to the speed of DRAM, but single-threaded CPU speeds have stagnated. Applications cannot take advantage of modern hardware capabilities when they use interfaces built around abstractions that assume I/O is slow. We therefore propose an OS structure called the parakernel, which eliminates most OS abstractions and provides interfaces that let applications leverage the full potential of the underlying hardware. The parakernel facilitates application-level parallelism by securely partitioning resources and multiplexing only those resources that are not partitioned.
