X-Containers: Breaking Down Barriers to Improve Performance and Isolation of Cloud-Native Containers

"Cloud-native" container platforms, such as Kubernetes, have become an integral part of production cloud environments. One of the principles in designing cloud-native applications is called Single Concern Principle, which suggests that each container should handle a single responsibility well. In this paper, we propose X-Containers as a new security paradigm for isolating single-concerned cloud-native containers. Each container is run with a Library OS (LibOS) that supports multi-processing for concurrency and compatibility. A minimal exokernel ensures strong isolation with small kernel attack surface. We show an implementation of the X-Containers architecture that leverages Xen paravirtualization (PV) to turn Linux kernel into a LibOS. Doing so results in a highly efficient LibOS platform that does not require hardware-assisted virtualization, improves inter-container isolation, and supports binary compatibility and multi-processing. By eliminating some security barriers such as seccomp and Meltdown patch, X-Containers have up to 27X higher raw system call throughput compared to Docker containers, while also achieving competitive or superior performance on various benchmarks compared to recent container platforms such as Google's gVisor and Intel's Clear Containers.

[1]  Toshiyuki Maeda Kernel mode Linux , 2003 .

[2]  Brian N. Bershad,et al.  Extensibility safety and performance in the SPIN operating system , 1995, SOSP.

[3]  Christoforos E. Kozyrakis,et al.  IX: A Protected Dataplane Operating System for High Throughput and Low Latency , 2014, OSDI.

[4]  David M. Eyers,et al.  SCONE: Secure Linux Containers with Intel SGX , 2016, OSDI.

[5]  Thomas Anderson,et al.  The case for application-specific operating systems , 1992, [1992] Proceedings Third Workshop on Workstation Operating Systems.

[6]  William J. Bolosky,et al.  Mach: A New Kernel Foundation for UNIX Development , 1986, USENIX Summer.

[7]  Larry L. Peterson,et al.  Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors , 2007, EuroSys '07.

[8]  Robert Wahbe,et al.  Efficient software-based fault isolation , 1994, SOSP '93.

[9]  Donald E. Porter,et al.  Cooperation and security isolation of library OSes for multi-process applications , 2014, EuroSys '14.

[10]  No License,et al.  Intel ® 64 and IA-32 Architectures Software Developer ’ s Manual Volume 3 A : System Programming Guide , Part 1 , 2006 .

[11]  Sharath George Usermode kernel : running the kernel in userspace in VM environments , 2008 .

[12]  Jochen Liedtke,et al.  The performance of μ-kernel-based systems , 1997, SOSP.

[13]  Hakim Weatherspoon,et al.  The Xen-Blanket: virtualize once, run everywhere , 2012, EuroSys '12.

[14]  Christopher Negus,et al.  GNU General Public License , 2015 .

[15]  Eric A. Brewer,et al.  Kubernetes and the path to cloud native , 2015, SoCC.

[16]  Jeff Dike,et al.  A user-mode port of the Linux kernel , 2000, Annual Linux Showcase & Conference.

[17]  Galen C. Hunt,et al.  Shielding Applications from an Untrusted Cloud with Haven , 2014, OSDI.

[18]  Florian Schmidt,et al.  My VM is Lighter (and Safer) than your Container , 2017, SOSP.

[19]  Jon Crowcroft,et al.  Unikernels: library operating systems for the cloud , 2013, ASPLOS '13.

[20]  Don Marti,et al.  OSv - Optimizing the Operating System for Virtual Machines , 2014, USENIX Annual Technical Conference.

[21]  George N. Rouskas,et al.  On the impact of scheduler settings on the performance of multi-threaded SIP servers , 2015, 2015 IEEE International Conference on Communications (ICC).

[22]  Brendan Burns,et al.  Design Patterns for Container-based Distributed Systems , 2016, HotCloud.

[23]  Robin Fairbairns,et al.  The Design and Implementation of an Operating System to Support Distributed Multimedia Applications , 1996, IEEE J. Sel. Areas Commun..

[24]  Brian N. Bershad,et al.  Improving the reliability of commodity operating systems , 2005, TOCS.

[25]  Nicolae Tapus,et al.  LKL: The Linux kernel library , 2010, 9th RoEduNet IEEE International Conference.

[26]  Pooyan Jamshidi,et al.  Migrating to Cloud-Native Architectures Using Microservices: An Experience Report , 2015, ESOCC Workshops.

[27]  Cristiano Giuffrida,et al.  Enhanced Operating System Security Through Efficient and Fine-grained Address Space Randomization , 2012, USENIX Security Symposium.

[28]  Pooyan Jamshidi,et al.  Microservices Architecture Enables DevOps: Migration to a Cloud-Native Architecture , 2016, IEEE Software.

[29]  Kang G. Shin,et al.  Automated control of multiple virtualized resources , 2009, EuroSys '09.

[30]  James R. Larus,et al.  Singularity: rethinking the software stack , 2007, OPSR.

[31]  Xiaoyun Zhu,et al.  Memory overbooking and dynamic control of Xen virtual machines in consolidated environments , 2009, 2009 IFIP/IEEE International Symposium on Integrated Network Management.

[32]  Liming Zhu,et al.  DevOps - A Software Architect's Perspective , 2015, SEI series in software engineering.

[33]  James R. Larus,et al.  Sealing OS processes to improve dependability and safety , 2007, EuroSys '07.

[34]  Donald E. Porter,et al.  Rethinking the library OS from the top down , 2011, ASPLOS XVI.

[35]  David R. Cheriton,et al.  A caching model of operating system kernel functionality , 1994, OSDI '94.

[36]  Dawson R. Engler,et al.  Exokernel: an operating system architecture for application-level resource management , 1995, SOSP.

[37]  Han Dong,et al.  EbbRT: A Framework for Building Per-Application Library Operating Systems , 2016, OSDI.

[38]  Sam Newman,et al.  Building microservices - designing fine-grained systems, 1st Edition , 2015 .

[39]  Roberto Bifulco,et al.  ClickOS and the Art of Network Function Virtualization , 2014, NSDI.

[40]  Christoforos E. Kozyrakis,et al.  Usenix Association 10th Usenix Symposium on Operating Systems Design and Implementation (osdi '12) 335 Dune: Safe User-level Access to Privileged Cpu Features , 2022 .