Shared Resource Orchestration Extensions for Kubernetes to Support Real-Time Cloud Containers

The improvements in network latency and the advent of edge computing have inspired industries to explore providing real-time (RT) applications as cloud-based services and to benefit from the availability, scalability, and efficient hardware resource utilization of clouds. To host RT applications in the cloud, it is crucial to improve the entire stack, including the applications’ containerization, container deployment, and orchestration across nodes. However, state-of-the-art container orchestrators, e.g., Kubernetes (K8s), and the underlying Linux and containerization layers ignore the orchestration and management of shared resources (e.g., memory bandwidth, cache), thus rendering them unsuitable for RT use cases due to the unpredictability caused by shared resource contention. We propose K8s extensions, inspired by existing RT resource management frameworks, that reach into the underlying Linux kernel and containerization layer of each node for shared resource monitoring. These extensions help K8s maintain a cloud-wide view and allocate and dynamically orchestrate shared resources to enforce the guarantees required by the RT containers. Additionally, as a proof of concept, we design and implement (1) new K8s shared resource orchestration extensions to support memory bandwidth and last-level cache allocation and (2) a shared-resource controller in Linux based on a new algorithm that combines the approximate but throttling-free memory bandwidth allocation of simple and efficient hardware controllers (e.g., Intel MBA) with the strict but pessimistic guarantees offered by software budget allocation and throttling (e.g., MemGuard). We performed experiments to evaluate and demonstrate the newly implemented extensions on server-grade hardware.
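
The abstract does not spell out the combined algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' controller. It pairs a coarse hardware throttle level (in the style of Intel MBA percentage steps) with a per-period software budget check (in the style of MemGuard). The names `Container`, `mba_level`, `account`, and `new_period`, the megabyte units, and the 10% step size are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    budget_mb: float          # assumed: guaranteed memory traffic per regulation period
    used_mb: float = 0.0      # traffic charged so far in the current period
    throttled: bool = False   # True once the software budget is exhausted


def mba_level(budget_mb: float, period_capacity_mb: float, step: int = 10) -> int:
    """Pick a coarse hardware throttle level (percent) that covers the budget.

    The budget share is rounded UP to the next multiple of `step`, so the
    hardware setting is approximate and never tighter than the budget.
    """
    share = 100.0 * budget_mb / period_capacity_mb
    level = int(-(-share // step) * step)  # ceiling to a multiple of step
    return max(step, min(100, level))


def account(container: Container, traffic_mb: float) -> bool:
    """Strict software check: charge traffic and throttle when the budget is spent.

    Returns False if the container was already throttled in this period.
    """
    if container.throttled:
        return False
    container.used_mb += traffic_mb
    if container.used_mb >= container.budget_mb:
        container.throttled = True
    return True


def new_period(containers: list[Container]) -> None:
    """Replenish all budgets at the start of a regulation period."""
    for c in containers:
        c.used_mb = 0.0
        c.throttled = False


rt = Container(name="rt-app", budget_mb=250.0)
level = mba_level(rt.budget_mb, period_capacity_mb=1000.0)
first = account(rt, 200.0)    # within budget
second = account(rt, 100.0)   # crosses the budget, container becomes throttled
third = account(rt, 10.0)     # rejected until the next period
```

In this sketch the hardware level acts as a loose first line of control that needs no throttling, while the software budget provides the hard cut-off; how the actual controller arbitrates between the two is not described in the abstract.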
