Optimal Offloading of Kubernetes Pods in Three-Tier Networks

By pushing resources to far-edge servers located close to users, edge computing can greatly reduce end-to-end transmission delays. Task offloading in multi-tier networks refers to deciding which tasks should be offloaded from the far-edge to the edge and the cloud. In addition, containerizing applications can further reduce resource and time consumption and, in turn, the latency of such applications. Even though Kubernetes has become the de facto container orchestrator, few works have considered the offloading of containerized applications in Kubernetes clusters spanning from cloud to far-edge. In this work, the problem of offloading Kubernetes tasks (or pods) in three-tier networks is formulated and optimized. First, a utility function is defined in terms of the cumulative weighted pod response time, and a utility minimization problem with central processing unit (CPU) constraints is formulated. Based on the optimal theoretical solution to this problem, a three-tier offloading decision algorithm (TTODA) is developed. Horizontal scaling is considered, and the specific hardware capabilities of each node are taken into account by setting node-specific service-level agreements (SLAs) that are fed back to the algorithm. Numerical results show that TTODA outperforms a typical Kubernetes quality-of-service (QoS) model based on a first-in, first-served algorithm (FIFSA) in terms of utility, average pod response time, and usage of far-edge CPU. Further, TTODA achieves a favorable trade-off between performance and computational complexity, and thus it can help meet the requirements of latency-sensitive applications. Finally, TTODA can easily be extended to scenarios with joint memory and CPU constraints.
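
To make the formulation concrete, the following minimal sketch evaluates a cumulative weighted response-time utility on a toy three-tier instance and compares a first-in, first-served placement with the optimum found by exhaustive search. It is not TTODA, whose details are not given here. The tier capacities, per-tier response times, pod weights, and CPU requests are all hypothetical values chosen for illustration, and the assumption that response time depends only on the tier is a simplification.

```python
from itertools import product

# Hypothetical tiers: (name, CPU capacity in millicores, response time in ms).
# Assumption: response time depends only on the tier a pod runs on.
TIERS = [("far-edge", 1000, 5.0), ("edge", 2000, 20.0), ("cloud", 10000, 80.0)]

# Hypothetical pods in arrival order: (name, weight, CPU request in millicores).
PODS = [("a", 1.0, 800), ("b", 5.0, 600), ("c", 2.0, 400), ("d", 1.0, 1500)]


def utility(assignment):
    """Cumulative weighted response time; assignment[i] is the tier index of pod i."""
    return sum(PODS[i][1] * TIERS[t][2] for i, t in enumerate(assignment))


def feasible(assignment):
    """Return True if the CPU requested on every tier fits within its capacity."""
    used = [0] * len(TIERS)
    for i, t in enumerate(assignment):
        used[t] += PODS[i][2]
    return all(u <= TIERS[t][1] for t, u in enumerate(used))


def first_in_first_served():
    """Place pods in arrival order on the closest tier that still has enough CPU.

    The toy cloud tier is large enough that a fitting tier always exists.
    """
    free = [capacity for _, capacity, _ in TIERS]
    assignment = []
    for _, _, cpu in PODS:
        tier = next(t for t, f in enumerate(free) if f >= cpu)
        free[tier] -= cpu
        assignment.append(tier)
    return tuple(assignment)


def brute_force_optimum():
    """Enumerate every pod-to-tier assignment and keep the feasible one with least utility."""
    candidates = product(range(len(TIERS)), repeat=len(PODS))
    return min((a for a in candidates if feasible(a)), key=utility)
```

On this toy instance, the arrival-order placement yields a utility of 225, while the exhaustive search finds 135 by reserving the far-edge CPU for the two highest-weight pods. Exhaustive search grows as 3^n in the number of pods, so it serves only as a reference point for small cases.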