Machine learning for load balancing in the Linux kernel

The OS load balancing algorithm governs the performance gains provided by a multiprocessor computer system. Linux's Completely Fair Scheduler (CFS) tracks process loads by average CPU utilization to balance workload between processor cores. That approach maximizes the utilization of processing time but overlooks contention for lower-level hardware resources. In servers running compute-intensive workloads, imbalanced demand for limited computing resources hinders execution performance. This paper addresses this problem with a machine learning (ML)-based resource-aware load balancer. We describe (1) low-overhead methods for collecting training data; (2) an ML model, a multi-layer perceptron, that imitates the CFS load balancer based on the collected training data; and (3) an in-kernel implementation of inference on the model. Our experiments demonstrate that the proposed model achieves 99% accuracy in making migration decisions while increasing latency by only 1.9 μs.
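
To make item (3) concrete, the following is a minimal user-space sketch of how a small multi-layer perceptron could produce a migrate/do-not-migrate decision with integer arithmetic only. It is not the paper's implementation. The fixed-point format, the layer sizes, the four input features, the weight values, and the names `mlp_logit` and `mlp_should_migrate` are all assumptions made for illustration; fixed-point arithmetic is assumed because kernel code generally avoids floating point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All sizes, weights, and names below are hypothetical placeholders. */
#define N_IN   4              /* number of input features (assumed) */
#define N_HID  3              /* number of hidden units (assumed) */
#define FP_ONE 1024           /* fixed point: the integer 1024 represents 1.0 */

/* Multiply two fixed-point values, widening to 64 bits to avoid overflow. */
static int32_t fp_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) / FP_ONE);
}

/* Placeholder weights; a real model would load values obtained by training. */
static const int32_t W1[N_HID][N_IN] = {
    {  1024, -512,    0,  256 },
    { -1024,  512,  256,    0 },
    {   512,  512, -512, -512 },
};
static const int32_t B1[N_HID] = { 0, 128, -256 };
static const int32_t W2[N_HID] = { 1024, -1024, 512 };
static const int32_t B2 = -128;

/* Forward pass: one ReLU hidden layer followed by a single linear output. */
int32_t mlp_logit(const int32_t x[N_IN])
{
    int32_t out = B2;

    assert(x != NULL);
    for (int j = 0; j < N_HID; j++) {
        int32_t h = B1[j];

        for (int i = 0; i < N_IN; i++)
            h += fp_mul(W1[j][i], x[i]);
        if (h < 0)            /* ReLU activation */
            h = 0;
        out += fp_mul(W2[j], h);
    }
    return out;
}

/* Decision rule (assumed): migrate when the output logit is positive. */
int mlp_should_migrate(const int32_t x[N_IN])
{
    return mlp_logit(x) > 0;
}
```

A caller would scale each feature into the same fixed-point range, fill an `int32_t` array of length `N_IN`, and pass it to `mlp_should_migrate`. Keeping the network this small and integer-only is one plausible way to keep per-decision cost low, which matters given the microsecond-scale latency budget the abstract reports.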
