A Heterogeneous Cluster Multi-resource Fair Scheduling Algorithm Based on Machine Learning

Resource scheduling in data centers is a research hotspot in cloud computing. Existing work addresses fairness, resource utilization, and energy efficiency, but it applies only to homogeneous cluster environments or to specific application scenarios. This paper first analyzes Dominant Resource Fairness (DRF), the default scheduling algorithm of Mesos, and observes that DRF considers neither machine performance nor task type. To address this problem, the paper then presents a heterogeneous cluster multi-resource fair scheduling algorithm based on machine learning. The algorithm tests the performance of each machine and uses machine learning to classify computing tasks, so that resources can be allocated reasonably. Finally, experimental results show that the proposed method not only ensures fair resource allocation but also allocates resources more reasonably and further improves the system's resource utilization.
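
For readers unfamiliar with the DRF baseline mentioned above, the following is a minimal sketch of plain DRF progressive filling on a single pooled cluster. It is not the heterogeneous algorithm this paper proposes. The function name `drf_allocate`, the dictionary layout, and the choice to skip users whose next task no longer fits are illustrative assumptions. The sketch treats all capacity as one pool and every task of a user as identical, which is the limitation the abstract points to: machine performance and task type play no role.

```python
def drf_allocate(capacity, demands):
    """Plain DRF by progressive filling (hypothetical helper, not the paper's method).

    capacity: total amount of each resource in the pooled cluster.
    demands:  per-user resource demand of a single task.
    Returns the number of tasks launched for each user.
    """
    used = {r: 0.0 for r in capacity}
    alloc = {u: {r: 0.0 for r in capacity} for u in demands}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    def dominant_share(user):
        # A user's dominant share is its largest fractional share of any resource.
        return max(alloc[user][r] / capacity[r] for r in capacity)

    while active:
        # Serve the user with the lowest dominant share (ties broken by name).
        user = min(sorted(active), key=dominant_share)
        demand = demands[user]
        if all(used[r] + demand[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += demand[r]
                alloc[user][r] += demand[r]
            tasks[user] += 1
        else:
            # Assumption of this sketch: a user whose next task does not fit is skipped.
            active.remove(user)
    return tasks


if __name__ == "__main__":
    cluster = {"cpu": 9, "mem": 18}
    per_task = {
        "A": {"cpu": 1, "mem": 4},  # memory-dominant tasks
        "B": {"cpu": 3, "mem": 1},  # CPU-dominant tasks
    }
    print(drf_allocate(cluster, per_task))
```

With 9 CPUs and 18 GB of memory, user A ends with three tasks and user B with two, so both hold a dominant share of two thirds. A heterogeneity-aware scheduler would additionally need per-machine performance figures and a task classifier, neither of which this sketch models.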
