Chord: Checkpoint-based scheduling using hybrid waiting list in shared clusters

Abstract: Cloud platforms supported by shared clusters are becoming increasingly effective, and a wide variety of users submit large numbers of tasks to these clusters. Cloud platforms usually assign tasks different priorities according to the Quality of Service (QoS) level chosen by each user, and high-priority tasks are executed first. As a consequence, preemption occurs frequently in almost all commercial cloud platforms, such as the Google and Amazon clusters. Although kill-based preemption works well for high-priority tasks, it severely harms low-priority tasks. During peak time in particular, some low-priority tasks may be preempted and restarted repeatedly, consuming far more precious resources, including CPU cores, RAM and hard drives. Checkpoint technology provides an efficient way to address the preemption issue. However, using checkpoints blindly causes further resource waste. To address this issue, we propose the concept of a hybrid waiting list that holds all unfinished tasks and resumes tasks regularly. We leverage checkpoint technology and design a novel approach based on the hybrid waiting list, named Chord (Checkpoint with hybrid scheduling method), which effectively improves the performance of shared clusters. Specifically, by checking resource occupancy periodically and making checkpoints only for certain tasks, our approach reduces unnecessary checkpoints and improves the performance of the whole cluster, especially for low-priority tasks. Extensive simulation experiments injecting tasks from the Google cloud trace logs were conducted to validate the superiority of our approach. Compared with the ordinary priority scheduling methods adopted by several commercial clouds, Chord improves response time by up to 18.94%.
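The trade-off described in the abstract can be illustrated with a small calculation. The sketch below is not the Chord algorithm; it only compares the work consumed by one low-priority task under kill-based preemption and under checkpoint-on-preemption. The function name `work_consumed`, the abstract "work units" and the fixed `checkpoint_cost` are assumptions made for illustration.

```python
def work_consumed(total, runs_before_preemption, use_checkpoint, checkpoint_cost=0):
    """Work units spent to finish one task that needs `total` units.

    runs_before_preemption: units executed in each attempt before the task
    is preempted (hypothetical input, not taken from the paper).
    checkpoint_cost: assumed fixed overhead of writing one checkpoint.
    """
    consumed = 0
    saved = 0  # progress that survives a preemption
    for ran in runs_before_preemption:
        remaining = total - saved
        if ran >= remaining:
            # The task finishes before this preemption would occur.
            return consumed + remaining
        consumed += ran
        if use_checkpoint:
            consumed += checkpoint_cost
            saved += ran
        # Kill-based preemption: `saved` stays 0, so all progress is lost.
    # Final, uninterrupted attempt.
    return consumed + (total - saved)


# Task preempted after substantial progress: checkpointing saves work.
print(work_consumed(100, [60, 30], use_checkpoint=False))                    # 190
print(work_consumed(100, [60, 30], use_checkpoint=True, checkpoint_cost=5))  # 110

# Task preempted almost immediately: the checkpoint costs more than it saves.
print(work_consumed(100, [2], use_checkpoint=False))                    # 102
print(work_consumed(100, [2], use_checkpoint=True, checkpoint_cost=5))  # 105
```

The second pair of calls shows why checkpointing blindly can waste resources: when little progress has been made, restarting is cheaper than saving state, which is the motivation for checkpointing only certain tasks.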
