Preemptive Parallel Job Scheduling for Heterogeneous Systems Supporting Urgent Computing

Urgent computations are commonly run on dedicated infrastructures. However, dedicated resources are not always affordable under budget constraints, so shared infrastructures become an alternative. Because shared infrastructures serve many users, urgent jobs may arrive while regular jobs occupy the resources they need. In that case, the regular jobs must be preempted so that the urgent jobs can start immediately. Most conventional job scheduling methods focus on reducing the response times and waiting times of all jobs; they can therefore delay urgent jobs and prevent them from completing within a stipulated deadline. Furthermore, in heterogeneous systems with coprocessors, preemption is more difficult because coprocessors rely on several system software functionalities provided by the host processor. In this paper, we propose a parallel job scheduling method that makes effective use of shared heterogeneous systems for urgent computations. Our method employs an in-memory process swapping mechanism to preempt jobs running on the coprocessor devices. Our simulation results show that the method significantly reduces the response time and slowdown of regular jobs without substantially delaying urgent jobs.
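
To make the preemption step concrete, the following minimal Python sketch shows how a scheduler might choose which regular jobs to suspend when an urgent job arrives and too few nodes are free. It is an illustration only, not the method proposed in the paper: the `Job` type, the `select_victims` function, and the smallest-job-first victim policy are all assumptions introduced here, and the in-memory swapping of coprocessor processes is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int          # number of nodes the job occupies
    urgent: bool = False


def select_victims(running, free_nodes, needed):
    """Pick regular jobs to suspend so that `needed` nodes become free.

    Returns the list of jobs to preempt (possibly empty), or None if even
    preempting every regular job would not free enough nodes.
    """
    victims = []
    available = free_nodes
    # Assumed policy (hypothetical): preempt the smallest regular jobs first.
    # Urgent jobs that are already running are never preempted.
    regular = sorted((j for j in running if not j.urgent), key=lambda j: j.nodes)
    for job in regular:
        if available >= needed:
            break
        victims.append(job)
        available += job.nodes
    return victims if available >= needed else None


running = [
    Job("regular-a", 4),
    Job("regular-b", 2),
    Job("urgent-x", 2, urgent=True),
]
# An urgent job needing 4 nodes arrives while only 1 node is free.
victims = select_victims(running, free_nodes=1, needed=4)
print([j.name for j in victims])
```

In this sketch the suspended jobs would resume once the urgent job finishes. A real scheduler would also have to account for the cost of swapping each victim's state out and back in, which this example ignores.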
