Energy-Aware Non-Preemptive Task Scheduling With Deadline Constraint in DVFS-Enabled Heterogeneous Clusters

Energy conservation in large data centers running high-performance computing workloads, such as deep learning on Big Data, is critically important: cutting electricity use by even a few percent translates into million-dollar savings. This work studies energy conservation on emerging CPU-GPU hybrid clusters through dynamic voltage and frequency scaling (DVFS). We aim to minimize the total energy consumed in processing a batch of offline tasks or a sequence of real-time tasks under deadline constraints. We derive a fast and accurate analytical model that computes the appropriate voltage/frequency setting for each task, and we assign tasks to the cluster with heuristic scheduling algorithms. In particular, our model accounts for the nonlinear relationship between task execution time and processor speed in GPU-accelerated applications, which captures real-world GPU energy consumption more accurately. In a performance evaluation driven by real-world power measurement traces, our scheduling algorithm achieves energy savings close to the theoretical upper bound: with a GPU scaling interval in which at most 36% of energy can be saved analytically, we record savings of 33-35%. Our results are applicable to energy management on modern heterogeneous clusters.
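
As a rough illustration of the per-task step described above (choosing a voltage/frequency setting that minimizes energy while meeting a deadline), the following Python sketch enumerates a small set of candidate settings. It is not the model derived in this work: the functional forms (execution time as a frequency-insensitive part plus a part scaling as 1/f, and power as a static term plus a term proportional to V^2 f) and all names and parameters (`Setting`, `exec_time`, `power`, `best_setting`, `work`, `fixed`, `static`, `c`) are assumptions chosen only to show why a nonlinear time model changes the energy-optimal choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    """One candidate DVFS operating point (hypothetical units)."""
    voltage: float    # volts
    frequency: float  # GHz


def exec_time(work, fixed, f):
    """Assumed nonlinear time model: only part of the runtime scales as 1/f."""
    return fixed + work / f


def power(static, c, s):
    """Assumed power model: static power plus dynamic power ~ V^2 * f."""
    return static + c * s.voltage ** 2 * s.frequency


def best_setting(settings, work, fixed, static, c, deadline):
    """Return (energy, setting) with the lowest energy among the settings
    that finish within the deadline, or None if no setting is feasible."""
    feasible = []
    for s in settings:
        t = exec_time(work, fixed, s.frequency)
        if t <= deadline:
            feasible.append((power(static, c, s) * t, s))
    return min(feasible, key=lambda pair: pair[0]) if feasible else None


if __name__ == "__main__":
    candidates = [Setting(0.8, 0.5), Setting(0.9, 0.75), Setting(1.0, 1.0)]
    for deadline in (30.0, 16.0, 10.0):
        result = best_setting(candidates, work=10.0, fixed=2.0,
                              static=20.0, c=100.0, deadline=deadline)
        print(deadline, result)
```

With these made-up parameters, a loose deadline lets the slowest setting win, a tighter deadline forces a faster and more energy-hungry setting, and a deadline shorter than the fastest runtime leaves no feasible setting. A scheduler in the spirit of the abstract would run such a per-task selection before assigning tasks to processors.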
