GPGPU Power Modeling for Multi-domain Voltage-Frequency Scaling

Dynamic Voltage and Frequency Scaling (DVFS) on Graphics Processing Units (GPUs) components is one of the most promising power management strategies, due to its potential for significant power and energy savings. However, there is still a lack of simple and reliable models for the estimation of the GPU power consumption under a set of different voltage and frequency levels. Accordingly, a novel GPU power estimation model with both core and memory frequency scaling is herein proposed. This model combines information from both the GPU architecture and the executing GPU application and also takes into account the non-linear changes in the GPU voltage when the core and memory frequencies are scaled. The model parameters are estimated using a collection of 83 microbenchmarks carefully crafted to stress the main GPU components. Based on the hardware performance events gathered during the execution of GPU applications on a single frequency configuration, the proposed model allows to predict the power consumption of the application over a wide range of frequency configurations, as well as to decompose the contribution of different parts of the GPU pipeline to the overall power consumption. Validated on 3 GPU devices from the most recent NVIDIA microarchitectures (Pascal, Maxwell and Kepler), by using a collection of 26 standard benchmarks, the proposed model is able to achieve accurate results (7%, 6% and 12% mean absolute error) for the target GPUs (Titan Xp, GTX Titan X and Tesla K40c).

[1]  Sunita Chandrasekaran,et al.  Statistical modeling of power/energy of scientific kernels on a multi-GPU system , 2013, 2013 International Green Computing Conference Proceedings.

[2]  Alexandra Fedorova,et al.  Addressing shared resource contention in multicore processors via scheduling , 2010, ASPLOS XV.

[3]  B. M. Gordon,et al.  Supply and threshold voltage scaling for low power CMOS , 1997, IEEE J. Solid State Circuits.

[4]  Andreas Moshovos,et al.  Demystifying GPU microarchitecture through microbenchmarking , 2010, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS).

[5]  Derek Chiou,et al.  GPGPU performance and power estimation using machine learning , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[6]  Dean M. Tullsen,et al.  The CRISP performance model for dynamic voltage and frequency scaling in a GPGPU , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[7]  Qiang Wang,et al.  HKBU Institutional Repository , 2018 .

[8]  Sudhakar Yalamanchili,et al.  Power Modeling for GPU Architectures Using McPAT , 2014, TODE.

[9]  Xinxin Mei,et al.  A measurement study of GPU DVFS on energy conservation , 2013, HotPower '13.

[10]  Xinxin Mei,et al.  Dissecting GPU Memory Hierarchy Through Microbenchmarking , 2015, IEEE Transactions on Parallel and Distributed Systems.

[11]  Hiroshi Sasaki,et al.  Power and Performance Characterization and Modeling of GPU-Accelerated Systems , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.

[12]  Wu-chun Feng,et al.  Online Power Estimation of Graphics Processing Units , 2016, 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid).

[13]  Rong Ge,et al.  Effects of Dynamic Voltage and Frequency Scaling on a K20 GPU , 2013, 2013 42nd International Conference on Parallel Processing.

[14]  Avi Mendelson,et al.  Fine-Grain Power Breakdown of Modern Out-of-Order Cores and Its Implications on Skylake-Based Systems , 2016, ACM Trans. Archit. Code Optim..

[15]  Kevin Skadron,et al.  A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads , 2010, IEEE International Symposium on Workload Characterization (IISWC'10).

[16]  Wei Chen,et al.  GreenGPU: A Holistic Approach to Energy Efficiency in GPU-CPU Heterogeneous Architectures , 2012, 2012 41st International Conference on Parallel Processing.

[17]  Kevin Skadron,et al.  Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).

[18]  Satoshi Matsuoka,et al.  Statistical power modeling of GPU kernels using performance counters , 2010, International Conference on Green Computing.

[19]  Frederico Pratas,et al.  Exploring GPU performance, power and energy-efficiency bounds with Cache-aware Roofline Modeling , 2017, 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[20]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[21]  Diana Marculescu,et al.  Analysis of dynamic voltage/frequency scaling in chip-multiprocessors , 2007, Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07).

[22]  Shuaiwen Song,et al.  A Simplified and Accurate Model of Power-Performance Efficiency on Emergent GPU Architectures , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.

[23]  Nam Sung Kim,et al.  GPUWattch: enabling energy optimizations in GPGPUs , 2013, ISCA.

[24]  Bin Li,et al.  Statistical GPU power analysis using tree-based methods , 2011, 2011 International Green Computing Conference and Workshops.

[25]  Xiaohan Ma,et al.  Statistical Power Consumption Analysis and Modeling for GPU-based Computing , 2011 .

[26]  Henry Wong,et al.  Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.

[27]  Bin Li,et al.  Performance and Power Analysis of ATI GPU: A Statistical Approach , 2011, 2011 IEEE Sixth International Conference on Networking, Architecture, and Storage.

[28]  Nuno Roma,et al.  Multi-kernel Auto-Tuning on GPUs: Performance and Energy-Aware Optimization , 2015, 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.

[29]  Margaret Martonosi,et al.  Runtime power monitoring in high-end processors: methodology and empirical data , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..

[30]  Thomas Ilsche,et al.  An Energy Efficiency Feature Survey of the Intel Haswell Processor , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.

[31]  Hyesoon Kim,et al.  An integrated GPU power and performance model , 2010, ISCA.

[32]  Nuno Roma,et al.  Performance and Power-Aware Classification for Frequency Scaling of GPGPU Applications , 2016, Euro-Par Workshops.

[33]  Wen-mei W. Hwu,et al.  Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing , 2012 .

[34]  Indrani Paul,et al.  Dynamic GPGPU Power Management Using Adaptive Model Predictive Control , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[35]  G. Sohi,et al.  A static power model for architects , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.

[36]  Neena Imam,et al.  Understanding GPU Power , 2016, ACM Comput. Surv..