Energy, Power, and Performance Characterization of GPGPU Benchmark Programs

This paper studies how the energy consumption, power draw, and runtime of a modern compute GPU change when varying the core and memory clock frequencies, enabling or disabling ECC, using alternate implementations, and varying the program inputs. We evaluate 34 applications from 5 benchmark suites and measure their power draw over time on a K20c GPU. Our results show that changing the frequency or the program implementation can alter the energy, power, and performance by a factor of two or more. Interestingly, some changes affect these three metrics very unevenly. ECC can greatly increase the runtime and energy consumption, but only on memory-bound codes. Compute-bound codes tend to behave quite differently from memory-bound codes, particularly in their power draw. On irregular programs, a small change in frequency can result in a large change in runtime and energy consumption.
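Energy, power, and runtime are linked: energy is the integral of power over the runtime, which is why one change can affect the three metrics unevenly. The following is a minimal sketch of that relationship, assuming the power draw is available as timestamped samples. The function names and the sample values are hypothetical and are not taken from the paper's measurement setup.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    energy = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Area of the trapezoid between two consecutive samples.
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


def average_power_watts(timestamps_s, power_w):
    """Average power is the total energy divided by the elapsed runtime."""
    runtime = timestamps_s[-1] - timestamps_s[0]
    if runtime <= 0:
        raise ValueError("need at least two samples spanning a positive runtime")
    return energy_joules(timestamps_s, power_w) / runtime


# Hypothetical traces: the second run draws less power but takes longer.
fast_t, fast_p = [0.0, 1.0, 2.0], [100.0, 100.0, 100.0]
slow_t, slow_p = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [60.0] * 6

fast_energy = energy_joules(fast_t, fast_p)  # 200.0 J over 2 s
slow_energy = energy_joules(slow_t, slow_p)  # 300.0 J over 5 s
```

In this made-up case, the slower run lowers the power draw by 40% yet uses 50% more energy, because the runtime grows faster than the power falls.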
