Power Modeling for Heterogeneous Processors

As power becomes an ever more important design consideration, accurate power models are needed at all stages of the design process. While power models are available for CPUs and GPUs, only simple models are available for heterogeneous processors. We present a micro-benchmark-based modeling technique that can be used for chip multiprocessors (CMPs) and accelerated processing units (APUs). We use our approach to model power on an Intel Xeon CPU and an AMD Fusion heterogeneous processor. The resulting error rate is below 3% for the Xeon model and 7% for the Fusion model. We also present a method to reduce the number of benchmarks required to create these models. Instead of running micro-benchmarks for every combination of factors (e.g., different operations or memory access patterns), we cluster similar micro-benchmarks to avoid unnecessary simulations. We show that it is possible to eliminate as many as 93% of the compute micro-benchmarks while still producing power models with an error rate below 10%.
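The benchmark-reduction idea can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or features are used, so everything below is an assumption made for illustration: the sketch substitutes a simple greedy "leader" clustering, and the benchmark names, the two-element feature vectors, and the distance threshold are all hypothetical.

```python
import math

def cluster_benchmarks(features, threshold):
    """Greedy leader clustering (an illustrative stand-in, not the paper's method).

    A benchmark joins the first cluster whose representative lies within
    `threshold` (Euclidean distance); otherwise it starts a new cluster
    and becomes that cluster's representative.
    """
    clusters = []  # list of (representative_name, [member_names])
    for name, vec in features.items():
        for rep, members in clusters:
            if math.dist(vec, features[rep]) <= threshold:
                members.append(name)
                break
        else:
            clusters.append((name, [name]))
    return clusters

# Hypothetical feature vectors per micro-benchmark, e.g.
# (normalized power, memory-access intensity). Values are made up.
features = {
    "int_add": (1.00, 0.10),
    "int_sub": (1.02, 0.11),
    "int_mul": (1.60, 0.12),
    "fp_add": (2.10, 0.10),
    "fp_sub": (2.12, 0.09),
    "fp_mul": (2.15, 0.11),
}

clusters = cluster_benchmarks(features, threshold=0.1)

# Only one representative per cluster would need to be run.
representatives = [rep for rep, _ in clusters]
eliminated = 1 - len(representatives) / len(features)

print("representatives:", representatives)
print(f"fraction of micro-benchmarks eliminated: {eliminated:.0%}")
```

In this toy setup, only the cluster representatives would be run when building the power model. The fraction eliminated depends entirely on the chosen threshold and features, so it says nothing about the 93% figure reported above.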
