Phase characterization for power: evaluating control-flow-based and event-counter-based techniques

Computer systems increasingly rely on dynamic, phase-based system management techniques, in which hardware and software parameters may be altered or tuned at runtime for different program phases. Prior research has considered a range of phase analysis techniques, but has focused almost exclusively on performance-oriented phases; the notion of power-oriented phases has not been explored. Moreover, the bulk of phase-analysis studies have relied on simulation-based evaluation. There is a need for real-system experiments that directly compare different practical techniques (such as control flow sampling, event counters, and power measurements) for gauging phase behavior. In this paper, we propose and evaluate a live, real-system measurement framework for collecting and analyzing power phases in running applications. Our experimental framework simultaneously collects control flow, performance counter, and live power measurement information. Using this framework, we directly compare code-oriented techniques (such as "basic block vectors") with performance counter techniques for characterizing power phases. Across a collection of SPEC2000 benchmarks and mainstream desktop applications, our results indicate that both techniques are promising, but that performance counters consistently provide a better representation of power behavior. In many of the cases studied, basic block vectors demonstrate a strong relationship between the execution path and power consumption. However, there are instances where power behavior cannot be captured from control flow, for example due to differences in memory hierarchy performance. We demonstrate these with examples from real applications. Overall, counter-based techniques offer average classification errors of 1.9% for SPEC and 7.1% for other benchmarks, while basic block vectors achieve average errors of 2.9% for SPEC and 11.7% for other benchmarks.
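To make the notion of a power "classification error" concrete, the following is a minimal sketch and not the framework described above. It assumes each sampling interval is summarized by a feature vector (a normalized basic block vector or a vector of counter rates), groups intervals into phases with a simple first-fit threshold on Manhattan distance, and scores the grouping by how far each interval's measured power lies from its phase's mean power. The function names, the threshold value, and the sample numbers are all hypothetical.

```python
from typing import Dict, List, Sequence


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify_phases(vectors: List[Sequence[float]], threshold: float) -> List[int]:
    """Assign each interval to the first phase whose representative vector
    is within `threshold`; otherwise open a new phase (first-fit clustering).
    This is an illustrative choice, not necessarily the one used in the paper."""
    leaders: List[Sequence[float]] = []
    labels: List[int] = []
    for v in vectors:
        for phase_id, leader in enumerate(leaders):
            if manhattan(v, leader) <= threshold:
                labels.append(phase_id)
                break
        else:
            leaders.append(v)
            labels.append(len(leaders) - 1)
    return labels


def power_error(labels: List[int], power: List[float]) -> float:
    """Average relative error when each interval's power is approximated
    by the mean power of its phase. Assumes all power samples are > 0."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for label, p in zip(labels, power):
        sums[label] = sums.get(label, 0.0) + p
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    errors = [abs(p - means[label]) / p for label, p in zip(labels, power)]
    return sum(errors) / len(errors)


# Hypothetical data: four intervals, measured power in watts.
power = [40.0, 42.0, 60.0, 58.0]

# Case 1: control flow differs between the two power levels.
bbv = [[0.9, 0.1], [0.88, 0.12], [0.1, 0.9], [0.12, 0.88]]
bbv_labels = classify_phases(bbv, threshold=0.2)

# Case 2: identical control flow, but memory behavior (and power) differs.
bbv_same = [[0.5, 0.5]] * 4
counters = [[0.9, 0.01], [0.88, 0.02], [0.3, 0.40], [0.32, 0.38]]  # e.g. IPC, miss rate
same_labels = classify_phases(bbv_same, threshold=0.2)
counter_labels = classify_phases(counters, threshold=0.2)

print(power_error(bbv_labels, power))
print(power_error(same_labels, power), power_error(counter_labels, power))
```

In the second case the control-flow vectors place all four intervals in one phase, so the phase mean misses both power levels, while the counter vectors separate them. This mirrors, in toy form, the situation where power differences stem from memory hierarchy behavior rather than from the execution path.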
