Automated Fingerprinting of Performance Pathologies Using Performance Monitoring Units (PMUs)

Modern architectures expose many hardware performance events that reveal architectural bottlenecks throughout the core and memory hierarchy. These events can give programmers unique and powerful insight into the causes of performance problems in their programs, but interpreting them has been a significant challenge. We describe a technique that uses data mining to automatically fingerprint a program’s performance problems, letting programmers benefit from the architectural insight the events offer while shielding them from the onerous task of interpreting raw events. We train a decision tree algorithm on a set of micro-benchmarks to construct a model of performance problems. The extracted model divides a profiled application into program phases and labels each phase with its pattern of hardware bottlenecks. The result is a detailed map of what to optimize in the code.
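The workflow described above (train a decision tree on labeled micro-benchmark measurements, then label intervals of a profiled application and group them into phases) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three event names, the numeric rates, the bottleneck labels, the Gini-based split criterion, and the rule that adjacent intervals with the same label form one phase. Each interval also receives a single label, which is simpler than the bottleneck patterns the abstract refers to.

```python
from collections import Counter
from itertools import groupby

# Hypothetical training set: event rates per 1000 instructions measured on
# micro-benchmarks, each written to trigger one known bottleneck.
# Assumed feature order: (l2_misses, branch_mispredicts, dtlb_misses)
TRAIN = [
    ((40.0, 1.0, 0.5), "cache_miss"),
    ((35.0, 2.0, 0.8), "cache_miss"),
    ((2.0, 30.0, 0.4), "branch_mispredict"),
    ((3.0, 25.0, 0.6), "branch_mispredict"),
    ((1.5, 1.0, 12.0), "tlb_miss"),
    ((2.5, 2.0, 15.0), "tlb_miss"),
    ((1.0, 1.0, 0.3), "none"),
    ((2.0, 1.5, 0.5), "none"),
]

# Hypothetical profile of an application: one counter vector per interval.
PROFILE = [
    (1.2, 1.1, 0.4), (1.1, 0.9, 0.3),
    (38.0, 1.5, 0.6), (42.0, 1.2, 0.7), (36.0, 2.0, 0.5),
    (2.2, 28.0, 0.5), (1.8, 27.0, 0.4),
    (2.0, 1.2, 14.0),
]

def gini(rows):
    """Gini impurity of the labels in rows."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())

def best_split(rows):
    """Return (score, feature, threshold, left, right) with the lowest
    weighted impurity, or None if no feature has two distinct values."""
    best = None
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    return best

def build_tree(rows):
    """Leaves are label strings; inner nodes are (feature, threshold, left, right)."""
    labels = Counter(label for _, label in rows)
    if len(labels) == 1:
        return next(iter(labels))
    split = best_split(rows)
    if split is None:  # identical feature vectors with mixed labels
        return labels.most_common(1)[0][0]
    _, f, t, left, right = split
    return (f, t, build_tree(left), build_tree(right))

def classify(tree, x):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def label_phases(tree, samples):
    """Merge consecutive intervals with the same predicted label into
    (first_interval, last_interval, label) phases."""
    phases, start = [], 0
    for label, run in groupby(classify(tree, s) for s in samples):
        length = len(list(run))
        phases.append((start, start + length - 1, label))
        start += length
    return phases

if __name__ == "__main__":
    model = build_tree(TRAIN)
    for first, last, label in label_phases(model, PROFILE):
        print(f"intervals {first}-{last}: {label}")
```

In practice the feature vectors would come from a PMU sampling tool rather than hard-coded tuples, and a library decision tree learner could replace the hand-written one; the sketch is only meant to show how a model trained on micro-benchmarks can turn raw event rates into a labeled phase map.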
