Memory Profiling using Hardware Counters

Although memory performance is often a limiting factor in application performance, most tools only show performance data relating to the instructions in the program, not to its data. In this paper, we describe a technique for directly measuring the memory profile of an application. We describe the tools and their user model, and then discuss a particular code, the MCFbenchmark from SPEC CPU 2000. We show performance data for the data structures and elements, and discuss the use of the data to improve program performance. Finally, we discuss extensions to the work to provide feedback to the compiler for prefetching and to generate additional reports from the data.

[1]  Robert Hundt,et al.  HP Caliper: a framework for performance analysis tools , 2000, IEEE Concurr..

[2]  S. Turner,et al.  Performance Analysis Using the MIPS R10000 Performance Counters , 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.

[3]  Brinkley Sprunt,et al.  Pentium 4 Performance-Monitoring Features , 2002, IEEE Micro.

[4]  A. Löbel Optimale Vehicle Scheduling in Public Transit , 1997 .

[5]  James R. Larus,et al.  Cache-conscious structure definition , 1999, PLDI '99.

[6]  Margaret Martonosi,et al.  MemSpy: analyzing memory system bottlenecks in programs , 1992, SIGMETRICS '92/PERFORMANCE '92.

[7]  Jeffrey K. Hollingsworth,et al.  Using Hardware Performance Monitors to Isolate Memory Bottlenecks , 2000, ACM/IEEE SC 2000 Conference (SC'00).

[8]  Jack J. Dongarra,et al.  A Scalable Cross-Platform Infrastructure for Application Performance Tuning Using Hardware Counters , 2000, ACM/IEEE SC 2000 Conference (SC'00).

[9]  Rastislav Bodík,et al.  An efficient profile-analysis framework for data-layout optimizations , 2002, POPL '02.

[10]  James R. Larus,et al.  Cache-conscious structure layout , 1999, PLDI '99.

[11]  Jeffrey Dean,et al.  Transparent, low-overhead profiling on modern processors , 1998 .

[12]  Jeffrey K. Hollingsworth,et al.  SIGMA: A Simulator Infrastructure to Guide Memory Analysis , 2002, ACM/IEEE SC 2002 Conference (SC'02).

[13]  Susan L. Graham,et al.  An execution profiler for modular programs , 1983, Softw. Pract. Exp..

[14]  Rudolf Berrendorf,et al.  PCL - The Performance Counter Library: A Common Interface to Access Hardware Performance Counters on Microprocessors , 1998 .

[15]  Martin Hirzel,et al.  Dynamic hot data stream prefetching for general-purpose programs , 2002, PLDI '02.