Temporal vertical profiling

Modern systems are enormously complex; many applications today comprise millions of lines of code, make extensive use of software frameworks, and run on complex, multi‐tiered, run‐time systems. Understanding the performance of these applications is challenging because it depends on the interactions among the many software and hardware components. This paper describes and evaluates temporal vertical profiling, an interactive and iterative methodology for understanding the performance of applications. Two key insights underlie temporal vertical profiling. First, we need to collect and reason across information from multiple layers of the system before we can understand an application's performance. Second, application performance changes over time, and thus we must consider the time‐varying behavior of the application instead of aggregate statistics. We developed temporal vertical profiling from our own experience of analyzing performance anomalies and have found it very helpful for methodically exploring the space of hardware and software components. By representing an application's behavior as a set of metrics, each of which is a time series, temporal vertical profiling provides a way to reason about performance across system layers, regardless of their level of abstraction and independent of their semantics. It thus offers a systematic way to explore a large space of metrics, which can number in the hundreds even for small benchmarks. Copyright © 2010 John Wiley & Sons, Ltd.
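The idea of treating every metric as a time series, whatever layer it comes from, can be illustrated with a minimal sketch. This is not the paper's tooling or algorithm. It assumes that all metrics are sampled at the same intervals and uses plain Pearson correlation as a stand-in for whatever comparison an analyst would choose. The metric names, the sample values, and the helpers `pearson` and `rank_metrics` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long, aligned time series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_metrics(target, metrics):
    """Order metrics by the strength of their correlation with the target."""
    scores = {name: pearson(target, series) for name, series in metrics.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical samples, one value per time interval.
# The target is a hardware-level metric (instructions per cycle).
ipc = [1.0, 1.1, 0.4, 0.5, 1.0, 1.1]

# Candidate metrics from different layers, all in the same representation.
metrics = {
    "l1d_misses": [10, 12, 90, 85, 11, 10],    # hardware layer
    "gc_active": [0, 0, 1, 1, 0, 0],           # virtual machine layer
    "compiled_methods": [5, 3, 4, 5, 3, 4],    # virtual machine layer
}

ranked = rank_metrics(ipc, metrics)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

Because every metric has the same shape, a cache-miss count and a garbage-collection flag can be compared with the same function, which is what makes cross-layer exploration tractable. In this made-up data the two metrics that spike while IPC drops rank above the unrelated one; correlation only suggests where to look next and does not establish cause.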
