Optimally profiling and tracing programs

This paper describes algorithms for inserting monitoring code to profile and trace programs. These algorithms greatly reduce the cost of measuring programs compared with the commonly used technique of placing code in each basic block. Program profiling counts the number of times each basic block in a program executes. Instruction tracing records the sequence of basic blocks traversed during a program execution. The algorithms optimize the placement of counting and tracing code with respect to the expected or measured execution frequency of each block or edge in a program's control-flow graph. We have implemented the algorithms in a profiling and tracing tool, where they substantially reduce the overhead of both profiling and tracing. We also define and study the hierarchy of profiling problems. These problems have two dimensions: what is profiled (vertices, i.e., basic blocks, or edges in a control-flow graph) and where the instrumentation code is placed (in blocks or along edges). We compare the optimal solutions to the profiling problems and describe a new profiling problem: basic-block profiling with edge counters. This problem is important because, for a given control-flow graph, an optimal solution to any other profiling problem is never better than an optimal solution to this one. Unfortunately, finding an optimal placement of edge counters for vertex profiling appears to be a hard problem in general. However, our work shows that edge profiling with edge counters works well in practice: it is simple and efficient, and it finds optimal counter placements in most cases. Furthermore, it yields more information than a vertex profile. Tracing also benefits from placing instrumentation code along edges rather than on vertices.
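The idea of edge profiling with edge counters can be illustrated with a small sketch. It follows the classic spanning-tree placement: counters go only on edges outside a maximum-weight spanning tree of the control-flow graph, and the remaining edge counts are recovered from flow conservation (at each vertex, incoming count equals outgoing count). This is a minimal illustration under stated assumptions, not necessarily the exact algorithm of the paper. The graph, the weights, the vertex names, and the helper names `place_counters` and `recover_counts` are all hypothetical, as is the virtual `EXIT -> P` edge, which is given infinite weight so that it never receives a counter.

```python
def place_counters(edges, weight):
    """Return the edges that need counters: those left outside a
    maximum-weight spanning tree (Kruskal's algorithm, edges undirected)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = set()
    for e in sorted(edges, key=lambda e: weight[e], reverse=True):
        root_u, root_v = find(e[0]), find(e[1])
        if root_u != root_v:
            parent[root_u] = root_v
            tree.add(e)
    return [e for e in edges if e not in tree]


def recover_counts(edges, measured):
    """Fill in counts of uninstrumented edges using flow conservation."""
    counts = dict(measured)
    unknown = [e for e in edges if e not in counts]
    while unknown:
        progress = False
        for v in {x for e in unknown for x in e}:
            missing = [e for e in unknown if v in e]
            if len(missing) != 1:
                continue  # v is not (yet) a leaf of the unknown forest
            e = missing[0]
            inflow = sum(c for (s, d), c in counts.items() if d == v)
            outflow = sum(c for (s, d), c in counts.items() if s == v)
            # e is the only unknown at v, so it must balance the flow.
            counts[e] = outflow - inflow if e[1] == v else inflow - outflow
            unknown.remove(e)
            progress = True
        if not progress:
            raise ValueError("unknown edges do not form a forest")
    return counts


# Hypothetical CFG: a loop whose body branches to Q or R and rejoins at S.
edges = [("P", "Q"), ("P", "R"), ("Q", "S"), ("R", "S"),
         ("S", "P"), ("S", "EXIT"), ("EXIT", "P")]
# Assumed (estimated) edge frequencies; the virtual edge is forced into the tree.
weight = {("P", "Q"): 60, ("P", "R"): 40, ("Q", "S"): 60, ("R", "S"): 40,
          ("S", "P"): 99, ("S", "EXIT"): 1, ("EXIT", "P"): float("inf")}

counter_edges = place_counters(edges, weight)

# Counter values from one imagined run: 10 loop iterations, 7 via Q, 3 via R.
measured = {("Q", "S"): 7, ("R", "S"): 3, ("S", "EXIT"): 1}
all_counts = recover_counts(edges, measured)

# A block's count is the sum of the counts on its incoming edges.
block_p = sum(c for (s, d), c in all_counts.items() if d == "P")
```

In this sketch only three of the seven edges carry counters, and the frequently executed back edge `S -> P` is not instrumented at all, whereas placing a counter in every basic block would instrument every iteration of the loop. The edge counts also determine every block count, which is one way to see why an edge profile yields more information than a vertex profile.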
