Accuracy of profile maintenance in optimizing compilers

Modern processors rely heavily on optimizing compilers to deliver their performance potential. The compilers, in turn, rely on profile information to focus their optimization efforts and to better match the generated code to the target machine. Because many optimizations benefit from profile information and are typically performed one after another, the compiler must keep the profile up to date as it transforms the code. Maintaining a profile is, however, tedious and error-prone. An erroneous profile is hard to detect because it affects only the performance, not the correctness, of a program. Profile maintenance also inherently loses accuracy, as profile update operations often have to rely on probabilistic approximation. In this paper, we measure the accuracy of maintaining control flow graph (CFG) profiles in a high-performance optimizing compiler. Our data indicate that the compiler maintains the profile more accurately within individual functions than globally across functions, and that function inlining may be responsible for the global loss of profile accuracy. We also identify a number of research issues related to profile maintenance.
