Improving the Productivity of Compiler Code Quality Analysis

Producing quality code is one of the most important goals of an optimizing compiler, and analyzing code quality is therefore an essential activity in compiler engineering. Because it motivates new optimizations and diagnoses regressions, it sits at a bottleneck in the development process. Yet it remains a highly empirical task, dependent on specific architectures and tools, which makes it difficult and time-consuming, with productivity that is unpredictable and usually low. This paper proposes two novel approaches to code quality analysis. The first approach targets the key scenario in compiler construction: computation-intensive benchmarks. We observed that the computational workload on the processor dominates the execution time of such benchmarks, so we use the compiler itself to parse the workload. In doing so, the compiler applies its static analysis power to identify its own code quality issues and missed opportunities. We have implemented a software system for this approach and integrated it with our daily testing infrastructure; it automates code quality analysis for the SPEC benchmarks and provides developers with relevant information in reasonable time. The second approach addresses code quality regressions in operating system benchmarks. We take advantage of the operating system's built-in instrumentation to collect event traces and construct a tree from each trace, so that analyzing a regression reduces to tree comparison. The tree structure divides and conquers the typically huge volume of data and effectively narrows the focus to a few leaves of the trees. This approach has been implemented and has helped resolve several difficult issues, leading to visible code quality improvements. The paper also summarizes our experience in bringing up the code quality of a production compiler framework, establishes a few simple guidelines for achieving code quality objectives with minimal effort, and empirically compares various analysis approaches.
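
One way to approximate the first approach with an off-the-shelf compiler is to harvest the compiler's own optimization remarks for a benchmark's hot translation units. The sketch below is not the system described in the abstract; it is a minimal Python illustration that assumes Clang is available on PATH and that its missed-optimization remarks (-Rpass-missed) are a useful proxy for code quality issues, and the file name is a placeholder.

```python
# Minimal sketch (not the paper's system): compile a benchmark source file
# with Clang remarks enabled and tally missed optimizations per pass.
import re
import subprocess
from collections import Counter

# Clang prints remarks of the form:
#   file.c:10:5: remark: <message> [-Rpass-missed=<pass-name>]
REMARK = re.compile(r"remark: .* \[-Rpass-missed=([\w.-]+)\]")

def missed_optimizations(source, flags=("-O2",)):
    """Compile `source` and count Clang's missed-optimization remarks."""
    cmd = ["clang", *flags, "-c", "-o", "/dev/null",
           "-Rpass-missed=.*", source]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return Counter(REMARK.findall(proc.stderr))

if __name__ == "__main__":
    # "hot_kernel.c" is a hypothetical benchmark kernel used for illustration.
    for pass_name, count in missed_optimizations("hot_kernel.c").most_common():
        print(f"{pass_name:20s} {count}")
```

For the second approach, the core data manipulation is folding an event trace into a tree and then comparing two trees to localize a regression. The following sketch is likewise only an illustration under an assumed input format: each trace record is a (depth, name, cost) tuple, where depth is the call nesting level reported by the OS instrumentation and cost is the node's inclusive time. The comparison prunes subtrees whose cost did not grow significantly and reports only the offending leaves.

```python
# Minimal sketch of trace-tree construction and divide-and-conquer comparison.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    cost: float = 0.0
    children: dict = field(default_factory=dict)

def build_tree(trace):
    """Fold a depth-annotated event trace into a call tree.

    Repeated events with the same name under one parent are merged, which
    keeps the tree small even for very long traces.
    """
    root = Node("root")
    stack = [root]
    for depth, name, cost in trace:
        del stack[depth + 1:]          # pop back to the parent at this depth
        parent = stack[-1]
        node = parent.children.setdefault(name, Node(name))
        node.cost += cost
        stack.append(node)
    return root

def compare(base, regressed, path="", threshold=0.05, report=None):
    """Descend only where the cost delta is significant and collect the
    leaves that explain the regression."""
    if report is None:
        report = []
    delta = regressed.cost - base.cost
    if base.cost and delta / base.cost < threshold:
        return report                  # subtree did not regress; prune it
    common = set(base.children) & set(regressed.children)
    if not common or not regressed.children:
        report.append((path + base.name, delta))   # leaf-level culprit
        return report
    for name in common:
        compare(base.children[name], regressed.children[name],
                path + base.name + "/", threshold, report)
    return report
```

In use, the trees of a good run and a regressed run of the same workload would be built separately, and compare(build_tree(good_trace), build_tree(bad_trace)) returns (path, delta) pairs pointing at the few leaves worth inspecting, which is the localization effect the abstract describes.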
