Chainsaw: Using Binary Matching for Relative Instruction Mix Comparison

As hardware advances, instruction set architectures evolve continually, and compilers are under constant pressure to adapt and take full advantage of newly available features. However, current techniques for evaluating relative compiler performance compare profiles only at the application level, ignoring performance differences at finer granularities. Ensuring that new features are put to good use requires a more rigorous approach, and a fundamental step in tuning compiler performance is identifying the specific examples that can be improved. To solve this problem, we present a compiler-independent binary matching technique that compares executions of differently compiled programs and identifies intervals where their behavior can be meaningfully compared. Matched intervals can then be analyzed automatically to identify anomalous segments of execution where one version performs significantly differently from another. We present case studies using Chainsaw to identify significant performance anomalies between differently compiled codes.
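
To make the idea of analyzing matched intervals concrete, the following minimal sketch flags intervals whose instruction mix or dynamic instruction count diverges between two builds. It is an illustration under stated assumptions, not the method used by Chainsaw: the interval names, instruction categories, thresholds, and the use of total variation distance are all hypothetical, and the matching step itself is assumed to have already produced the paired intervals.

```python
# Hypothetical sketch: compare instruction mixes of already-matched intervals.
# Categories, thresholds, and the distance metric are assumptions for
# illustration and are not taken from the paper.

def mix_distance(mix_a, mix_b):
    """Total variation distance between two normalized instruction mixes."""
    total_a = sum(mix_a.values())
    total_b = sum(mix_b.values())
    if total_a == 0 or total_b == 0:
        # Two empty intervals are identical; one empty interval is maximally different.
        return 0.0 if total_a == total_b else 1.0
    categories = set(mix_a) | set(mix_b)
    return 0.5 * sum(
        abs(mix_a.get(c, 0) / total_a - mix_b.get(c, 0) / total_b)
        for c in categories
    )

def find_anomalies(matched_intervals, ratio_threshold=1.5, mix_threshold=0.25):
    """Return (name, count_ratio, mix_distance) for intervals that diverge."""
    anomalies = []
    for name, mix_a, mix_b in matched_intervals:
        total_a = sum(mix_a.values())
        total_b = sum(mix_b.values())
        # Ratio of dynamic instruction counts, larger over smaller.
        ratio = max(total_a, total_b) / max(1, min(total_a, total_b))
        dist = mix_distance(mix_a, mix_b)
        if ratio >= ratio_threshold or dist >= mix_threshold:
            anomalies.append((name, ratio, dist))
    return anomalies

# Made-up per-interval instruction counts for two builds of the same program.
intervals = [
    ("loop_1", {"load": 400, "store": 100, "alu": 500},
               {"load": 410, "store": 90, "alu": 500}),
    ("loop_2", {"load": 300, "alu": 700},
               {"load": 300, "alu": 300, "simd": 100}),
]

for name, ratio, dist in find_anomalies(intervals):
    print(f"{name}: count ratio {ratio:.2f}, mix distance {dist:.2f}")
```

In this made-up data, only the second interval is reported, because one build replaces a large share of scalar ALU work with SIMD instructions. A real analysis would draw the counts from instrumented executions and would likely need thresholds tuned per workload.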
