ADAPT: Automated De-coupled Adaptive Program Transformation

Dynamic program optimization offers performance improvements far beyond those possible with traditional compile-time optimization. These gains are due to the ability to exploit both architectural and input data set characteristics that are unknown prior to execution time. In this paper, we propose a novel framework for dynamic program optimization, ADAPT (Automated De-coupled Adaptive Program Transformation), that builds on the strengths of existing approaches. The key to our framework is the de-coupling of the dynamic compilation of new code variants from the dynamic selection of these variants at their points of use. This allows code generation to occur concurrently with program execution, removing dynamic compilation overheads from the critical path. We present a compilation system, based on the Polaris optimizing compiler, that automatically applies this framework to general "plugged-in" optimization techniques. We evaluate our system on three programs from the SPEC floating point benchmark suite by dynamically applying loop distribution, loop unrolling, loop tiling and automatic parallelization. We show that our techniques can improve performance by as much as 70% over statically optimized code.
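The de-coupling described above can be illustrated with a minimal sketch. This is not the ADAPT implementation: the names `VariantTable`, `compile_unrolled_sum`, and `demo` are hypothetical, a Python thread stands in for the concurrent dynamic compiler, and the selection policy (use the most recently published variant) is an assumption made only for illustration. The point it shows is that the running program never waits on code generation; it picks whichever variant is available at the point of use.

```python
import threading


class VariantTable:
    """Hypothetical registry of code variants for one point of use."""

    def __init__(self, static_version):
        self._lock = threading.Lock()
        self._variants = {"static": static_version}
        self._current = "static"

    def publish(self, name, fn):
        # Called by the compiler thread when a new variant is ready.
        with self._lock:
            self._variants[name] = fn
            self._current = name  # assumed policy: newest variant wins

    def select(self):
        # Called by the running program at the point of use; never blocks
        # on compilation, only on the short table lock.
        with self._lock:
            return self._current, self._variants[self._current]


def sum_static(xs):
    """Statically compiled baseline version."""
    total = 0.0
    for x in xs:
        total += x
    return total


def compile_unrolled_sum(factor):
    """Stand-in for dynamic compilation: generate a loop unrolled by `factor`."""
    terms = " + ".join(f"xs[i + {k}]" for k in range(factor))
    src = (
        "def unrolled(xs):\n"
        "    total = 0.0\n"
        f"    n = len(xs) - len(xs) % {factor}\n"
        f"    for i in range(0, n, {factor}):\n"
        f"        total += {terms}\n"
        "    for i in range(n, len(xs)):\n"  # remainder iterations
        "        total += xs[i]\n"
        "    return total\n"
    )
    namespace = {}
    exec(src, namespace)
    return namespace["unrolled"]


def demo():
    data = [float(i) for i in range(10)]
    table = VariantTable(sum_static)

    # Code generation runs concurrently, off the program's critical path.
    compiler = threading.Thread(
        target=lambda: table.publish("unroll4", compile_unrolled_sum(4))
    )
    compiler.start()

    # The program keeps executing with whatever variant exists right now.
    first_name, first_fn = table.select()
    first_result = first_fn(data)

    compiler.join()
    final_name, final_fn = table.select()
    return first_name, first_result, final_name, final_fn(data)


if __name__ == "__main__":
    print(demo())
```

Which variant the first `select()` call returns depends on thread timing, which is the intended behavior: the program proceeds with the static version until a new variant has been published.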
