Control Flow Optimization by Loop Nest Splitting at the Source Code Level

In recent years, optimization techniques applied at the level of program source code have attracted increasing interest because they are highly effective and inherently retargetable. In this report, a novel source code transformation technique for control flow optimization, called loop nest splitting, is presented. The goal of this optimization is to reduce runtimes and energy consumption by minimizing the number of if-statements executed in the loop nests of typical embedded multimedia applications. Complementary to known optimizations in this area, we explicitly focus on the optimization of loop-variant if-statements. The analysis techniques required for loop nest splitting are illustrated in detail. They are based on precise mathematical models combined with genetic algorithms. The analysis is performed statically at compile time and does not rely on profiling. For a detailed evaluation of the benefits of loop nest splitting, the effects of our optimization on instruction pipeline and cache behavior, runtimes, energy consumption and code sizes are shown. Applying our implemented tools for loop nest splitting to three real-life multimedia benchmarks leads to average reductions of pipeline stalls between 19.7% and 64.8% and an average decrease of instruction cache misses between 8.9% and 45.3%. Measurements on a variety of programmable processors show average speed-ups of the benchmarks between 23.6% and 62.1%, while reductions of energy dissipation between 19.2% and 57.6% are observed.
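
To make the idea concrete, the following C sketch shows what splitting a loop nest around a loop-variant if-statement could look like. It is only an illustration under assumptions: the loop bounds, the condition `x >= 10 || y >= 14`, the loop body, and the names `nest_original`, `nest_split` and `if_count` are hypothetical and not taken from the report. The split point is chosen by hand here, whereas the report derives it automatically from its mathematical models and genetic algorithms.

```c
#define COLS 36  /* hypothetical outer loop bound */
#define ROWS 20  /* hypothetical inner loop bound */

/* Original nest: the loop-variant condition is evaluated in every iteration.
 * if_count records how many if-statements are executed. */
long nest_original(long *if_count)
{
    long sum = 0;
    *if_count = 0;
    for (int x = 0; x < COLS; x++) {
        for (int y = 0; y < ROWS; y++) {
            (*if_count)++;
            if (x >= 10 || y >= 14)
                sum += x + y;   /* then-part */
            else
                sum -= x * y;   /* else-part */
        }
    }
    return sum;
}

/* Split nest: a single test on x in the outer loop decides whether the
 * original condition is already known to hold for the whole inner loop.
 * If so, the inner loop runs without any if-statement. */
long nest_split(long *if_count)
{
    long sum = 0;
    *if_count = 0;
    for (int x = 0; x < COLS; x++) {
        (*if_count)++;
        if (x >= 10) {
            /* Condition is true for every y: only the then-part remains. */
            for (int y = 0; y < ROWS; y++)
                sum += x + y;
        } else {
            /* Condition still depends on y: keep the original test. */
            for (int y = 0; y < ROWS; y++) {
                (*if_count)++;
                if (x >= 10 || y >= 14)
                    sum += x + y;
                else
                    sum -= x * y;
            }
        }
    }
    return sum;
}
```

Both functions compute the same result, but with these assumed bounds the original nest executes 720 if-statements while the split version executes 236 (36 tests in the outer loop plus 200 in the iterations where `x < 10`). The price is duplicated loop code, which is why code size is one of the effects evaluated in the report.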
