Systematic high-level address code transformations for piece-wise linear indexing: illustration on a medical imaging algorithm

Exploring data transfer and storage issues is crucial to efficiently map data-intensive applications (e.g., multimedia) onto programmable processors. Code transformations are used to minimise the main memory bus load, and hence also to reduce power consumption and improve system performance. However, this typically incurs a considerable arithmetic overhead in the addressing and local control. For instance, memory-optimising in-place and data-layout transformations add costly modulo and integer division operations to the initial addressing code. In this paper, we show how this cycle overhead can be almost completely removed. This is done according to a systematic methodology that combines an algebraic transformation exploration approach for the (non)linear arithmetic with an efficient transformation technique for reducing the piece-wise linear indexing to linear pointer arithmetic. The approach is illustrated on a real-life medical imaging application, using a variety of programmable processor architectures. Total gains in cycle count ranging between a factor of 5 and 25 are obtained compared to conventional compilers.
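
To make the idea of reducing piece-wise linear indexing to linear pointer arithmetic concrete, the following is a minimal sketch in C and not the methodology of the paper itself. It assumes a simple circular buffer walked with unit stride; the function names `sum_modulo` and `sum_pointer` are hypothetical and chosen only for illustration. The first version evaluates a modulo for every access, while the second replaces it with a pointer increment and a wrap-around test.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: the index expression contains a modulo, evaluated per access. */
long sum_modulo(const int *buf, size_t size, size_t start, size_t count)
{
    assert(size > 0);               /* modulo by zero is undefined */
    long acc = 0;
    for (size_t i = 0; i < count; i++)
        acc += buf[(start + i) % size];
    return acc;
}

/* Transformed: the same access sequence, generated by incrementing a
 * pointer and resetting it when it reaches the end of the buffer.
 * Only one modulo remains, outside the loop. */
long sum_pointer(const int *buf, size_t size, size_t start, size_t count)
{
    assert(size > 0);
    long acc = 0;
    const int *end = buf + size;        /* one past the last element */
    const int *p = buf + (start % size);
    for (size_t i = 0; i < count; i++) {
        acc += *p++;
        if (p == end)                   /* wrap-around replaces the modulo */
            p = buf;
    }
    return acc;
}
```

Both functions visit the same elements in the same order, so they return the same sum. The second trades a modulo per iteration for a comparison and an occasional pointer reset, which is typically cheaper on processors without a hardware divider.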
