Hierarchical memory size estimation for loop fusion and loop shifting in data-dominated applications

Loop fusion and loop shifting are important transformations for improving data locality and thereby reducing the number of costly accesses to off-chip memories. Since exploring the exact platform mapping for all loop transformation alternatives is a time-consuming process, heuristics steered by improved data locality are generally used. However, pure locality estimates do not sufficiently take into account the hierarchy of the memory platform. This paper presents a fast, incremental technique for estimating hierarchical memory size requirements for loop fusion and loop shifting at the early loop transformation design stage. As the exact memory platform is often not yet defined at this stage, we propose a platform-independent approach that reports the Pareto-optimal trade-off points between scratch-pad memory size and off-chip memory accesses. Experiments on realistic test vehicles confirm that the estimation comes very close to the actual platform mapping. The estimation helps the designer or a tool to identify the interesting loop transformations that should then be investigated in more depth.
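To illustrate what reporting Pareto-optimal trade-off points means here, the following is a minimal sketch and not the paper's estimation technique. It assumes each candidate is already summarized as a pair (scratch-pad size, off-chip access count), with both values to be minimized. The function name `pareto_front` and the sample numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (scratch-pad size in bytes, off-chip access count).
Point = Tuple[int, int]


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point beats on both size and accesses."""
    front: List[Point] = []
    best_accesses = None
    # After sorting by size (then accesses), a point is on the front only if
    # it needs strictly fewer accesses than every smaller-or-equal size seen.
    for size, accesses in sorted(set(points)):
        if best_accesses is None or accesses < best_accesses:
            front.append((size, accesses))
            best_accesses = accesses
    return front


# Made-up candidates, not measurements from the paper.
candidates = [(0, 1000), (64, 600), (128, 600), (256, 200), (192, 700), (512, 200)]
print(pareto_front(candidates))
```

Each surviving point is one trade-off a designer might consider: no other candidate offers both a smaller scratch-pad and fewer off-chip accesses.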
