Reducing Off-Chip Memory Access Costs Using Data Recomputation in Embedded Chip Multi-processors

Scratch-Pad Memory (SPM) management has been studied extensively for single-CPU systems and, more recently, for multiprocessor architectures. This paper presents a novel SPM space utilization strategy for embedded chip multiprocessor systems, based on recomputing the value of an off-chip data element from on-chip (SPM-resident) data elements. The goal is to eliminate the off-chip memory access that would otherwise be performed, saving execution cycles and power. We describe a compiler algorithm that implements this approach and report experimental data collected using six data-intensive applications. Our results indicate that, on a four-processor chip multiprocessor, our approach improves performance by about 11.8% on average over a state-of-the-art SPM management scheme. We also observe that there is a specific range of ratios of total SPM size to total data size for which our approach generates the best results. Finally, the proposed approach brings consistent improvements when the number of CPUs is varied between 2 and 16.
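The recomputation decision described above can be illustrated with a minimal sketch. This is not the paper's compiler algorithm: the cycle costs, the `Recipe` type, and the function names are all hypothetical, chosen only to show the trade-off between one off-chip access and rebuilding the value from SPM-resident operands.

```python
from dataclasses import dataclass

# Hypothetical cycle costs; the abstract gives no latency figures.
OFF_CHIP_CYCLES = 80   # assumed cost of one off-chip memory access
SPM_CYCLES = 2         # assumed cost of one SPM access
ALU_CYCLES = 1         # assumed cost of one arithmetic operation


@dataclass(frozen=True)
class Recipe:
    """How an element was originally produced (illustrative type)."""
    operands: tuple    # names of the data elements it was computed from
    alu_ops: int       # number of arithmetic operations needed


def recompute_cost(recipe, spm_resident):
    """Cycles to rebuild the value from SPM, or None if an operand is off-chip."""
    if not all(op in spm_resident for op in recipe.operands):
        return None
    return len(recipe.operands) * SPM_CYCLES + recipe.alu_ops * ALU_CYCLES


def should_recompute(recipe, spm_resident):
    """Recompute only when it is possible and cheaper than the off-chip access."""
    cost = recompute_cost(recipe, spm_resident)
    return cost is not None and cost < OFF_CHIP_CYCLES
```

Under these assumed costs, an element computed as the sum of two SPM-resident operands would be recomputed, while one whose operands are partly off-chip would still be fetched from off-chip memory.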
