Scratchpad memory management in a multitasking environment

This paper presents a dynamic scratchpad memory (SPM) code allocation technique for embedded systems running an operating system with preemptive multitasking. Existing SPM allocation schemes do not support multiple tasks or only a fixed number of processes that are known at compile time. These schemes rely on algorithms that select code depending on the size of the SPM. In contemporary portable devices, however, processes are created and terminated on demand and the SPM is shared among them. We introduce a dynamic scratchpad memory code allocation technique for code that supports dynamically created processes. At runtime, an SPM manager (SPMM) loads code pages of the running applications into the SPM on demand. It supports different sharing strategies that determine how the SPM is distributed among the running processes. We analyze several sharing strategies with regard to several preferable properties of multiprocess SPM allocation schemes. We evaluate the proposed multiprocess SPM allocation techniques and compare them to a fully-cached reference system by running several multiprocess benchmarks. The benchmarks comprise of multiple embedded applications such as H.264, MP3, MPEG-4, and PGP. On average, we achieve a 47% improvement in throughput and a 32% reduction in energy consumption. A comparison with the unachievable lower bound shows that the best SPM sharing strategy exploits 87% of the runtime improvements and 89% of the energy savings possible.

[1]  Rajeev Barua,et al.  Heap data allocation to scratch-pad memory in embedded systems , 2005, J. Embed. Comput..

[2]  Lin Gao,et al.  Memory coloring: a compiler approach for scratchpad memory management , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).

[3]  Rajeev Barua,et al.  Memory allocation for embedded systems with a compile-time-unknown scratch-pad size , 2005, CASES '05.

[4]  Aviral Shrivastava,et al.  Compilation techniques for energy reduction in horizontally partitioned cache architectures , 2005, CASES '05.

[5]  Heonshik Shin,et al.  Scratchpad memory management for portable systems with a memory management unit , 2006, EMSOFT '06.

[6]  Luca Benini,et al.  A post-compiler approach to scratchpad mapping of code , 2004, CASES '04.

[7]  Peter Marwedel,et al.  Cache-aware scratchpad allocation algorithm , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.

[8]  Sri Parameswaran,et al.  A novel instruction scratchpad memory optimization method based on concomitance metric , 2006, Asia and South Pacific Conference on Design Automation, 2006..

[9]  Peter Marwedel,et al.  Reducing energy consumption by dynamic copying of instructions onto onchip memory , 2002, 15th International Symposium on System Synthesis, 2002..

[10]  Heonshik Shin,et al.  Dynamic data scratchpad memory management for a memory subsystem with an MMU , 2007, LCTES '07.

[11]  Mahmut T. Kandemir,et al.  Compiler-directed scratch pad memory hierarchy design and management , 2002, DAC '02.

[12]  Peter Marwedel,et al.  Scratchpad sharing strategies for multiprocess embedded systems: a first approach , 2005, 3rd Workshop on Embedded Systems for Real-Time Multimedia, 2005..

[13]  Heonshik Shin,et al.  Dynamic scratchpad memory management for code in portable systems with an MMU , 2008, TECS.

[14]  Nikil D. Dutt,et al.  Efficient utilization of scratch-pad memory in embedded processor applications , 1997, Proceedings European Design and Test Conference. ED & TC 97.

[15]  Peter Marwedel,et al.  Scratchpad memory: a design alternative for cache on-chip memory in embedded systems , 2002, Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627).

[16]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[17]  Jaejin Lee,et al.  FaCSim: a fast and cycle-accurate architecture simulator for embedded systems , 2008, LCTES '08.

[18]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[19]  Philip Machanick,et al.  Hardware-software trade-offs in a direct Rambus implementation of the RAMpage memory hierarchy , 1998, ASPLOS VIII.

[20]  Sumesh Udayakumaran,et al.  Compiler-decided dynamic memory allocation for scratch-pad based embedded systems , 2003, CASES '03.