Dynamic scratchpad memory management for code in portable systems with an MMU

In this work, we present a dynamic memory allocation technique for a novel, horizontally partitioned memory subsystem targeting contemporary embedded processors with a memory management unit (MMU). We propose to replace the on-chip instruction cache with a scratchpad memory (SPM) and a small minicache. Serializing the address translation with the actual memory access enables the memory system to access either only the SPM or the minicache. Independent of the SPM size and based solely on profiling information, a postpass optimizer classifies the code of an application binary into a pageable and a cacheable code region. The latter is placed at a fixed location in the external memory and cached by the minicache. The former, the pageable code region, is copied on demand to the SPM before execution. Both the pageable code region and the SPM are logically divided into pages the size of an MMU memory page. Using the MMU's pagefault exception mechanism, a runtime scratchpad memory manager (SPMM) tracks page accesses and copies frequently executed code pages to the SPM before they get executed. In order to minimize the number of page transfers from the external memory to the SPM, good code placement techniques become more important with increasing sizes of the MMU pages. We discuss code-grouping techniques and provide an analysis of the effect of the MMU's page size on execution time, energy consumption, and external memory accesses. We show that by using the data cache as a victim buffer for the SPM, significant energy savings are possible. We evaluate our SPM allocation strategy with fifteen applications, including H.264, MP3, MPEG-4, and PGP. The proposed memory system requires 8&percent; less die are compared to a fully-cached configuration. On average, we achieve a 31&percent; improvement in runtime performance and a 35&percent; reduction in energy consumption with an MMU page size of 256 bytes.

[1]  Sang Lyul Min,et al.  A dynamic code placement technique for scratchpad memory using postpass optimization , 2006, CASES '06.

[2]  Karl Pettis,et al.  Profile guided code positioning , 1990, PLDI '90.

[3]  Norman P. Jouppi,et al.  CACTI: an enhanced cache access and cycle time model , 1996, IEEE J. Solid State Circuits.

[4]  Sri Parameswaran,et al.  A novel instruction scratchpad memory optimization method based on concomitance metric , 2006, Asia and South Pacific Conference on Design Automation, 2006..

[5]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[6]  Mahmut T. Kandemir,et al.  Compiler-directed scratch pad memory hierarchy design and management , 2002, DAC '02.

[7]  Peter Marwedel,et al.  Scratchpad sharing strategies for multiprocess embedded systems: a first approach , 2005, 3rd Workshop on Embedded Systems for Real-Time Multimedia, 2005..

[8]  Luca Benini,et al.  An integrated hardware/software approach for run-time scratchpad management , 2004, Proceedings. 41st Design Automation Conference, 2004..

[9]  LeeJaejin,et al.  Dynamic scratchpad memory management for code in portable systems with an MMU , 2008 .

[10]  Luca Benini,et al.  A post-compiler approach to scratchpad mapping of code , 2004, CASES '04.

[11]  Nikil D. Dutt,et al.  Efficient utilization of scratch-pad memory in embedded processor applications , 1997, Proceedings European Design and Test Conference. ED & TC 97.

[12]  Peter Marwedel,et al.  Scratchpad memory: a design alternative for cache on-chip memory in embedded systems , 2002, Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627).

[13]  Heonshik Shin,et al.  Dynamic data scratchpad memory management for a memory subsystem with an MMU , 2007, LCTES '07.

[14]  Lin Gao,et al.  Memory coloring: a compiler approach for scratchpad memory management , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).

[15]  Sumesh Udayakumaran,et al.  Compiler-decided dynamic memory allocation for scratch-pad based embedded systems , 2003, CASES '03.

[16]  Peter Marwedel,et al.  Cache-aware scratchpad allocation algorithm , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.

[17]  DominguezAngel,et al.  Memory allocation for embedded systems with a compile-time-unknown scratch-pad size , 2009 .

[18]  Aviral Shrivastava,et al.  Compilation techniques for energy reduction in horizontally partitioned cache architectures , 2005, CASES '05.

[19]  Heonshik Shin,et al.  Scratchpad memory management for portable systems with a memory management unit , 2006, EMSOFT '06.

[20]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[21]  Philip Machanick,et al.  Hardware-software trade-offs in a direct Rambus implementation of the RAMpage memory hierarchy , 1998, ASPLOS VIII.

[22]  Chris Rowen,et al.  A CMOS RISC Processor with Integrated System Functions , 1986, COMPCON.

[23]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[24]  Peter Marwedel,et al.  Reducing energy consumption by dynamic copying of instructions onto onchip memory , 2002, 15th International Symposium on System Synthesis, 2002..

[25]  Sang Lyul Min,et al.  Compiler-assisted demand paging for embedded systems with flash memory , 2004, EMSOFT '04.

[26]  Mahmut T. Kandemir,et al.  Dynamic management of scratch-pad memory space , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).

[27]  Luca Benini,et al.  Polynomial-time algorithm for on-chip scratchpad memory partitioning , 2003, CASES '03.

[28]  John A. Fotheringham,et al.  Dynamic storage allocation in the Atlas computer, including an automatic use of a backing store , 1961, Commun. ACM.

[29]  Peter J. Denning,et al.  The working set model for program behavior , 1968, CACM.

[30]  Rajeev Barua,et al.  Heap data allocation to scratch-pad memory in embedded systems , 2005, J. Embed. Comput..