Instruction cache locking inside a binary rewriter

Cache memories in embedded systems play an important role in reducing the execution time of the applications. Various kinds of extensions have been added to cache hardware to enable software involvement in replacement decisions, thus improving the run-time over a purely hardware-managed cache. Novel embedded systems, like Intel's Xscale and ARM Cortex processors provide the facility of locking one or more lines in cache - this feature is called cache locking. This paper presents the first method in the literature for instruction-cache locking that is able to reduce the average-case run-time of the program. We devise a cost-benefit model to discover the memory addresses which should be locked in the cache. We implement our scheme inside a binary rewriter, thus widening the applicability of our scheme to binaries compiled using any compiler. Results obtained on a suite of MiBench and MediaBench benchmarks show up to 25% improvement in the instruction-cache miss rate on average and up to 13.5% improvement in the execution time on average for applications having instruction accesses as a bottleneck, depending on the cache configuration. The improvement in execution time is as high as 23.5% for some benchmarks.

[1]  Rajeev Barua,et al.  An optimal memory allocation scheme for scratch-pad-based embedded systems , 2002, TECS.

[2]  Henrik Theiling,et al.  Compile-time decided instruction cache locking using worst-case execution paths , 2007, 2007 5th IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[3]  Isabelle Puaut Cache analysis vs static cache locking for schedulability analysis in multitasking real-time systems , 2002 .

[4]  Peter Marwedel,et al.  Dynamic overlay of scratchpad memory for energy minimization , 2004, International Conference on Hardware/Software Codesign and System Synthesis, 2004. CODES + ISSS 2004..

[5]  Thomas Lindgren,et al.  Allocation of Global Data Objects in On-Chip RAM , 1998 .

[6]  Peter Marwedel,et al.  Cache-aware scratchpad allocation algorithm , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.

[7]  Jeffrey K. Hollingsworth,et al.  An API for Runtime Code Patching , 2000, Int. J. High Perform. Comput. Appl..

[8]  Kathryn S. McKinley,et al.  Cooperative caching with keep-me and evict-me , 2005, 9th Annual Workshop on Interaction between Compilers and Computer Architectures (INTERACT'05).

[9]  Frank Vahid,et al.  Exploiting Fixed Programs in Embedded Systems: A Loop Cache Example , 2002, IEEE Computer Architecture Letters.

[10]  Peter Marwedel,et al.  Assigning program and data objects to scratchpad for energy reduction , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.

[11]  Björn Lisper,et al.  Data cache locking for higher program predictability , 2003, SIGMETRICS '03.

[12]  Nikil D. Dutt,et al.  On-chip vs. off-chip memory: the data partitioning problem in embedded processor-based systems , 2000, TODE.

[13]  Kristof Beyls,et al.  Generating cache hints for improved program efficiency , 2005, J. Syst. Archit..

[14]  Isabelle Puaut,et al.  Low-complexity algorithms for static cache locking in multitasking hard real-time systems , 2002, 23rd IEEE Real-Time Systems Symposium, 2002. RTSS 2002..

[15]  Guang R. Gao,et al.  Improving power efficiency with compiler-assisted cache replacement , 2005, J. Embed. Comput..

[16]  Srinivas Devadas,et al.  Application-specific memory management for embedded systems using software-controlled caches , 2000, Proceedings 37th Design Automation Conference.

[17]  Brani Vidakovic,et al.  USING GENETIC ALGORITHMS IN CONTENT SELECTION FOR LOCKING-CACHES , 2001 .

[18]  Peter Marwedel,et al.  Scratchpad memory: a design alternative for cache on-chip memory in embedded systems , 2002, Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627).

[19]  Rajeev Barua,et al.  Dynamic allocation for scratch-pad memory using compile-time decisions , 2006, TECS.