Cache-Aware SPM Allocation to Reduce Worst-Case Execution Time for Hybrid SPM-Caches

Scratch-Pad Memories (SPMs) have been increasingly used in real-time and embedded systems. However, it is still unknown and challenging to reduce the worst-case execution time (WCET) for hybrid SPM-cache architecture, where an SPM and a cache memory are placed on-chip in parallel to cooperatively improve performance and/or energy efficiency. In this paper, we study four SPM allocation strategies to reduce the WCET for hybrid SPM-caches with different complexities. These algorithms differ by whether or not they can cooperate with the cache or be aware of the WCET. Our evaluation shows that the cache-aware and WCET-oriented SPM allocation can minimize the WCET for real-time benchmarks with little or even positive impact on the average-case execution time (ACET).

[1]  Gerard J. M. Smit,et al.  A mathematical approach towards hardware design , 2010, Dynamically Reconfigurable Architectures.

[2]  Wei Zhang,et al.  Cache-aware SPM allocation algorithms for hybrid SPM-cache architectures , 2015, Sixteenth International Symposium on Quality Electronic Design.

[3]  David B. Whalley,et al.  WCET code positioning , 2004, 25th IEEE International Real-Time Systems Symposium.

[4]  Nimrod Megiddo,et al.  Advances in Economic Theory: On the complexity of linear programming , 1987 .

[5]  Heiko Falk,et al.  Optimal static WCET-aware scratchpad allocation of program code , 2009, 2009 46th ACM/IEEE Design Automation Conference.

[6]  Nikil D. Dutt,et al.  Efficient utilization of scratch-pad memory in embedded processor applications , 1997, Proceedings European Design and Test Conference. ED & TC 97.

[7]  Neil C. Audsley,et al.  Studying the Applicability of the Scratchpad Memory Management Unit , 2010, 2010 16th IEEE Real-Time and Embedded Technology and Applications Symposium.

[8]  Paul Lokuciejewski,et al.  WCET-driven Cache-based Procedure Positioning Optimizations , 2008, 2008 Euromicro Conference on Real-Time Systems.

[9]  Luca Benini,et al.  Polynomial-time algorithm for on-chip scratchpad memory partitioning , 2003, CASES '03.

[10]  Hui Wu,et al.  Optimal WCET-aware code selection for scratchpad memory , 2010, EMSOFT '10.

[11]  Frank Müller,et al.  Generalizing timing predictions to set-associative caches , 1997, Proceedings Ninth Euromicro Workshop on Real Time Systems.

[12]  Ting Chen,et al.  WCET centric data allocation to scratchpad memory , 2005, 26th IEEE International Real-Time Systems Symposium (RTSS'05).

[13]  Jaehyuk Huh,et al.  Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture , 2003, ISCA '03.

[14]  Jason Cong,et al.  HC-Sim: A fast and exact L1 cache simulator with scratchpad memory co-simulation support , 2011, 2011 Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[15]  Reinhard Wilhelm,et al.  Cache Behavior Prediction by Abstract Interpretation , 1996, SAS.

[16]  Heiko Falk,et al.  WCET-driven cache-aware code positioning , 2011, 2011 Proceedings of the 14th International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES).

[17]  D. Spielman,et al.  Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time , 2004 .

[18]  David B. Whalley,et al.  Bounding worst-case instruction cache performance , 1994, 1994 Proceedings Real-Time Systems Symposium.

[19]  Peter Marwedel,et al.  Assigning program and data objects to scratchpad for energy reduction , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.

[20]  Jean-François Deverge,et al.  WCET-Directed Dynamic Scratchpad Memory Allocation of Data , 2007, 19th Euromicro Conference on Real-Time Systems (ECRTS'07).

[21]  Rajeev Barua,et al.  Scratch-pad memory allocation without compiler support for java applications , 2007, CASES '07.

[22]  Wei Zhang,et al.  Hybrid SPM-cache architectures to achieve high time predictability and performance , 2013, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors.

[23]  Mahmut T. Kandemir,et al.  Compiler-directed scratch pad memory optimization for embedded multiprocessors , 2004, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[24]  Paul Lokuciejewski,et al.  WCET-Driven Cache-Aware Memory Content Selection , 2010, 2010 13th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing.

[25]  Rajeev Barua,et al.  An optimal memory allocation scheme for scratch-pad-based embedded systems , 2002, TECS.

[26]  Qing Wan,et al.  WCET-aware data selection and allocation for scratchpad memory , 2012, LCTES 2012.

[27]  David B. Whalley,et al.  Bounding Pipeline and Instruction Cache Performance , 1999, IEEE Trans. Computers.

[28]  Sumesh Udayakumaran,et al.  Compiler-decided dynamic memory allocation for scratch-pad based embedded systems , 2003, CASES '03.

[29]  Sharad Malik,et al.  Performance estimation of embedded software with instruction cache modeling , 1999, TODE.

[30]  Alexander G. Dean,et al.  Leveraging both Data Cache and Scratchpad Memory through Synergetic Data Allocation , 2012, 2012 IEEE 18th Real Time and Embedded Technology and Applications Symposium.