Fast instruction cache modeling for approximate timed HW/SW co-simulation

Approximate timed co-simulation has been proposed as a fast solution for system modeling in early design steps. This technique simulates systems at speeds close to functional execution while still accounting for timing effects. As a consequence, system performance estimations can be obtained early, enabling efficient design space exploration and system refinement. To achieve fast simulation speeds, the SW code is first pre-annotated with timing information and then executed natively, an approach known as native-based co-simulation. To obtain sufficiently accurate performance estimations, the effect of the system components must be considered. Among them, processor caches are particularly important, as they have a strong impact on overall system performance. However, no efficient techniques for cache modeling in native-based co-simulation have been proposed. Previous works that consider caches apply slow cache models based on tag search, similar to ISS-based models. This slows down the simulation, greatly reducing the efficiency of native-based co-simulation. In this paper, a high-level instruction cache model is proposed, along with the instrumentation required for native simulation. This model allows the designer to obtain cache hit/miss rate estimations at simulation speeds very close to native execution. Results show a speed-up of two orders of magnitude with respect to ISS and one order of magnitude with respect to previous native simulation approaches. The miss rate estimation error remains below 5%.
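
To make the cost of the tag-search baseline concrete, the following is a minimal sketch of such a cache model. It is not the model proposed in the paper. The class `TagSearchICache`, the hook `run_block`, and all cache parameters (1 KiB capacity, 16-byte lines, 2 ways, LRU replacement, 4-byte instructions) are assumptions chosen for illustration only. The point it shows is that every modeled instruction fetch pays for an index computation and a tag comparison.

```python
class TagSearchICache:
    """Set-associative instruction cache with LRU replacement (illustrative only)."""

    def __init__(self, size_bytes=1024, line_bytes=16, ways=2):
        # All three parameters are assumed values, not taken from the paper.
        self.line_bytes = line_bytes
        self.ways = ways
        n_sets = size_bytes // (line_bytes * ways)
        # Each set lists its resident tags, least recently used first.
        self.sets = [[] for _ in range(n_sets)]
        self.hits = 0
        self.misses = 0

    def fetch(self, addr):
        """Look up one instruction address; return True on a hit."""
        line = addr // self.line_bytes
        index = line % len(self.sets)
        tag = line // len(self.sets)
        resident = self.sets[index]
        if tag in resident:          # tag search, repeated on every access
            resident.remove(tag)
            resident.append(tag)     # mark as most recently used
            self.hits += 1
            return True
        if len(resident) == self.ways:
            resident.pop(0)          # evict the least recently used line
        resident.append(tag)
        self.misses += 1
        return False

    @property
    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def run_block(cache, start_addr, n_instr, instr_bytes=4):
    """Hypothetical annotation hook: model the fetches of one basic block."""
    for i in range(n_instr):
        cache.fetch(start_addr + i * instr_bytes)


cache = TagSearchICache()
for _ in range(10):                  # a loop body of 8 instructions, run 10 times
    run_block(cache, start_addr=0x1000, n_instr=8)
print(f"hits={cache.hits} misses={cache.misses} miss_rate={cache.miss_rate:.3f}")
```

In this sketch the loop body spans two cache lines, so only the first iteration misses, yet all 80 fetches still go through the full lookup. That per-access overhead is what the abstract identifies as the bottleneck of tag-search models in native-based co-simulation.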
