Cycle-approximate Retargetable Performance Estimation at the Transaction Level

This paper presents a novel cycle-approximate performance estimation technique for automatically generated transaction level models (TLMs) for heterogeneous multi-core designs. The inputs are application C processes and their mapping to processing units in the platform. The processing unit model consists of pipelined datapath, memory hierarchy and branch delay model. Using the processing unit model, the basic blocks in the C processes are analyzed and annotated with estimated delays. This is followed by a code generation phase where delay-annotated C code is generated and linked with a SystemC wrapper consisting of inter-process communication channels. The generated TLM is compiled and executed natively on the host machine. Our key contribution is that the estimation technique is close to cycle-accurate, it can be applied to any multi-core platform and it produces high-speed native compiled TLMs. For experiments, timed TLMs for industrial scale designs such as MP3 decoder were automatically generated for 4 heterogeneous multi-processor platforms with up to 5 PEs under 1 minute. Each TLM simulated under 1 second, compared to 3-4 hrs of instruction set simulation (ISS) and 15-18 hrs of RTL simulation. Comparison to on-board measurement showed only 8 % error on average in estimated number of cycles.

[1]  Jeffry T. Russell,et al.  Architecture-level performance evaluation of component-based embedded systems , 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).

[2]  Alberto L. Sangiovanni-Vincentelli,et al.  A compilation-based software estimation scheme for hardware/software co-simulation , 1999, Proceedings of the Seventh International Workshop on Hardware/Software Codesign (CODES'99) (IEEE Cat. No.99TH8450).

[3]  Andreas Gerstlauer,et al.  Retargetable profiling for rapid, early system-level design space exploration , 2004, Proceedings. 41st Design Automation Conference, 2004..

[4]  Kingshuk Karuri,et al.  A SW performance estimation framework for early system-level-design using fine-grained instrumentation , 2006, Proceedings of the Design Automation & Test in Europe Conference.

[5]  Jong-Yeol Lee,et al.  Timed compiled-code simulation of embedded software for performance analysis of SOC design , 2002, DAC '02.

[6]  Chong-Min Kyung,et al.  System-level performance analysis of embedded system using behavioral C/C++ model , 2005, 2005 IEEE VLSI-TSA International Symposium on VLSI Design, Automation and Test, 2005. (VLSI-TSA-DAT)..

[7]  Luciano Lavagno,et al.  Software performance estimation strategies in a system-level design tool , 2000, Proceedings of the Eighth International Workshop on Hardware/Software Codesign. CODES 2000 (IEEE Cat. No.00TH8518).

[8]  Todd M. Austin,et al.  SimpleScalar: An Infrastructure for Computer System Modeling , 2002, Computer.

[9]  Donatella Sciuto,et al.  Source-level execution time estimation of C programs , 2001, Ninth International Symposium on Hardware/Software Codesign. CODES 2001 (IEEE Cat. No.01TH8571).