Fast cycle estimation methodology for instruction-level emulator

In this paper, we propose a cycle estimation methodology for fast instruction-level CPU emulators. The methodology aims to deliver accurate software performance estimates at high emulation speed through a two-phase pipeline scheduling process: a static pipeline scheduling phase performed offline, before runtime, followed by an accuracy refinement phase performed at runtime. The first phase produces a pre-estimated CPU cycle count with limited impact on emulation speed. The second phase refines this pre-estimated count to improve accuracy. We implemented the methodology on QEMU and compared its cycle counts against those measured on a physical ARM CPU. The results show an effective tradeoff between emulation speed and cycle accuracy: the cycle estimation error averages 10%, while the emulation latency is 3.37 times that of the original QEMU.
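
The two-phase idea can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the instruction representation, the latency table, the load-use stall rule, and the runtime penalties (taken branch, cache miss) are all hypothetical values chosen for illustration. The sketch only shows the division of work, with a per-block cycle count computed once offline and a cheap correction applied during emulation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical per-opcode latencies in cycles; not taken from the paper.
BASE_LATENCY = {"alu": 1, "mul": 2, "load": 1, "store": 1, "branch": 1}
LOAD_USE_STALL = 1        # assumed stall when an instruction reads a just-loaded register
BRANCH_TAKEN_PENALTY = 2  # assumed penalty, only known at runtime
CACHE_MISS_PENALTY = 10   # assumed penalty, only known at runtime


@dataclass(frozen=True)
class Insn:
    op: str
    dst: Optional[str] = None
    srcs: Tuple[str, ...] = ()


def static_block_cycles(block):
    """Phase 1 (offline): pre-estimate the cycle count of one basic block."""
    cycles = 0
    prev = None
    for insn in block:
        cycles += BASE_LATENCY[insn.op]
        # Intra-block data hazard that can be resolved without running the code.
        if prev is not None and prev.op == "load" and prev.dst in insn.srcs:
            cycles += LOAD_USE_STALL
        prev = insn
    return cycles


def refine_at_runtime(pre_estimate, branch_taken, cache_misses):
    """Phase 2 (runtime): add penalties for events observed during emulation."""
    penalty = BRANCH_TAKEN_PENALTY if branch_taken else 0
    return pre_estimate + penalty + cache_misses * CACHE_MISS_PENALTY


# A four-instruction block: load, dependent ALU op, multiply, branch.
block = [
    Insn("load", dst="r1", srcs=("r0",)),
    Insn("alu", dst="r2", srcs=("r1", "r3")),
    Insn("mul", dst="r4", srcs=("r2", "r2")),
    Insn("branch", srcs=("r4",)),
]

pre = static_block_cycles(block)  # computed once, before emulation starts
total = refine_at_runtime(pre, branch_taken=True, cache_misses=1)
print(pre, total)
```

Under these assumptions, the static phase costs nothing at runtime beyond reading a stored number per block, while the refinement phase adds a small amount of work per executed block. This matches the tradeoff described above, in which accuracy is gained at the price of some emulation slowdown.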
