Extracting Threaded Traces in Simulation Environments

Instruction traces play an important role in analyzing and understanding the behavior of target applications; however, existing tracing tools are tied to specific platforms and rely heavily on compilers and operating systems. In this paper, we propose a precise thread-level instruction tracing approach for modern chip multiprocessor simulators, which inserts instruction patterns into programs at the beginning of the main thread and the slave threads. A full-system simulator uses these patterns to identify and capture the target threads, without any modification to the compiler or the operating system. We implemented our approach in the GEM5 simulator and evaluated its accuracy on x86-Linux using standard benchmarks, comparing our traces with those collected by a Pin tool. Experimental results show that the traces extracted by our approach are highly similar to those collected by the Pin tool. Our approach to extracting traces can be easily applied to other simulators with minor modifications to their instruction execution engines.
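The pattern-based capture described above can be illustrated with a minimal sketch. It is not the GEM5 implementation: the marker sequence, the function name extract_thread_traces, and the assumption that the simulator already tags each executed instruction with a thread context identifier are all hypothetical and chosen only for illustration. The sketch watches the executed instruction stream, and once a context has executed the full marker pattern it starts recording every later instruction of that context.

```python
from collections import defaultdict, deque

# Hypothetical marker: a short instruction sequence assumed not to occur in
# ordinary code. The actual patterns used in the paper are not given here.
MARKER = ("xchg r13,r13", "xchg r14,r14", "xchg r15,r15")


def extract_thread_traces(stream, marker=MARKER):
    """Return {context_id: [instructions executed after the marker]}.

    `stream` is an iterable of (context_id, instruction) pairs in execution
    order. Tagging each instruction with a context id is an assumption of
    this sketch, not something the abstract specifies.
    """
    # Per-context sliding window over the most recent instructions.
    windows = defaultdict(lambda: deque(maxlen=len(marker)))
    traces = {}
    for ctx, instr in stream:
        if ctx in traces:
            # Context already identified as a target thread: record it.
            traces[ctx].append(instr)
            continue
        window = windows[ctx]
        window.append(instr)
        if tuple(window) == marker:
            # Marker seen: start tracing this context from the next instruction.
            traces[ctx] = []
    return traces


if __name__ == "__main__":
    demo = [
        (1, "push rbp"),
        (1, "xchg r13,r13"),
        (2, "xchg r13,r13"),
        (1, "xchg r14,r14"),
        (1, "xchg r15,r15"),
        (1, "mov rax,rbx"),
        (9, "iret"),  # unrelated context that never executes the marker
        (2, "xchg r14,r14"),
        (2, "xchg r15,r15"),
        (2, "add rax,1"),
        (1, "ret"),
    ]
    for ctx, trace in extract_thread_traces(demo).items():
        print(ctx, trace)
```

Keeping one sliding window per context lets the marker be recognized even when instructions from several threads interleave in the stream, and contexts that never execute the marker (for example, unrelated processes or kernel activity) are left out of the result.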
