Can You Trust Your Memory Trace? A Comparison of Memory Traces from Binary Instrumentation and Simulation

Computer designers rely heavily on simulation to explore design spaces. Contemporary simulation environments are increasingly complex, comprising support for multiple cores and full operating systems. Resource use varies widely between simulation environments because of these differing system contexts and because multi-threaded applications are intrinsically non-deterministic. In addition, more recent simulation environments use Dynamic Binary Instrumentation (DBI) traces collected in the system context (OS, libraries, threading API) of the host system. Methodologies used to validate and compare simulation frameworks are usually limited to comparing CPI and cache statistics; they provide neither a detailed function-level breakdown nor an understanding of the sources of mismatch. In this work, we attempt to identify and quantify the true sources of mismatch between a DBI framework and a full-system simulation framework. We use memory traces of multi-threaded applications, annotated with function call information, to break down the sources of mismatch within an application. To the best of our knowledge, a comparison at this level of detail has not been attempted before, especially with traces of multi-threaded applications. We find that the mismatch comes mainly from three sources: threading mechanisms and threading API function calls, library and system function calls, and user-space condition synchronization. Based on these results, we identify specific functions in each category of mismatch. We then propose a few ways to close the gap and enable more reliable simulation for design space exploration.
