Tracing and Characterization of NT-based System Workloads

Trace-driven simulation is commonly used by the computer architecture research community to answer a wide variety of architectural design questions. Traces taken from benchmark execution (e.g., SPEC, Bytemark, SPLASH) have been studied extensively to optimize the design of pipelines, branch predictors, and especially cache memories, and today’s computer designs have been optimized for the characteristics of these benchmarks. As applications come to depend more heavily on services and APIs provided by the hosting operating system, overall application performance depends increasingly on efficient operating system interaction. It has been acknowledged that operating system overhead can greatly affect the benefits provided by a new design feature. Operating system interaction has, for the most part, been ignored in past architectural studies because of the lack of tools that can generate kernel-laden traces. In this contribution we describe ongoing efforts to capture operating-system-rich traces on the DEC Alpha platform. We describe the current version of the PatchWrx toolset, originally developed by Richard Sites, which captures traces of both application and operating system activity while introducing minimal overhead, and we demonstrate its capabilities by characterizing a number of applications. We also contrast the fundamental differences between studying simple benchmark programs and studying application-based programs. This paper demonstrates that for real applications (MS CD Player, MS Visual C/C++, FX!32, etc.), operating system execution can dominate the overall execution time of the application, contributing significantly to the contents of any captured trace.
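As a rough illustration of the kind of characterization described above, the sketch below computes the share of a trace that was executed in kernel mode versus user mode. It is a minimal sketch under stated assumptions: the `TraceRecord` layout, the `kernel` flag, and the `mode_breakdown` helper are hypothetical names introduced here, and they do not reflect the actual PatchWrx trace format.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class TraceRecord(NamedTuple):
    """Hypothetical trace entry; not the real PatchWrx record layout."""
    address: int   # instruction address
    kernel: bool   # assumed flag: True if executed in kernel mode


def mode_breakdown(trace: Iterable[TraceRecord]) -> Dict[str, float]:
    """Return the fraction of trace records executed in each mode."""
    counts = Counter("kernel" if rec.kernel else "user" for rec in trace)
    total = sum(counts.values())
    if total == 0:
        # An empty trace has no meaningful breakdown; report zeros.
        return {"kernel": 0.0, "user": 0.0}
    return {mode: counts[mode] / total for mode in ("kernel", "user")}


if __name__ == "__main__":
    # Tiny made-up trace: one user-mode record, three kernel-mode records.
    sample = [
        TraceRecord(0x1000, False),
        TraceRecord(0x2000, True),
        TraceRecord(0x2004, True),
        TraceRecord(0x2008, True),
    ]
    print(mode_breakdown(sample))
```

A benchmark-only trace would be expected to show a small kernel share under this measure, whereas the abstract's claim is that real applications can show a dominant one.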
