Workload characterization: motivation, goals and methodology

Understanding workload characteristics is essential to designing efficient computer architectures. Characterizing applications allows designers to tune the processor microarchitecture, memory hierarchy, and system architecture to the features that programs actually exhibit. Workload characterization also has a significant impact on performance evaluation: understanding the nature of a workload and its intrinsic features helps in interpreting performance measurements and simulation results. Identifying and characterizing the intrinsic properties of an application, such as its memory access behavior, locality, control flow behavior, and instruction-level parallelism, can eventually lead to a program behavior model that can be combined with a processor model for analytical performance modeling of computer systems. In this paper, we describe the objectives of workload characterization and emphasize the importance of obtaining architecture-independent metrics for workloads. As an example, we present a study of memory reference locality using some generic metrics.
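
To make the idea of an architecture-independent locality metric concrete, the following is a minimal sketch of one widely used example, the LRU stack (reuse) distance of each reference in an address trace. It is an illustration only and is not necessarily one of the metrics used in the paper; the function names `reuse_distances` and `lru_hits`, and the toy trace, are assumptions introduced here.

```python
def reuse_distances(trace):
    """Return the LRU stack distance of every reference in `trace`.

    The distance is the number of distinct addresses touched since the
    previous reference to the same address; first touches yield None.
    The result depends only on the trace, not on any cache geometry.
    """
    stack = []       # LRU stack; most recently used address is last
    distances = []
    for addr in trace:
        if addr in stack:
            idx = stack.index(addr)
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            distances.append(None)
        stack.append(addr)   # addr becomes the most recently used
    return distances


def lru_hits(distances, capacity):
    """Count hits in a fully associative LRU cache holding `capacity` blocks.

    A reference hits exactly when its stack distance is below the capacity.
    """
    return sum(1 for d in distances if d is not None and d < capacity)


# Hypothetical toy trace of block addresses.
trace = ["a", "b", "c", "a", "b", "b"]
dists = reuse_distances(trace)
```

Because the distances are computed once from the trace, the same list can be reused to estimate hit counts for any fully associative LRU capacity, for instance `lru_hits(dists, 2)` and `lru_hits(dists, 3)`, without re-simulating the trace for each configuration. The linear search in `stack.index` keeps the sketch short; a tree-based structure would be the usual choice for long traces.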
