Variability in architectural simulations of multi-threaded workloads