MemSpy: analyzing memory system bottlenecks in programs
[1] Anoop Gupta, et al. Parallel ICCG on a hierarchical memory multiprocessor - Addressing the triangular solve bottleneck, 1990, Parallel Comput.
[2] James H. Patterson, et al. Portable Programs for Parallel Processors, 1987.
[3] Norman P. Jouppi, et al. Computer technology and architecture: an evolving interaction, 1991, Computer.
[4] Larry Rudolph, et al. PIE: A Programming and Instrumentation Environment for Parallel Processing, 1985, IEEE Software.
[5] Anoop Gupta, et al. The DASH prototype: implementation and performance, 1992, ISCA '92.
[6] Ben Zorn, et al. A memory allocation profiler for C and Lisp, 1988.
[7] Helen Davis, et al. Tango: A Multiprocessor Simulation and Tracing System, 1990.
[8] John L. Hennessy, et al. MTOOL: A Method for Isolating Memory Bottlenecks in Shared Memory Multiprocessor Programs, 1991, ICPP.
[9] John L. Hennessy, et al. Performance debugging shared memory multiprocessor programs with MTOOL, 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[10] Ilya Gertner, et al. Non-intrusive and interactive profiling in Parasight, 1988, PPEALS '88.
[11] Thomas E. Anderson, et al. Quartz: a tool for tuning parallel program performance, 1990.
[12] James Arthur Kohl, et al. A Tool to Aid in the Design, Implementation, and Understanding of Matrix Algorithms for Parallel Processors, 1990, J. Parallel Distributed Comput.
[13] Anoop Gupta, et al. SPLASH: Stanford parallel applications for shared-memory, 1992, CARN.
[14] Anoop Gupta, et al. Memory-reference characteristics of multiprocessor applications under MACH, 1988, SIGMETRICS '88.
[15] Margaret Martonosi, et al. MemSpy: analyzing memory system bottlenecks in programs, 1992.
[16] Anoop Gupta, et al. Memory-reference characteristics of multiprocessor applications under MACH, 1988, SIGMETRICS 1988.
[17] John L. Hennessy, et al. Multiprocessor Simulation and Tracing Using Tango, 1991, ICPP.
[18] Thomas E. Anderson, et al. Quartz: a tool for tuning parallel program performance, 1990, SIGMETRICS '90.
[19] Michael E. Wolf, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[20] Susan L. Graham, et al. An execution profiler for modular programs, 1983, Softw. Pract. Exp.
[21] Iain S. Duff, et al. Sparse matrix test problems, 1982.
[22] Monica D. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991.
[23] Anoop Gupta, et al. An evaluation of the Chandy-Misra-Bryant algorithm for digital logic simulation, 1991, TOMC.