Memory Referencing Behavior in Compiler-Parallelized Applications
[1] Margaret Martonosi,et al. MemSpy: analyzing memory system bottlenecks in programs , 1992, SIGMETRICS '92/PERFORMANCE '92.
[2] Anoop Gupta,et al. The directory-based cache coherence protocol for the DASH multiprocessor , 1990, ISCA '90.
[3] Michael L. Scott,et al. False sharing and its effect on shared memory performance , 1993 .
[4] Mary W. Hall,et al. Interprocedural Parallelization Analysis: A Case Study , 1995, PPSC.
[5] Livio Ricciulli,et al. The detection and elimination of useless misses in multiprocessors , 1993, ISCA '93.
[6] Anoop Gupta,et al. The Stanford FLASH multiprocessor , 1994, ISCA '94.
[7] John L. Hennessy,et al. Multiprocessor Simulation and Tracing Using Tango , 1991, ICPP.
[8] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.
[9] James R. Larus,et al. Tempest and typhoon: user-level shared memory , 1994, ISCA '94.
[10] Chau-Wen Tseng,et al. Unified compilation techniques for shared and distributed address space machines , 1995, ICS '95.
[11] Steven W. K. Tjiang,et al. SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.
[12] Rudolf Eigenmann,et al. Performance Analysis of Parallelizing Compilers on the Perfect Benchmarks Programs , 1992, IEEE Trans. Parallel Distributed Syst..
[13] Chau-Wen Tseng,et al. Compiler optimizations for improving data locality , 1994, ASPLOS VI.
[14] Randy H. Katz,et al. The effect of sharing on the cache and bus performance of parallel programs , 1989, ASPLOS III.
[15] Mary W. Hall,et al. Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[16] Margaret Martonosi,et al. Analyzing and tuning memory performance in sequential and parallel programs , 1994 .
[17] Randy H. Katz,et al. The effect of sharing on the cache and bus performance of parallel programs , 1989, ASPLOS 1989.
[18] C. Natarajan,et al. Measurement-based characterization of global memory and network contention, operating system and parallelisation overheads: case study on a shared-memory multiprocessor , 1994, Proceedings of the 21st International Symposium on Computer Architecture.
[19] Susan J. Eggers,et al. Reducing false sharing on shared memory multiprocessors through compile time data transformations , 1995, PPOPP '95.
[20] Pen-Chung Yew,et al. The effectiveness of caches and data prefetch buffers in large-scale shared memory multiprocessors , 1987 .
[21] David J. Lilja,et al. The Impact of Parallel Loop Scheduling Strategies on Prefetching in a Shared Memory Multiprocessor , 1994, IEEE Trans. Parallel Distributed Syst..
[22] Anoop Gupta,et al. Cache Invalidation Patterns in Shared-Memory Multiprocessors , 1992, IEEE Trans. Computers.
[23] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[24] Josep Torrellas,et al. False Sharing and Spatial Locality in Multiprocessor Caches , 1994, IEEE Trans. Computers.
[25] Harry A. G. Wijshoff,et al. Managing pages in shared virtual memory systems: getting the compiler into the game , 1993, ICS '93.
[26] Stephen R. Goldschmidt,et al. Simulation of multiprocessors: accuracy and performance , 1993 .
[27] Susan J. Eggers,et al. Eliminating False Sharing , 1991, ICPP.
[28] Monica S. Lam,et al. Global optimizations for parallelism and locality on scalable parallel machines , 1993, PLDI '93.
[29] Sanjay Sharma,et al. Measurement-based characterization of global memory and network contention, operating system and parallelization overheads , 1994, ISCA '94.