Memory Referencing Behavior in Compiler-Parallelized Applications

[1]  Jeffrey Kuskin,et al.  Retrospective: the Stanford FLASH multiprocessor , 1998, ISCA '98.

[2]  Mary W. Hall,et al.  Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[3]  Mary W. Hall,et al.  Interprocedural Parallelization Analysis: A Case Study , 1995, PPSC.

[4]  Susan J. Eggers,et al.  Reducing false sharing on shared memory multiprocessors through compile time data transformations , 1995, PPOPP '95.

[5]  Chau-Wen Tseng,et al.  Unified compilation techniques for shared and distributed address space machines , 1995, ICS '95.

[6]  J. Singh,et al.  The SPLASH-2 programs: characterization and methodological considerations , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[7]  Steven W. K. Tjiang,et al.  SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.

[8]  Chau-Wen Tseng,et al.  Compiler optimizations for improving data locality , 1994, ASPLOS VI.

[9]  David J. Lilja,et al.  The Impact of Parallel Loop Scheduling Strategies on Prefetching in a Shared Memory Multiprocessor , 1994, IEEE Trans. Parallel Distributed Syst..

[10]  Josep Torrellas,et al.  False Sharing and Spatial Locality in Multiprocessor Caches , 1994, IEEE Trans. Computers.

[11]  Sanjay Sharma,et al.  Measurement-based characterization of global memory and network contention, operating system and parallelization overheads , 1994, ISCA '94.

[12]  J. Larus,et al.  Tempest and Typhoon: user-level shared memory , 1994, Proceedings of 21 International Symposium on Computer Architecture.

[13]  C. Natarajan,et al.  Measurement-based characterization of global memory and network contention, operating system and parallelisation overheads: case study on a shared-memory multiprocessor , 1994, Proceedings of 21 International Symposium on Computer Architecture.

[14]  Michael L. Scott,et al.  False sharing and its effect on shared memory performance , 1993 .

[15]  Harry A. G. Wijshoff,et al.  Managing pages in shared virtual memory systems: getting the compiler into the game , 1993, ICS '93.

[16]  Monica S. Lam,et al.  Global optimizations for parallelism and locality on scalable parallel machines , 1993, PLDI '93.

[17]  Michel Dubois,et al.  The Detection And Elimination Of Useless Misses In Multiprocessors , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.

[18]  Anoop Gupta,et al.  Cache Invalidation Patterns in Shared-Memory Multiprocessors , 1992, IEEE Trans. Computers.

[19]  Margaret Martonosi,et al.  MemSpy: analyzing memory system bottlenecks in programs , 1992, SIGMETRICS '92/PERFORMANCE '92.

[20]  Anoop Gupta,et al.  SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.

[21]  Anoop Gupta,et al.  The directory-based cache coherence protocol for the DASH multiprocessor , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[22]  Randy H. Katz,et al.  The effect of sharing on the cache and bus performance of parallel programs , 1989, ASPLOS III.

[23]  J. Singh,et al.  The SPLASH-2 Application Suite , 1995 .

[24]  Margaret Martonosi,et al.  Analyzing and tuning memory performance in sequential and parallel programs , 1994 .

[25]  Stephen R. Goldschmidt,et al.  Simulation of multiprocessors: accuracy and performance , 1993 .

[26]  Rudolf Eigenmann,et al.  Performance Analysis of Parallelizing Compilers on the Perfect Benchmarks Programs , 1992, IEEE Trans. Parallel Distributed Syst..

[27]  John L. Hennessy,et al.  Multiprocessor Simulation and Tracing Using Tango , 1991, ICPP.

[28]  Susan J. Eggers,et al.  Eliminating False Sharing , 1991, ICPP.

[29]  Pen-Chung Yew,et al.  The effectiveness of caches and data prefetch buffers in large-scale shared memory multiprocessors , 1987 .