Evaluating Two Loop Transformations for Reducing Multiple Writer False Sharing

Shared virtual memory (SVM) simplifies the programming of parallel systems with memory hierarchies and physically distributed address spaces by providing the illusion of a flat global address space in which coherence is maintained at the page level. The success of the SVM abstraction depends on efficient page management, which in turn depends on efficient handling of false sharing and the ping-pong effects it can cause. We evaluate two loop transformations that address this problem. The first is a simple, new technique for reducing the ping-pong effects that result from multiple-writer false sharing. The second is our previously proposed technique for eliminating multiple-writer false sharing itself. Both have been implemented in the Fortran-S compiler, which generates code that runs on the iPSC/2 under the KOAN SVM. We present preliminary performance results.
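To make the notion of multiple-writer false sharing at page granularity concrete, the following minimal sketch counts the pages that are written by more than one processor even though no two processors write the same element. It only illustrates the phenomenon and is not either of the two transformations evaluated here. The function name `falsely_shared_pages`, the sizes `N`, `P`, and `PAGE`, and the two iteration-to-processor mappings are all assumptions chosen for illustration.

```python
from collections import defaultdict


def falsely_shared_pages(n_elems, page_elems, owner):
    """Return the set of page numbers written by more than one processor.

    owner(i) is the (hypothetical) processor that writes element i.
    Each element has exactly one writer, so any page with several
    writers is falsely shared rather than truly shared.
    """
    writers = defaultdict(set)
    for i in range(n_elems):
        writers[i // page_elems].add(owner(i))
    return {page for page, procs in writers.items() if len(procs) > 1}


# Hypothetical sizes: 64 array elements, 4 processors, 16 elements per page.
N, P, PAGE = 64, 4, 16

# Cyclic assignment: consecutive elements go to different processors,
# so every page ends up with several writers.
cyclic = falsely_shared_pages(N, PAGE, lambda i: i % P)

# Blocked assignment whose block size matches the page size:
# each page is written by a single processor.
blocked = falsely_shared_pages(N, PAGE, lambda i: i // (N // P))

print(len(cyclic), len(blocked))
```

Under these assumed sizes, the cyclic mapping leaves every page with multiple writers, while the page-aligned blocked mapping leaves none. In a page-based SVM, each multiply written page is a candidate for the ping-pong behavior described above.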
