An Evaluation of Software-Based Release Consistent Protocols

This paper presents an evaluation of three software implementations of release consistency. Release consistent protocols allow data communication to be aggregated and allow multiple writers to modify a single page simultaneously. We evaluated an eager invalidate protocol that enforces consistency when synchronization variables are released, a lazy invalidate protocol that enforces consistency when synchronization variables are acquired, and a lazy hybrid protocol that selectively uses update to reduce access misses. Our evaluation is based on implementations running on DECstation-5000/240s connected by an ATM LAN and on an execution-driven simulator that allows us to vary network parameters. Our results show that the lazy protocols outperform the eager protocol for all but one application and that the lazy hybrid performs best overall. However, the relative performance of the implementations depends strongly on the relative speeds of the network, processor, and communication software. Low bandwidths and high per-byte software communication costs favor the lazy invalidate protocol, while high bandwidths and low per-byte costs favor the hybrid. The performance of the eager protocol approaches that of the lazy protocols only when communication becomes essentially free.
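
The difference between enforcing consistency at release and at acquire can be illustrated with a toy message count. The sketch below is not the paper's implementation. It assumes a single shared page, a chain of lock handoffs, and one message per invalidation or write-notice transfer. It ignores acknowledgements, diffs, page fetches, and lock traffic. The function names and the `sharers` parameter are hypothetical and chosen only for illustration.

```python
def eager_invalidate_messages(handoffs, sharers):
    """Toy eager invalidate: at each release, the releaser sends an
    invalidation to every other processor that caches the page."""
    messages = 0
    cached = set(sharers)          # processors assumed to hold a copy initially
    for releaser, acquirer in handoffs:
        cached.add(releaser)       # the releaser wrote the page, so it holds a copy
        messages += len(cached - {releaser})
        cached = {releaser, acquirer}  # assumed: the acquirer re-fetches the page
    return messages


def lazy_invalidate_messages(handoffs):
    """Toy lazy invalidate: at each acquire, only the acquirer is told
    which pages the releaser modified (one transfer per handoff)."""
    return sum(1 for _releaser, _acquirer in handoffs)


# Three lock handoffs among four processors that all start with a copy.
handoffs = [(0, 1), (1, 2), (2, 3)]
eager = eager_invalidate_messages(handoffs, sharers={0, 1, 2, 3})
lazy = lazy_invalidate_messages(handoffs)
print(eager, lazy)
```

Under these assumptions the eager count grows with the number of processors caching the page, while the lazy count depends only on the number of handoffs. Whether that saving outweighs the extra bookkeeping of a lazy protocol is the question the measurements in the paper address.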
