Optimizing Home-Based Software DSM Protocols

Software DSMs can be categorized into homeless and home-based systems, each with strengths and weaknesses relative to the other. This paper introduces optimization methods that exploit the advantages and offset the disadvantages of the home-based protocol in the home-based software DSM JIAJIA. The first optimization reduces the overhead of writes to home pages through a lazy home page write detection scheme. The normal write detection scheme write-protects shared pages at the beginning of a synchronization interval, whereas lazy home page write detection delays write-protecting a home page until the page is first fetched in the interval, so home pages that are not cached by remote processors need not be write-protected. The second optimization avoids fetching the whole page on a page fault by dividing a page into blocks and fetching only those blocks that are dirty with respect to the faulting processor. A write vector table is maintained for each shared page at its home to record, for each processor, which blocks have been modified since that processor last fetched the page. The third optimization adaptively migrates the home of a page to the processor that writes to the page most frequently, reducing twin and diff overhead. Migration information is piggybacked on barrier messages, so the migration requires no additional communication. Performance evaluation with some well-accepted benchmarks and real applications shows that these optimizations can dramatically reduce page faults, message amounts, and diffs, and consequently improve performance significantly.
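
To make the second optimization concrete, the following is a minimal sketch of a per-page write vector table at the home node. It is an illustration under stated assumptions, not JIAJIA's implementation: the class name `HomePage`, its methods, the 1 KB block size, and the choice to mark every block dirty for a processor that has never fetched the page are all hypothetical.

```python
class HomePage:
    """Home copy of one shared page, divided into fixed-size blocks (hypothetical sketch)."""

    def __init__(self, page_size=4096, block_size=1024, num_procs=4):
        self.page_size = page_size
        self.block_size = block_size
        self.num_blocks = page_size // block_size
        self.data = bytearray(page_size)
        # write_vector[p] holds the blocks modified since processor p last fetched.
        # Assumption: a processor that has never fetched the page needs every block.
        self.write_vector = [set(range(self.num_blocks)) for _ in range(num_procs)]

    def apply_write(self, writer, offset, payload):
        """Apply a write to the home copy and mark the touched blocks dirty
        for every processor other than the writer."""
        if offset < 0 or offset + len(payload) > self.page_size:
            raise ValueError("write falls outside the page")
        if not payload:
            return
        self.data[offset:offset + len(payload)] = payload
        first = offset // self.block_size
        last = (offset + len(payload) - 1) // self.block_size
        for proc, dirty in enumerate(self.write_vector):
            if proc != writer:
                dirty.update(range(first, last + 1))

    def fetch(self, proc):
        """Return only the blocks that are dirty with respect to proc,
        keyed by block index, then clear proc's entry in the table."""
        size = self.block_size
        blocks = {
            b: bytes(self.data[b * size:(b + 1) * size])
            for b in sorted(self.write_vector[proc])
        }
        self.write_vector[proc].clear()
        return blocks
```

In this sketch, a processor's first `fetch` transfers the whole page, and a later `fetch` after a 100-byte write at offset 1000 transfers only blocks 0 and 1. The trade-off is one set (or bit vector) per processor per page at the home in exchange for smaller page-fault replies.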
