An evaluation study of a link-based data diffusion machine

The Data Diffusion Machine (DDM) is a virtual shared memory architecture in which data has no home location. In this paper we present a preliminary evaluation of a link-based DDM, show the influence of the major design parameters, and report the performance of a set of eight benchmark programs. Most programs are sensitive to item size, though to varying degrees. Associativity is unimportant for system performance, apparently because of the size of the associative memories. The hardware is shown to be quite well balanced: although network latency is a critical factor for some programs, others gain more from a faster processor or a faster memory. Preliminary scalability results are quite encouraging. Except in a few cases, scalability is limited more by software overheads in the application than by the DDM. The results will improve further when realistic topologies can be evaluated and when programs designed with locality in mind are run.
