Design of scalable shared-memory multiprocessors: the DASH approach

The DASH (directory architecture for shared-memory) multiprocessor, which combines the programmability of shared-memory machines with the scalability of message-passing machines, is described. Hardware-supported coherent caches provide for low-latency access of shared data and ease of programming. Caches are kept coherent by means of a distributed directory-based protocol. Shared memory in the machine is distributed among the processing nodes, and scalable memory bandwidth is provided by connecting the nodes through a general interconnection network. The prototype DASH machine will consists of 64 high-performance microprocessors, with an aggregate performance of over 1200 MIPS and 250 scalar MFLOPS. The fundamental premise in DASH is that it is possible to build a scalable shared-memory machine with hardware-supported coherent caches by using a distributed directory-based cache coherence protocol. The mechanisms for providing scalable memory bandwidth, reducing and tolerating memory latency, and supporting efficient synchronization are described. A brief description of the machine's implementation is given.<<ETX>>