An empirical comparison of the Kendall Square Research KSR-1 and Stanford DASH multiprocessors

Two interesting variants of large-scale shared-address-space parallel architectures are cache-coherent non-uniform-memory-access (CC-NUMA) machines and cache-only memory architectures (COMA). Both have distributed main memory and use directory-based cache coherence. Both migrate and replicate data at the cache level automatically under hardware control; COMA machines do so at the main-memory level as well. The authors compare the parallel performance of a recent realization of each type of architecture: the Stanford DASH multiprocessor (CC-NUMA) and the Kendall Square Research KSR-1 (COMA). Using a suite of important computational kernels and complete scientific applications, they examine performance differences resulting both from the CC-NUMA/COMA nature of the machines and from specific differences in system implementation.
