Data Prefetching in Shared Memory Multiprocessors

Trace-driven simulation of 16 numerical subroutines is used to compare instruction lookahead and data prefetching with private caches in shared memory multiprocessors that have hundreds or thousands of processors and memory modules interconnected by a pipelined network. These multiprocessors are characterized by long memory access delays, which create a memory access bottleneck. Using the multiprocessor cache model for comparison, data prefetching is found to be more effective than caches in addressing the memory access bottleneck. 5 refs., 6 figs.