Future building blocks for parallel architectures

Early parallel architectures where shared memory systems (UMA, NUMA), which had the disadvantage of the shared memory bottleneck that limited the scalability of the system. In contrast, distributed memory architectures with message passing (NORMAs) provided any desired scalability; however, at the cost of a substantial communication latency. The latency could be reduced by custom communication hardware (examples: SUPRENUM, MANNA) yet since there was still a software routine involved, the remaining latency was in the order of microseconds. Therefore, and because of the simpler programming model of shared memory, it became the trend of the nineties to return to UMAs and NUMAs, employing powerful communication hardware to minimize the remote memory access time.