The Seamless approach to reconciling communication and locality in distributed memory parallel systems

With recent improvements in single-CPU performance, several issues have become more important in multiprocessor design. Two of these are interprocessor communication and locality. In parallel systems with fast CPUs, locality is vital to performance. However, traditional parallel programming models such as shared memory and message passing do not naturally lead to programs that exhibit locality. This paper presents the Seamless model for interprocessor communication, which is based on locality and allows the programmer to explicitly manipulate a program's locality to optimize performance. With proper hardware support, the model can also support latency tolerance. Extensions to the C programming language that support the model are also presented. Finally, a parallel program that uses the model is provided to illustrate the paradigm.