Paper information - Towards a large number of pipeline processors in a tightly coupled multiprocessor using no cache

The need for performance exists in many scientific applications, and the use of multiprocessor structures cannot be avoided. Mapping many applications onto distributed supercomputers (e.g. hypercube structures) seems very difficult. On the other hand, performance on most large shared memory systems (CEDAR, RP3, ...) suffers from the very high latency of requests to the shared memory; caches or local memories are often used to increase performance. Performance then depends on good management of the memory hierarchy (and of the synchronization mechanisms) by the programmer. In previous papers, we pointed out that letting READs pass WRITEs on a memory with hardware detection of Read After Write (RAW) hazards makes it possible to reach acceptable performance on a pipeline processor across a very large spectrum of numerical algorithms, even when using a memory with high latency. It also makes it possible to efficiently synchronize pipeline processors working directly on a shared memory in a relatively small tightly coupled multiprocessor (fewer than twenty pipeline processors). In this paper, we propose a possible memory access structure for a tightly coupled multiprocessor with a large number of pipeline processors (64 or 256) working directly on a shared memory.

Yvon Jégou | André Seznec

[1] Kevin P. McAuliffe, et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture, 1985, ICPP.

[2] Yvon Jégou, et al. Optimizing Memory Throughput in a Tightly Coupled Multiprocessor, 1987, ICPP.

[3] Yvon Jégou, et al. Data Synchronized Pipeline Architecture: Pipelining in Multiprocessor Environments, 1986, J. Parallel Distributed Comput.

[4] Yvon Jégou, et al. Synchronizing Processors Through Memory Requests in a Tightly Coupled Multiprocessor, 1988, ISCA '88.

[5] Ralph Grishman, et al. The NYU Ultracomputer - Designing an MIMD Shared Memory Parallel Computer, 1983, IEEE Transactions on Computers.

[6] A. Gottlieb, et al. The NYU Ultracomputer - Designing an MIMD Shared Memory Parallel Computer, 1983.

[7] Ron Cytron, et al. Doacross: Beyond Vectorization for Multiprocessors, 1986, ICPP.

[8] Daniel Gajski, et al. CEDAR: A Large Scale Multiprocessor, 1983, CARN.

[9] Y. Jégou, et al. Synchronizing Processors Through Memory Requests in a Tightly Coupled Multiprocessor, 1988, The 15th Annual International Symposium on Computer Architecture, Conference Proceedings.

[10] Yvon Jégou, et al. Address Synchronized Multiprocessor Architecture, 1986.