Both conventional wisdom and engineering practice hold that a massively parallel MIMD machine should be constructed using a large number of independent processors and an asynchronous interconnection network. In this paper, we suggest that it may be beneficial to implement a massively parallel MIMD using microcode on a massively parallel SIMD microengine; the synchronous nature of the system allows much higher performance to be obtained with simpler hardware. The primary disadvantage is that the SIMD microengine must serialize execution of different types of instructions, but the static nature of the machine allows various optimizations that can minimize this detrimental effect. In addition to presenting the theory behind construction of efficient MIMD machines using SIMD microengines, this paper discusses how the techniques were applied to create a 16,384-processor shared-memory barrier MIMD using a SIMD MasPar MP-1. Both the MIMD structure and benchmark results are presented. Even though the MasPar hardware is not ideal for implementing a MIMD and our microinterpreter was written in a high-level language (MPL), peak MIMD performance was 280 MFLOPS, compared to 1.2 GFLOPS for the native SIMD instruction set. Of course, comparing peak speeds is of dubious value; hence, we have also included a number of more realistic benchmark results.
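The serialization cost mentioned above can be illustrated with a small sketch. It is not the MPL microinterpreter described in the paper; it is a hypothetical Python model in which each processing element (PE) holds its own program counter and accumulator, and a single control loop broadcasts one opcode at a time while enabling only the PEs whose current instruction matches. The opcode names (`load`, `add`, `mul`, `halt`), the `run_mimd_on_simd` function, and the `broadcasts` counter are all assumptions made for illustration.

```python
# Hypothetical model of MIMD emulation on a SIMD engine (not the paper's code).
# Each PE runs its own program; the control unit can only issue one opcode
# type at a time, so distinct opcodes in the same cycle are serialized.

# Semantics of each (assumed) opcode: new_acc = handler(old_acc, operand).
HANDLERS = {
    "load": lambda acc, arg: arg,
    "add": lambda acc, arg: acc + arg,
    "mul": lambda acc, arg: acc * arg,
}


def run_mimd_on_simd(programs):
    """Interpret one program per PE; every program must end with ("halt", 0).

    Returns (accumulators, broadcasts), where broadcasts counts how many
    opcode types the control unit had to issue in total.
    """
    n = len(programs)
    pc = [0] * n          # per-PE program counter
    acc = [0] * n         # per-PE accumulator
    active = [True] * n   # PEs that have not yet halted
    broadcasts = 0

    while any(active):
        # Fetch: every active PE reads its own next instruction.
        fetched = [programs[i][pc[i]] if active[i] else None for i in range(n)]

        # PEs that fetched "halt" drop out of the computation.
        for i, ins in enumerate(fetched):
            if ins is not None and ins[0] == "halt":
                active[i] = False

        # Serialize: one broadcast per distinct opcode present this cycle.
        present = {fetched[i][0] for i in range(n) if active[i]}
        for op in sorted(present):
            broadcasts += 1
            for i in range(n):
                # Only PEs whose instruction matches the broadcast are enabled.
                if active[i] and fetched[i][0] == op:
                    acc[i] = HANDLERS[op](acc[i], fetched[i][1])
                    pc[i] += 1

    return acc, broadcasts


if __name__ == "__main__":
    demo = [
        [("load", 2), ("add", 3), ("halt", 0)],
        [("load", 5), ("mul", 4), ("halt", 0)],
        [("load", 1), ("add", 1), ("add", 1), ("halt", 0)],
    ]
    print(run_mimd_on_simd(demo))
```

In this toy model, a cycle in which all PEs execute the same opcode costs one broadcast, while a cycle mixing `add` and `mul` costs two. That is the overhead the abstract refers to, and reducing the number of distinct opcode types issued per cycle is the kind of optimization it alludes to.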