System architecture of parallel processing system -Harray-

This paper proposes a parallel processing system, -Harray-, for scientific computations. Data flow computers are expected to achieve high performance because they can fully extract the parallelism in a program. However, they have many problems, such as the difficulty of controlling the sequence of execution. The -Harray- system is an array processor that adopts a two-level control mechanism, with data flow execution within each processor and control flow between processors, in order to take full advantage of both mechanisms. A task assigned to a processor is called a “macro-block”. Three types of macro-blocking and three types of activation schemes, which initiate the execution of a macro-block, are introduced in order to attain high performance. Moreover, a hardware synchronization mechanism is used to reduce synchronization overhead and to achieve linear speedup on the -Harray- system. This paper presents the system architecture of the -Harray- system and its performance evaluation by software simulation.
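
The two-level control idea can be illustrated with a small toy model. The sketch below is only an illustration under stated assumptions, not the -Harray- execution mechanism itself: the names `run_macro_block` and `run_program`, the dictionary representation of a macro-block, and the fixed activation order are all hypothetical. Inside a macro-block, an operation fires as soon as its operands are available (data flow); between macro-blocks, activation follows an explicit sequence (control flow).

```python
import operator


def run_macro_block(nodes, inputs):
    """Data flow execution inside one macro-block (toy model).

    nodes:  dict mapping a result name to (function, [operand names])
    inputs: dict of values already available when the block is activated
    A node fires as soon as all of its operands hold a value, regardless
    of the order in which the nodes are listed.
    """
    values = dict(inputs)
    pending = dict(nodes)
    fired = []
    while pending:
        ready = [name for name, (_, ops) in pending.items()
                 if all(op in values for op in ops)]
        if not ready:
            raise ValueError("deadlock: some operands never arrive")
        for name in ready:
            fn, ops = pending.pop(name)
            values[name] = fn(*(values[op] for op in ops))
            fired.append(name)
    return values, fired


def run_program(macro_blocks, order, inputs):
    """Control flow between macro-blocks: activate them in an explicit order."""
    values = dict(inputs)
    for block_name in order:
        values, _ = run_macro_block(macro_blocks[block_name], values)
    return values


# Hypothetical program: block "A" computes t3 = (a + b) - (a * b),
# block "B" then doubles it. Nodes in "A" are deliberately listed out of order.
macro_blocks = {
    "A": {
        "t3": (operator.sub, ["t1", "t2"]),
        "t1": (operator.add, ["a", "b"]),
        "t2": (operator.mul, ["a", "b"]),
    },
    "B": {
        "r": (operator.mul, ["t3", "two"]),
    },
}
result = run_program(macro_blocks, ["A", "B"], {"a": 3, "b": 4, "two": 2})
print(result["r"])
```

In this toy, `t1` and `t2` are independent and could fire in parallel, while block `B` cannot start until block `A` has completed. That mirrors, in a very simplified way, the split between operand-driven firing inside a processor and sequenced activation between processors.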
