Paper information - System architecture of parallel processing system -Harray-

This paper proposes -Harray-, a parallel processing system for scientific computations. Data flow computers are expected to achieve high performance because they can fully extract the parallelism in a program. However, they have many problems, such as the difficulty of controlling the sequence of execution. The -Harray- system is an array processor that adopts a two-level control mechanism, data flow execution within each processor and control flow between processors, in order to take full advantage of both mechanisms. A task assigned to a processor is called a “macro-block”. Three types of macro-blocking and three types of activation schemes, which initiate the execution of a macro-block, are introduced in order to attain high performance. Moreover, a hardware synchronization mechanism is used to reduce synchronization overhead and to achieve linear speedup. This paper presents the system architecture of the -Harray- system and its performance evaluation by software simulation.
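The two-level control idea in the abstract can be illustrated with a minimal sketch. It is not the -Harray- design: the abstract does not describe the three macro-blocking types, the activation schemes, or the hardware synchronization mechanism, so none of them are modeled here. The names `run_macro_block` and `run_program`, and the two example blocks, are hypothetical. Inside a macro-block, an operation fires as soon as its operands are available (data flow); between macro-blocks, execution follows an explicit order (control flow).

```python
def run_macro_block(nodes, inputs):
    """Data flow inside one macro-block: a node fires once all its operands exist."""
    values = dict(inputs)
    pending = dict(nodes)  # node name -> (function, operand names)
    fired = True
    while pending and fired:
        fired = False
        for name, (fn, operands) in list(pending.items()):
            if all(op in values for op in operands):
                values[name] = fn(*(values[op] for op in operands))
                del pending[name]
                fired = True
    if pending:
        raise ValueError(f"operands never arrived for: {sorted(pending)}")
    return values


def run_program(macro_blocks, initial):
    """Control flow between macro-blocks: each one is activated in a fixed order."""
    env = dict(initial)
    for nodes in macro_blocks:
        env = run_macro_block(nodes, env)
    return env


# Two hypothetical macro-blocks. The nodes of block_a are listed out of order
# on purpose: firing depends on operand availability, not on listing order.
block_a = {
    "t2": (lambda t1: t1 * 2, ("t1",)),
    "t1": (lambda x, y: x + y, ("x", "y")),
}
block_b = {"out": (lambda t2, x: t2 - x, ("t2", "x"))}

result = run_program([block_a, block_b], {"x": 3, "y": 4})
```

In this sketch the macro-blocks run one after another on a single thread. A real array processor would run them on separate processors and would need the kind of inter-processor synchronization that the abstract assigns to hardware.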
