Paper information - Critical performance path analysis, and efficient code generation issues, for the Seamless architecture

An analytical study of potential pathological performance areas of the Seamless architecture is presented. Seamless is a latency-tolerant, distributed-memory multiprocessor architecture. A key component of the Seamless philosophy, however, is the use of standard, commodity components for a large part of the system. The unavoidable implementation compromises imposed by this decision are discussed, followed by a summary of some optimistic performance studies. An analytical study that parameterizes and predicts the worst-case impact of using standard components is then provided. Finally, it is shown that these bottlenecks are manageable through careful generation of target machine code, so that the optimistic performance studies become realistic expectations for a range of program behaviors and granularities.
