Toward high communication performance through compiled communications on a circuit switched interconnection network

This paper discusses a new interconnection network principle for massively parallel architectures in the field of numerical computation. The principle is motivated by an analysis of application features and by the need to design a new kind of communication network that combines very high bandwidth, very low latency, performance that is independent of the communication pattern and network load, and a performance improvement proportional to improvements in the underlying hardware. Our approach is to combine compiled communications with a circuit-switched interconnection network. This paper presents the motivations for this principle, the hardware and software issues, and the design of a first prototype. The expected performance is a sustained aggregate bandwidth of more than 500 GBytes/s and an overall latency of less than 270 ns for a large implementation (4K inputs) with currently available technology.
