Multiphase Complete Exchange: A Theoretical Analysis

Complete exchange requires each of N processors to send a unique message to each of the remaining N-1 processors. For a circuit-switched hypercube with N = 2^d processors, the direct and standard algorithms for complete exchange are time optimal for very large and very small message sizes, respectively. For intermediate sizes, a hybrid multiphase algorithm is better. This carries out direct exchanges on a set of subcubes whose dimensions are a partition of the integer d. The best such algorithm for a given message size m could hitherto only be found by enumerating all partitions of d. The multiphase algorithm is analyzed assuming a high-performance communication network. It is proved that only algorithms corresponding to equipartitions of d (partitions in which the maximum and minimum elements differ by at most one) can possibly be optimal. The runtimes of these algorithms plotted against m form a hull of optimality. It is proved that, although there is an exponential number of partitions: (1) the number of faces on this hull is Θ(√d); (2) the hull can be found in Θ(√d) time; and (3) once it has been found, the optimal algorithm for any given m can be found in Θ(log d) time. These results provide a very fast technique for minimizing communication overhead in many important applications, such as matrix transpose, fast Fourier transform, and alternating directions implicit (ADI).
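
To make the notion of an equipartition concrete, the following Python sketch enumerates the equipartitions of d and compares their count with the total number of partitions of d. It assumes only the definition given above (largest and smallest parts differ by at most one); the function names are illustrative and not taken from the paper, and no cost model is included because the abstract does not state one.

```python
def equipartition(d, k):
    """Return the partition of d into k parts whose sizes differ by at most one.

    Hypothetical helper: r parts of size q + 1 and k - r parts of size q,
    where q, r = divmod(d, k). Parts are listed in non-increasing order.
    """
    if not 1 <= k <= d:
        raise ValueError("k must satisfy 1 <= k <= d")
    q, r = divmod(d, k)
    return [q + 1] * r + [q] * (k - r)


def equipartitions(d):
    """List the equipartition of d for every possible number of parts k."""
    return [equipartition(d, k) for k in range(1, d + 1)]


def count_partitions(d):
    """Count all integer partitions of d with a standard dynamic program."""
    ways = [1] + [0] * d  # ways[t] = number of partitions of t found so far
    for part in range(1, d + 1):
        for total in range(part, d + 1):
            ways[total] += ways[total - part]
    return ways[d]


if __name__ == "__main__":
    d = 10
    for p in equipartitions(d):
        print(p)
    print(len(equipartitions(d)), "equipartitions vs",
          count_partitions(d), "partitions of", d)
```

Under this definition there is exactly one equipartition for each number of parts k, so there are only d candidates in total. The single-part case [d] is one direct exchange on the whole cube, and the all-ones case presumably corresponds to the standard algorithm.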
