Transposition of Banded Matrices in Hypercubes: A Nearly Isotropic Task

Abstract: A class of communication tasks, called isotropic, was introduced in [15], and minimum-completion-time algorithms for all tasks in this class were found. Isotropic tasks are characterized by a type of symmetry with respect to the origin node. In this paper we consider the problem of transposing a sparse matrix of size $N \times N$ with a diagonal band of size $2^{\beta+1} + 1$, which is stored by columns in a hypercube network of $N = 2^d$ processors. We propose an assignment of matrix columns to hypercube nodes such that the transposition becomes a 'nearly isotropic' task, that is, it looks 'almost identical' to all nodes. Under this assignment, we give an algorithm that transposes the matrix in $2^{\beta}$ steps. We prove that this algorithm is optimal over all affine assignments of columns to processors. We also derive a lower bound on the minimum number of steps required to transpose a banded matrix, which holds for any possible assignment of matrix columns to hypercube processors. In the case that $2^{\beta+1} + 1 = \Theta(N^c)$ for some constant $c \in (0, 1]$, we prove that the completion time of our transposition algorithm is of the same order of magnitude as the lower bound. We further show that $\lfloor d/\beta \rfloor$ banded matrices, each of bandwidth $2^{\beta+1} + 1$, can be stored by columns in a hypercube so that all of them can be transposed concurrently in $2^{\beta+1}$ steps. Finally, we modify our algorithms so that they apply to arbitrary matrix bandwidths and to the storage of multiple columns by each processor, while maintaining their efficiency.
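To make the communication task concrete, the following Python sketch enumerates the messages that a banded transposition generates when each processor holds one column. It is only an illustration under stated assumptions, not the algorithm or the column assignment of the paper: the band is taken to be the entries with $|i - j| \le 2^{\beta}$ without wraparound, and the two assignments compared (identity and binary reflected Gray code) are hypothetical stand-ins chosen to show that the assignment changes how far messages travel. The names `transposition_messages`, `identity`, and `gray` are introduced here for illustration.

```python
def identity(col):
    # Hypothetical assignment: column k is stored at node k.
    return col


def gray(col):
    # Hypothetical assignment: column k is stored at the node whose
    # label is the binary reflected Gray code of k.
    return col ^ (col >> 1)


def transposition_messages(d, beta, assign=identity):
    """Return (src_node, dst_node, hamming_distance) for every off-diagonal
    band entry of an N x N matrix, N = 2**d, stored one column per node.

    Entry (i, j) lives with column j and, after transposition, must live
    with column i, so it travels from assign(j) to assign(i).
    Assumption: the band holds entries with |i - j| <= 2**beta, no wraparound.
    """
    n = 1 << d
    half = 1 << beta
    msgs = []
    for j in range(n):
        for i in range(max(0, j - half), min(n, j + half + 1)):
            if i == j:
                continue  # diagonal entries do not move
            src, dst = assign(j), assign(i)
            # Hamming distance = number of hypercube links on a shortest path
            msgs.append((src, dst, bin(src ^ dst).count("1")))
    return msgs


if __name__ == "__main__":
    for name, assign in (("identity", identity), ("gray", gray)):
        msgs = transposition_messages(d=3, beta=1, assign=assign)
        print(name, len(msgs), max(h for _, _, h in msgs))
```

The number of messages is the same under both assignments; only the distances each message must cover differ, which is the kind of effect the choice of column assignment controls.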
