Parallel Algorithms and Subcube Embedding on a Hypercube
It is well known that the interconnection network of a hypercube multiprocessor is rich enough to allow a variety of topologies to be embedded within it. For a given problem, the best choice of topology is naturally the one that incurs the least communication and allows as many tasks as possible to execute in parallel. In a previous paper we proposed efficient parallel algorithms for performing QR factorization on a hypercube multiprocessor, where the hypercube network is configured as a two-dimensional subcube-grid with an aspect ratio chosen optimally for each problem. Given the very substantial net savings in execution time and storage obtained by performing QR factorization on an optimally configured subcube-grid, similar strategies are developed in this work to provide highly efficient implementations of three fundamental numerical algorithms: Gaussian elimination with partial pivoting, QR factorization with column pivoting, and multiple least squares updating. Timing results on Intel iPSC/2...
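To make the idea of configuring a hypercube as a two-dimensional grid concrete, the following is a minimal sketch of one standard way to embed a 2^a x 2^b grid in a hypercube of dimension d = a + b, using binary reflected Gray codes. It is an illustration under stated assumptions, not necessarily the mapping used in the paper: the function names (`gray`, `grid_to_node`, `grid_shapes`, `check_embedding`) are hypothetical, and the choice of an optimal aspect ratio for a given problem is not modeled here; `grid_shapes` only enumerates the candidate shapes.

```python
def gray(i: int) -> int:
    """Binary reflected Gray code of i (consecutive values differ in one bit)."""
    return i ^ (i >> 1)


def grid_to_node(row: int, col: int, col_bits: int) -> int:
    """Map grid position (row, col) to a hypercube node label.

    Assumption for illustration: the high-order bits encode the row and
    the low-order col_bits bits encode the column, each in Gray code.
    """
    return (gray(row) << col_bits) | gray(col)


def grid_shapes(dim: int) -> list[tuple[int, int]]:
    """All power-of-two grid shapes (rows, cols) that fill a dim-cube."""
    return [(1 << a, 1 << (dim - a)) for a in range(dim + 1)]


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two node labels differ."""
    return bin(a ^ b).count("1")


def check_embedding(row_bits: int, col_bits: int) -> bool:
    """Verify that grid neighbors (with wraparound) are hypercube neighbors
    and that every grid position receives a distinct node label."""
    rows, cols = 1 << row_bits, 1 << col_bits
    labels = set()
    for r in range(rows):
        for c in range(cols):
            node = grid_to_node(r, c, col_bits)
            labels.add(node)
            right = grid_to_node(r, (c + 1) % cols, col_bits)
            down = grid_to_node((r + 1) % rows, c, col_bits)
            # A dimension of extent 1 has no distinct neighbor to check.
            if cols > 1 and hamming(node, right) != 1:
                return False
            if rows > 1 and hamming(node, down) != 1:
                return False
    return len(labels) == rows * cols


if __name__ == "__main__":
    # Candidate aspect ratios for a 4-dimensional hypercube (16 nodes).
    for rows, cols in grid_shapes(4):
        print(f"{rows} x {cols}")
    print(check_embedding(2, 2))
```

Because each row and each column of such a grid occupies a subcube of the hypercube, communication along a row or column stays within that subcube, which is the property a subcube-grid configuration relies on.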