A sorter-based architecture for a parallel implementation of communication-intensive algorithms

This paper deals with the parallel execution of algorithms with global and/or irregular data dependencies on a regularly and locally connected processor array. The associated communication problems are solved with a two-dimensional sorting algorithm. The proposed architecture, which is based on a two-dimensional sorting network, offers a high degree of flexibility and allows an efficient mapping of many irregularly structured algorithms. In this architecture, a one-dimensional processor array performs all required control and arithmetic operations, whereas the sorter solves the complex data transfer problems. The storage capability of the sorting network is also used as memory for data elements. Mappings of algorithms for sparse matrix computations, the fast Fourier transform, and the convex hull problem onto this architecture, as well as the simulation of a shared-memory computer, show that the utilization of the most complex components, the processors, is O(1).
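
The idea that a sorter can solve an irregular data transfer problem can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `route_by_sorting`, the packet format `(destination, payload)`, and the use of Python's built-in sort as a stand-in for the two-dimensional sorting network are all assumptions made for illustration. Each packet is tagged with the index of the processor that should receive it; after padding with dummy packets and sorting by destination, the packet at position i is the one addressed to processor i.

```python
def route_by_sorting(packets, num_processors):
    """Deliver (destination, payload) packets by sorting on the destination.

    Hypothetical helper for illustration only. Assumes at most one packet
    per destination and destinations in range(num_processors).
    Returns a list whose entry i is the payload for processor i,
    or None if no packet was addressed to it.
    """
    destinations = [dest for dest, _ in packets]
    if len(set(destinations)) != len(destinations):
        raise ValueError("at most one packet per destination is supported")
    if any(not 0 <= dest < num_processors for dest in destinations):
        raise ValueError("destination out of range")

    # Pad with dummy packets so that every processor index occurs exactly once.
    present = set(destinations)
    padded = list(packets) + [
        (p, None) for p in range(num_processors) if p not in present
    ]

    # Stand-in for the sorting network: after sorting by destination,
    # position i holds the packet addressed to processor i.
    padded.sort(key=lambda packet: packet[0])
    return [payload for _, payload in padded]


if __name__ == "__main__":
    # Three packets with an irregular communication pattern on 4 processors.
    print(route_by_sorting([(2, "a"), (0, "b"), (3, "c")], 4))
```

In this sketch the processors only generate destination tags and consume the delivered payloads; all data movement happens inside the sort, which mirrors the division of labor between the processor array and the sorter described above.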
