BEAMFORMER USING PARALLELISM: ANALYSES AND EXPERIMENTS

This paper examines data-parallel implementations of a computationally intensive task: solving multiple quadratic forms (MQFs). Coupled and uncoupled parallel methods are investigated, where coupling refers to the degree of interaction among the processors. The impact of partitioning a large MQF problem into smaller non-interacting subtasks is also studied. Trade-offs among the implementations for various data-size/machine-size ratios are categorized in terms of complex arithmetic operation counts, communication overhead, and memory storage requirements. The effect of the mode of parallelism on performance is also considered, specifically SIMD versus MIMD versus mixed-mode SIMD/MIMD. The complexity analyses show that none of the algorithms presented is best for all data-size/machine-size ratios. Thus, to achieve scalability (i.e., good performance as the number of processors available in a machine increases), the proposed approach is to maintain a set of algorithms rather than a single one, and to select the most appropriate algorithm or combination of algorithms based on the ratio calculated from the scaled machine size. The analytical results have been verified by experiments on the MasPar MP-1 (SIMD), nCUBE 2 (MIMD), and PASM (mixed-mode) prototype.
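The uncoupled, partitioned approach and the ratio-based selection described above can be illustrated with a small sequential sketch. It rests on assumptions not stated in the abstract: an MQF is taken to mean evaluating x^H A x for many vectors x against one shared matrix A, the processors are simulated by a plain loop, and the names `quadratic_form`, `partition`, `mqf_uncoupled`, and `choose_method`, as well as the threshold of 1.0, are hypothetical and are not the paper's algorithms or selection rule.

```python
from typing import List, Sequence

Vector = Sequence[complex]
Matrix = Sequence[Sequence[complex]]


def quadratic_form(a: Matrix, x: Vector) -> complex:
    """Evaluate x^H A x for one vector x (assumed meaning of one quadratic form)."""
    n = len(x)
    total = 0j
    for i in range(n):
        row = sum(a[i][j] * x[j] for j in range(n))  # i-th entry of A x
        total += x[i].conjugate() * row
    return total


def partition(items: Sequence, p: int) -> List[Sequence]:
    """Split items into p contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(len(items), p)
    blocks, start = [], 0
    for k in range(p):
        size = base + (1 if k < extra else 0)
        blocks.append(items[start:start + size])
        start += size
    return blocks


def mqf_uncoupled(a: Matrix, xs: Sequence[Vector], p: int) -> List[complex]:
    """Simulate p non-interacting processors, each evaluating its own block."""
    results: List[complex] = []
    for block in partition(xs, p):  # each block would run on a separate processor
        results.extend(quadratic_form(a, x) for x in block)
    return results


def choose_method(num_forms: int, num_processors: int) -> str:
    """Hypothetical rule: pick a method from the data-size/machine-size ratio."""
    ratio = num_forms / num_processors
    # Assumption: with at least one form per processor, no interaction is needed.
    return "uncoupled" if ratio >= 1.0 else "coupled"
```

In this sketch, when there are fewer forms than processors the rule falls back to a coupled method, in which several processors would cooperate on a single form; that case is not implemented here. The point is only that the choice is made from the ratio rather than fixed in advance.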
