Experimental analysis of communication/data-conditional aspects of a mixed-mode parallel architecture via synthetic computations

Experimentation aimed at determining the potential benefit of mixed-mode SIMD/MIMD (single instruction, multiple data/multiple instruction, multiple data) parallel architectures is reported. The experimentation is based on timing measurements made on the PASM (Partitionable SIMD/MIMD) system prototype utilizing carefully coded synthetic variations of a well-known algorithm. The synthetic algorithms used to measure and evaluate this system were based on bitonic sorting of sequences stored in the processing elements. This computation was mapped to both the SIMD and MIMD modes of parallelism, as well as to two hybrids of the SIMD and MIMD modes. The computations were coded in these four ways, and experiments were performed that explore the tradeoffs among them. The results of these experiments are presented and discussed, with special consideration of the effects of the system's architecture. The goal is to obtain implementation-independent analyses of the attributes of mixed-mode parallel processing with respect to the computational characteristics of the application being examined.
