Large-scale parallel processing systems

Abstract: Parallel processing is an area of growing interest to the computer science and engineering communities. This paper introduces some of the concepts involved in the design and use of large-scale parallel systems. It explores parallel machines composed of a large number of microprocessors, classified as SIMD (synchronous) or MIMD (asynchronous) systems. Parallel algorithms are examined, using image smoothing, recursive doubling, and contour tracing as examples. Single-stage and multistage interconnection networks are discussed: the single-stage Cube, PM2I, Four Nearest Neighbor, and Shuffle-Exchange networks are presented, and the multistage Cube network is described. Three microprocessor-based systems are given as case studies of parallel machine design: the MPP SIMD machine, the Ultracomputer MIMD system, and the PASM SIMD/MIMD machine.
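The single-stage networks named in the abstract are usually defined as functions on processing element (PE) addresses. The sketch below is illustrative only and is not taken from the paper: it assumes N PEs addressed 0 to N-1 with N a power of two (and a perfect square for the Four Nearest Neighbor case), and every function name is hypothetical. It also shows one common form of recursive doubling, in which each PE combines its partial sum with that of its Cube neighbor, so that all N values are summed in log2(N) steps.

```python
import math


def cube(i, p):
    """Cube_i: connect PE p to the PE whose address differs only in bit i."""
    return p ^ (1 << i)


def pm2i(i, p, n, sign=+1):
    """PM2I (plus-minus 2^i): connect PE p to PE (p +/- 2^i) mod n."""
    return (p + sign * (1 << i)) % n


def shuffle(p, m):
    """Perfect shuffle: rotate the m-bit address of PE p left by one bit."""
    msb = (p >> (m - 1)) & 1
    return ((p << 1) & ((1 << m) - 1)) | msb


def exchange(p):
    """Exchange: complement the low-order address bit."""
    return p ^ 1


def four_nearest_neighbors(p, n):
    """Four Nearest Neighbor: PE p connects to p+1, p-1, p+sqrt(n), p-sqrt(n), mod n."""
    side = math.isqrt(n)
    return [(p + 1) % n, (p - 1) % n, (p + side) % n, (p - side) % n]


def recursive_doubling_sum(values):
    """Sum n = 2^m values in m steps; afterwards every PE holds the total.

    Assumed formulation: at step i, each PE adds the partial sum held by
    its Cube_i neighbor to its own partial sum.
    """
    n = len(values)
    m = n.bit_length() - 1
    if n == 0 or (1 << m) != n:
        raise ValueError("number of values must be a power of two")
    partial = list(values)
    for i in range(m):
        # All PEs update simultaneously, as in one SIMD step.
        partial = [partial[p] + partial[cube(i, p)] for p in range(n)]
    return partial


if __name__ == "__main__":
    print(cube(2, 5))                      # neighbor of PE 5 across bit 2
    print(shuffle(0b011, 3))               # shuffle of PE 3 with 8 PEs
    print(recursive_doubling_sum([1, 2, 3, 4, 5, 6, 7, 8]))
```

The list comprehension models the synchronous behavior of an SIMD machine: every PE reads its neighbor's old value before any PE writes its new one.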
