Scalable parallel computers for real-time signal processing

We assess the state-of-the-art technology in massively parallel processors (MPPs) and their variations across different architectural platforms. Architectural and programming issues are identified in using MPPs for time-critical applications such as adaptive radar signal processing. We review the enabling technologies, including high-performance CPU chips, system interconnects, distributed memory architectures, and various latency-hiding mechanisms. We characterize the concept of scalability in three areas: resources, applications, and technology. Scalable performance attributes are analytically defined. We then compare MPPs with symmetric multiprocessors (SMPs) and clusters of workstations (COWs) to reveal their capabilities, limits, and effectiveness in signal processing. We evaluate the IBM SP2 at MHPCC, the Intel Paragon at SDSC, the Cray T3D at the Cray Eagan Center, and the Cray T3E and ASCI TeraFLOP system proposed by Intel. On the software and programming side, we evaluate existing parallel programming environments, including the models, languages, compilers, software tools, and operating systems. Some guidelines for program parallelization are provided. We examine data-parallel, shared-variable, message-passing, and implicit programming models. Communication functions and their performance overhead are discussed. Available software tools and communication libraries are also introduced.
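The scalable performance attributes mentioned above build on the two classical speedup laws the survey draws on: Amdahl's fixed-size speedup and Gustafson's fixed-time (scaled) speedup. A minimal sketch of both, assuming a single serial fraction of the workload (the function names and the 5% serial fraction are illustrative, not taken from the paper):

```python
def amdahl_speedup(n, serial_frac):
    """Fixed-size speedup: the workload stays constant as processors are added,
    so the serial fraction bounds speedup at 1/serial_frac."""
    return 1.0 / (serial_frac + (1.0 - serial_frac) / n)

def gustafson_speedup(n, serial_frac):
    """Fixed-time (scaled) speedup: the parallel part of the workload grows
    with n, so speedup grows nearly linearly in n."""
    return serial_frac + (1.0 - serial_frac) * n

# Compare the two laws for a 5% serial fraction.
for n in (1, 16, 256, 1024):
    print(f"n={n:5d}  Amdahl={amdahl_speedup(n, 0.05):7.2f}  "
          f"Gustafson={gustafson_speedup(n, 0.05):8.2f}")
```

The contrast motivates the memory-bounded speedup model cited in the paper: with 5% serial work, Amdahl's law caps speedup at 20 no matter how many processors are used, while scaling the problem with the machine keeps the parallel fraction dominant.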
