The performance realities of massively parallel processors: a case study

The authors present the results of an architectural comparison of SIMD (single-instruction, multiple-data) massive parallelism, as implemented in the Thinking Machines Corp. CM-2, and vector or concurrent-vector processing, as implemented in the Cray Research Inc. Y-MP/8. The comparison is based primarily on three application codes taken from the Los Alamos National Laboratory (LANL) CM-2 workload. Tests were run by porting CM Fortran codes to the Y-MP, so that nearly the same level of optimization was obtained on both machines. The results for fully configured systems, using measured data rather than data scaled up from smaller configurations, show that the Y-MP/8 is faster than the 64K-processor CM-2 for all three codes. A simple model that accounts for the relative characteristic computational speeds of the two machines and for the reduction in overall CM-2 performance caused by communication or SIMD conditional execution accurately predicts the performance of two of the three codes. The authors show the similarity of the CM-2 and Y-MP programming models and comment on selected future massively parallel processor designs.
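A minimal sketch of such a degradation model is given below, assuming one plausible form (the abstract does not spell out the equation, and the function name and parameters are hypothetical): the effective CM-2 rate is its characteristic computational rate discounted by the fractions of execution lost to interprocessor communication and to masked-out SIMD conditional code.

# Hedged sketch in Python; an assumed form of a simple two-term degradation
# model, not necessarily the authors' exact equation.
def predicted_rate(characteristic_rate, frac_comm, frac_masked):
    # characteristic_rate: rate measured on pure data-parallel kernels (Gflop/s)
    # frac_comm: fraction of run time spent in communication
    # frac_masked: fraction of run time lost to SIMD conditional (masked) execution
    return characteristic_rate * (1.0 - frac_comm - frac_masked)

# Example: a 2.0 Gflop/s characteristic rate with 30% communication overhead
# and 20% masked conditional execution predicts an effective 1.0 Gflop/s.
print(predicted_rate(2.0, 0.30, 0.20))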
