HIGH-PERFORMANCE ALGORITHMS AND APPLICATIONS FOR SMP CLUSTERS

The future of high-performance computing relies on the efficient and scalable use of clusters with symmetric multiprocessor (SMP) nodes and low-latency, high-bandwidth interconnection networks. Current examples of such platforms include Sun Ultra HPC machines, Compaq AlphaServers with Quadrics switches, SGI Origins, and the IBM SP system with SMP nodes. Moreover, the future of NASA mission-critical computing for computational aerosciences relies on the success of computational clusters (e.g., SMP Linux clusters at Goddard Space Flight Center, and large SGI Origin arrays at Ames Research Center). Hardware benchmarks show impressive performance rates for each individual component; however, few applications on SMP clusters attain more than a small fraction of these peak speeds. While methodologies for symmetric multiprocessors (e.g., OpenMP [21] or POSIX threads [22]) and message-passing primitives for clusters (e.g., MPI [20]) are well developed, achieving high performance on SMP clusters requires a hybrid of the two. We present preliminary results for our complexity model and programming methodology, both of which are built hierarchically from realistic model components for message-passing and for symmetric multiprocessor parallel architectures. The current deployment of teraflops systems and the future development of petaflops systems will certainly require the exploitation of similar hybrid programming models.
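As an illustration of the hierarchical idea only, the following minimal sketch computes a sum in two levels: threads cooperate on a block within one "node" (the shared-memory level), and one partial result per node is then combined (the message-passing level). It is not the authors' methodology or API. The names `split`, `node_sum`, and `hybrid_sum` are hypothetical, nodes are simulated inside a single process, and the final loop merely stands in for an inter-node reduction such as `MPI_Reduce`. Python threads are used here to show the structure, not to demonstrate speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Split items into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def node_sum(block, threads_per_node):
    """Shared-memory level: threads on one node each sum a slice of the block."""
    with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
        return sum(pool.map(sum, split(block, threads_per_node)))


def hybrid_sum(data, nodes, threads_per_node):
    """Message-passing level: combine one partial result per simulated node.

    Assumption: the list comprehension stands in for work done on separate
    nodes, and the final sum stands in for an inter-node reduction.
    """
    partials = [node_sum(block, threads_per_node) for block in split(data, nodes)]
    return sum(partials)


if __name__ == "__main__":
    print(hybrid_sum(list(range(1000)), nodes=4, threads_per_node=2))
```

The point of the two-level structure is that only one value per node crosses the (simulated) network, while the finer-grained combining stays inside shared memory.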

[1]  David A. Bader,et al.  Practical parallel algorithms for dynamic data redistribution, median finding, and selection , 1995, Proceedings of International Conference on Parallel Processing.

[2]  Leslie G. Valiant,et al.  A bridging model for parallel computation , 1990, CACM.

[3]  OpenMP: A Proposed Industry Standard API for Shared Memory Programming , 2022 .

[4]  David A. Bader,et al.  A new deterministic parallel sorting algorithm with an experimental evaluation , 1998, JEAL.

[5]  Ramesh Subramonian,et al.  LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.

[6]  Dennis J. Volper,et al.  Geometric retrieval in parallel , 1988 .

[7]  David A. Bader,et al.  Practical parallel algorithms for personalized communication and integer sorting , 1996 .

[8]  S.N.V. Kalluri,et al.  A hierarchical data archiving and processing system to generate custom tailored products from AVHRR data , 1999, IEEE 1999 International Geoscience and Remote Sensing Symposium. IGARSS'99 (Cat. No.99CH36293).

[9]  David A. Bader,et al.  A Randomized Parallel Sorting Algorithm with an Experimental Study , 1998, J. Parallel Distributed Comput.

[10]  Maurice Yarrow,et al.  The NAS Parallel Benchmarks 2.1 Results , 1996 .

[11]  David A. Bader,et al.  Practical parallel algorithms for personalized communication and integer sorting , 1996, JEAL.

[12]  S. Sitharama Iyengar,et al.  Introduction to parallel algorithms , 1998, Wiley series on parallel and distributed computing.

[13]  David A. Bader,et al.  High Performance Computing Applications in Remote Sensing Studies for Land Cover Dynamics , 1998 .

[14]  David A. Bader,et al.  Design and analysis of the Alliance/University of New Mexico Roadrunner Linux SMP SuperCluster , 1999, ICWC 99. IEEE Computer Society International Workshop on Cluster Computing.

[15]  David A. Bader,et al.  SIMPLE: A Methodology for Programming High Performance Algorithms on Clusters of Symmetric Multiprocessors (SMPs) , 1998, J. Parallel Distributed Comput.

[16]  Message Passing Interface Forum.  MPI: A message-passing interface standard , 1994 .

[17]  David A. Bader,et al.  Kronos: A Java-Based Software System for the Processing and Retrieval of Large Scale AVHRR Data Sets , 1999 .

[18]  Alan L. Cox,et al.  TreadMarks: shared memory computing on networks of workstations , 1996 .

[19]  Tom Goodale,et al.  The Cactus computational collaboratory: enabling technologies for relativistic astrophysics, and a toolkit for solving PDE's by communities in science and engineering , 1999, Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation.

[20]  Ian T. Foster,et al.  Globus: a Metacomputing Infrastructure Toolkit , 1997, Int. J. High Perform. Comput. Appl.

[21]  Joseph JáJá,et al.  An Introduction to Parallel Algorithms , 1992 .

[22]  David A. Bader.  A Practical Parallel Algorithm for Cycle Detection in Partitioned Digraphs , 1999 .

[23]  IEEE.  Information Technology-Portable Operating System Interface , 1990 .

[24]  A. Agarwal,et al.  MGS: A Multigrain Shared Memory System , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[25]  David E. Culler,et al.  Multi Protocol Active Messages on a Cluster of SMP , 1997, ACM/IEEE SC 1997 Conference (SC'97).

[26]  E. L. Lusk,et al.  A taxonomy of programming models for symmetric multiprocessors and SMP clusters , 1995, Programming Models for Massively Parallel Computers.

[27]  Larry S. Davis,et al.  Parallel algorithms for image enhancement and segmentation by region growing, with an experimental study , 1996, Proceedings of International Conference on Parallel Processing.

[28]  David A. Bader,et al.  Parallel Algorithms for Image Histogramming and Connected Components with an Experimental Study , 1996, J. Parallel Distributed Comput.