Experiences using OpenMP based on Compiler Directed Software DSM on a PC Cluster

In this work we report on our experiences running OpenMP programs on a commodity PC cluster equipped with a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message-passing counterparts and discuss the differences.
