Running OpenMP applications efficiently on an everything-shared SDSM

Summary form only given. Traditional software distributed shared memory (SDSM) systems modify the semantics of a real hardware shared memory system by relaxing the coherence semantics and by limiting the memory regions that are actually shared. These modifications are made to improve the performance of the applications that use them. We show that an SDSM system that behaves like a real shared memory system (without the aforementioned relaxations) can also be used to execute OpenMP applications and achieve speedups similar to those obtained by traditional SDSM systems. This performance is achieved by encouraging cooperation between the SDSM and the OpenMP runtime instead of relaxing the semantics of the shared memory. In addition, techniques such as boundary alignment and page presend prove very useful for overcoming the limitations of current SDSM systems.
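To make the boundary-alignment idea concrete, the following minimal sketch (not taken from the paper; page size, array layout, and the chunking arithmetic are assumptions) partitions an OpenMP loop so that each thread's chunk of a shared array begins and ends on an SDSM page boundary, which keeps any given page written by only one node and avoids page-level false sharing.

/* Minimal sketch, assuming a 4 KiB SDSM page and a page-aligned shared array:
 * loop iterations are partitioned in whole pages so that no page is written
 * by more than one thread/node. Compile with an OpenMP-enabled compiler,
 * e.g. gcc -fopenmp. */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096                            /* assumed SDSM page size   */
#define N         (1 << 20)                       /* elements in shared array */
#define PER_PAGE  (PAGE_SIZE / sizeof(double))    /* elements per page        */

int main(void)
{
    /* Page-aligned allocation so element 0 starts a page. */
    double *a = aligned_alloc(PAGE_SIZE, N * sizeof(double));

    #pragma omp parallel
    {
        int tid      = omp_get_thread_num();
        int nthreads = omp_get_num_threads();

        /* Round the per-thread chunk up to a whole number of pages so every
         * chunk boundary coincides with a page boundary. */
        size_t pages       = (N + PER_PAGE - 1) / PER_PAGE;
        size_t pages_per_t = (pages + nthreads - 1) / nthreads;
        size_t lo = (size_t)tid * pages_per_t * PER_PAGE;
        size_t hi = ((size_t)tid + 1) * pages_per_t * PER_PAGE;
        if (hi > N) hi = N;

        for (size_t i = lo; i < hi; i++)
            a[i] = 2.0 * i;                       /* writes stay page-private */
    }

    printf("a[N-1] = %f\n", a[N - 1]);
    free(a);
    return 0;
}

A non-cooperating SDSM would have to detect and merge concurrent writes to shared pages at runtime; aligning the chunk boundaries up front, as sketched here, sidesteps that cost without weakening the shared-memory semantics.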
