A Transparent Distributed Shared Memory for Clustered Symmetric Multiprocessors

A transparent distributed shared memory (DSM) system must achieve complete transparency in data distribution, workload distribution, and reconfiguration. Transparency of data distribution lets programmers access and allocate shared data through the same user interface used in shared-memory systems. Transparency of workload distribution and reconfiguration can optimize parallelism at both the user level and the kernel level, and can also improve the efficiency of run-time reconfiguration. In this paper, a transparent DSM system called Teamster is proposed and implemented for clustered symmetric multiprocessors (SMPs). With the transparency Teamster provides, programmers can exploit the full computing power of the clustered SMP nodes in the same way as they would on a single SMP computer. Compared with results reported in previous research, Teamster achieves transparent cluster computing while delivering satisfactory system performance.
