Using a cluster as a memory resource: A fast and large virtual memory on MPI

A 64-bit OS provides an ample memory address space, which benefits applications that process large amounts of data. This paper proposes using a cluster as a memory resource for sequential applications that require a large amount of memory. The system extends our previously proposed socket-based Distributed Large Memory System (DLM), which offers large virtual memory by using remote memory distributed over the nodes of a cluster. The newly designed DLM is based on MPI (Message Passing Interface) for higher portability. MPI-based DLM provides fast and large virtual memory on widely available open clusters managed with an MPI batch queuing system. To access remote memory, it uses swap protocols suited to the available MPI thread support level. In experiments, it achieved remote memory bandwidths of 493 MB/s and 613 MB/s with the STREAM benchmark on 2.5 GB/s and 5 GB/s links (Myri-10G x2 and x4, respectively), and delivered high application performance on the NPB and Himeno benchmarks. Additionally, the system enables users unfamiliar with parallel programming to use a cluster.
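The idea of backing a sequential program's address space with memory held on other cluster nodes can be illustrated with a toy, single-process model. This is only a sketch under stated assumptions, not the DLM implementation: it uses no MPI, the "remote nodes" are plain Python dictionaries, and the page size, the striped page placement, the LRU eviction policy, and all names (`ToyRemoteMemory`, `demo`) are hypothetical choices made for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes per page; hypothetical value for illustration


class ToyRemoteMemory:
    """Toy model: a small local page cache backed by 'remote' node stores."""

    def __init__(self, local_pages, num_nodes):
        self.local_pages = local_pages
        self.local = OrderedDict()  # page_id -> bytearray, kept in LRU order
        # Dictionaries standing in for memory held on other cluster nodes.
        self.remote = [dict() for _ in range(num_nodes)]
        self.swap_ins = 0
        self.swap_outs = 0

    def _home(self, page_id):
        # Assumed placement rule: pages are striped across the nodes.
        return self.remote[page_id % len(self.remote)]

    def _fault(self, page_id):
        # Make room by writing the least recently used page to its home node.
        if len(self.local) >= self.local_pages:
            victim_id, victim = self.local.popitem(last=False)
            self._home(victim_id)[victim_id] = victim
            self.swap_outs += 1
        # Fetch the page from its home node, or zero-fill it on first touch.
        home = self._home(page_id)
        if page_id in home:
            page = home.pop(page_id)
            self.swap_ins += 1
        else:
            page = bytearray(PAGE_SIZE)
        self.local[page_id] = page
        return page

    def _page(self, addr):
        page_id = addr // PAGE_SIZE
        if page_id in self.local:
            self.local.move_to_end(page_id)  # mark as most recently used
            return self.local[page_id]
        return self._fault(page_id)

    def write(self, addr, value):
        self._page(addr)[addr % PAGE_SIZE] = value

    def read(self, addr):
        return self._page(addr)[addr % PAGE_SIZE]


def demo():
    # Two local page frames, three simulated remote nodes.
    mem = ToyRemoteMemory(local_pages=2, num_nodes=3)
    for i in range(8):
        mem.write(i * PAGE_SIZE, i + 1)
    values = [mem.read(i * PAGE_SIZE) for i in range(8)]
    return values, mem.swap_ins, mem.swap_outs
```

Calling `demo()` touches eight pages with only two local frames, so every access after the first two forces a swap. The program itself stays sequential and sees one flat address range, which is the property the abstract highlights for users unfamiliar with parallel programming.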
