Evaluation of RDMA Opportunities in an Object-Oriented DSM

Remote Direct Memory Access (RDMA) is a technology for updating a remote machine's memory without intervention on the receiver side. We evaluate where RDMA can be usefully applied in Object-Oriented DSM (OO-DSM) systems and where it results in a loss. RDMA is difficult to use in modern OO-DSMs because of their support for large address spaces, advanced protocols, and heterogeneity. First, an object-based communication pattern reduces the applicability of bulk RDMA. Second, large address spaces (far larger than that of a single machine) and large numbers of machines require an address space translation scheme, because the same object may be mapped at different addresses on different machines. Finally, without polling (which would require source code modifications), incoming RDMA messages are hard to notice in time. Our results show that even with RDMA, update protocols are slower than invalidation protocols. However, RDMA can be successfully applied to fetching objects in an invalidation protocol, where it improves performance by 20.6%.
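To illustrate the address space translation problem, the following is a minimal sketch that simulates it in plain Python; it does not use a real RDMA library and is not the scheme evaluated here. The names `Node`, `allocate`, and `rdma_write` are hypothetical. Each node keeps a table from a global object ID to a local offset, and the sender must translate the ID into the receiver's offset before it can write into the receiver's memory.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One machine in a hypothetical DSM, with its own local memory."""
    name: str
    memory: bytearray = field(default_factory=lambda: bytearray(64))
    # Translation table: global object ID -> (local offset, size).
    table: dict = field(default_factory=dict)
    next_free: int = 0

    def allocate(self, oid: int, size: int) -> int:
        """Reserve `size` bytes for object `oid` and record its local offset."""
        if self.next_free + size > len(self.memory):
            raise MemoryError(f"{self.name}: out of local memory")
        offset = self.next_free
        self.table[oid] = (offset, size)
        self.next_free += size
        return offset


def rdma_write(src: Node, dst: Node, oid: int) -> None:
    """Simulate a one-sided write: copy the object's bytes from src into
    dst's memory. The offsets generally differ, so the sender needs the
    destination's translation entry; no method of dst runs here."""
    s_off, size = src.table[oid]
    d_off, d_size = dst.table[oid]
    if size != d_size:
        raise ValueError("object size differs between nodes")
    dst.memory[d_off:d_off + size] = src.memory[s_off:s_off + size]
```

In this sketch, if node `b` allocates another object first, object 7 lives at offset 0 on `a` but at a later offset on `b`, so a raw address from `a` would be meaningless on `b`. A real system would also have to distribute or cache the remote table entries, which this simulation leaves out.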
