Optimized Distributed Data Sharing Substrate in Multi-core Commodity Clusters: A Comprehensive Study with Applications

Distributed applications tend to have complex designs because of issues such as concurrency, synchronization, and communication. Researchers have proposed simpler abstractions to hide these complexities. However, many of the proposed techniques rely on messaging protocols that incur high overhead and do not scale well. To address these limitations, in our previous work [20], we proposed an efficient Distributed Data Sharing Substrate (DDSS) that uses the features of high-speed networks. In this paper, we propose several design optimizations for DDSS on multi-core systems: a combination of shared memory and message queues for inter-process communication, and a dedicated thread for communication progress and for onloading DDSS operations such as get and put. Our micro-benchmark results show very low latency for DDSS operations and demonstrate that DDSS scales with an increasing number of processes. Application evaluations with R-Tree query processing, B-Tree query processing, and distributed STORM show improvements of up to 56%, 45%, and 44%, respectively, compared to traditional implementations. Evaluations of application checkpointing using DDSS demonstrate scalability with an increasing number of checkpointing applications. Further, our evaluations demonstrate the portability of DDSS across multiple modern interconnects, including InfiniBand and iWARP-capable 10-Gigabit Ethernet networks (applicable to both LAN and WAN environments).
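The idea of onloading get and put to a dedicated thread fed by a message queue can be illustrated with a minimal single-process sketch. This is not the DDSS implementation or its API: the class `ToySubstrate`, its methods, and the in-memory dictionary standing in for a shared data region are all hypothetical, and Python threads and `queue.Queue` stand in for the inter-process shared memory and message queues described above.

```python
import queue
import threading


class ToySubstrate:
    """Hypothetical stand-in for a data sharing substrate; not the DDSS API."""

    def __init__(self):
        self._store = {}                # assumption: stands in for the shared data region
        self._requests = queue.Queue()  # assumption: stands in for the message queue
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def _serve(self):
        # Dedicated thread: the only code path that touches the store,
        # so callers never need their own locking.
        while True:
            op, key, value, reply = self._requests.get()
            if op == "stop":
                reply.put(None)
                return
            if op == "put":
                self._store[key] = value
                reply.put(True)
            else:  # "get"
                reply.put(self._store.get(key))

    def _call(self, op, key=None, value=None):
        # Post a request, then block until the dedicated thread replies.
        reply = queue.Queue(maxsize=1)
        self._requests.put((op, key, value, reply))
        return reply.get()

    def put(self, key, value):
        return self._call("put", key, value)

    def get(self, key):
        return self._call("get", key)

    def close(self):
        self._call("stop")
        self._worker.join()
```

A caller would create one `ToySubstrate`, issue `put("node", b"payload")` and `get("node")` from any number of threads, and call `close()` when done. Funneling every operation through one service thread keeps the store free of per-caller synchronization, at the cost of one queue round trip per operation.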
