Loose Synchronization for Large-Scale Networked Systems

Traditionally, synchronization barriers ensure that no cooperating process advances beyond a specified point until all processes have reached that point. In heterogeneous large-scale distributed computing environments, with unreliable network links and machines that may become overloaded and unresponsive, traditional barrier semantics are too strict to be effective for a range of emerging applications. In this paper, we explore several relaxations, and introduce a partial barrier, a synchronization primitive designed to enhance liveness in loosely coupled networked systems. Partial barriers are robust to variable network conditions; rather than attempting to hide the asynchrony inherent to wide-area settings, they enable appropriate application-level responses. We evaluate the improved performance of partial barriers by integrating them into three publicly available distributed applications running across PlanetLab. Further, we show how partial barriers simplify a re-implementation of MapReduce that targets wide-area environments.

[1]  G. K. Smelser The structure of the eye , 1961 .

[2]  Edsger W. Dijkstra,et al.  The structure of the “THE”-multiprogramming system , 1968, CACM.

[3]  Leslie Lamport,et al.  Time, clocks, and the ordering of events in a distributed system , 1978, CACM.

[4]  H. F. Jordan A Special Purpose Architecture for Finite Element Analysis , 1978 .

[5]  Kenneth P. Birman,et al.  Replication and fault-tolerance in the ISIS system , 1985, SOSP '85.

[6]  Kenneth P. Birman,et al.  Exploiting virtual synchrony in distributed systems , 1987, SOSP '87.

[7]  Rajiv Gupta The fuzzy barrier: a mechanism for high speed synchronization of processors , 1989, ASPLOS III.

[8]  Michael Williams,et al.  Replication in the harp file system , 1991, SOSP '91.

[9]  Richard A. Golding A Weak-Consistency Architecture for Distributed Information Services , 1992, Comput. Syst..

[10]  Al Geist,et al.  Network-based concurrent computing on the PVM system , 1992, Concurr. Pract. Exp..

[11]  Kenneth P. Birman,et al.  The process group approach to reliable distributed computing , 1992, CACM.

[12]  Brian N. Bershad,et al.  The Midway distributed shared memory system , 1993, Digest of Papers. Compcon Spring.

[13]  Message P Forum,et al.  MPI: A Message-Passing Interface Standard , 1994 .

[14]  Message Passing Interface Forum MPI: A message - passing interface standard , 1994 .

[15]  Alan L. Cox,et al.  TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems , 1994, USENIX Winter.

[16]  Marvin Theimer,et al.  Managing update conflicts in Bayou, a weakly connected replicated storage system , 1995, SOSP.

[17]  W. Daniel Hillis,et al.  The Network Architecture of the Connection Machine CM-5 , 1996, J. Parallel Distributed Comput..

[18]  Andrea C. Arpaci-Dusseau,et al.  Effective distributed scheduling of parallel workloads , 1996, SIGMETRICS '96.

[19]  Steven L. Scott,et al.  Synchronization and communication in the T3E multiprocessor , 1996, ASPLOS VII.

[20]  Synchronization and Communication in the T3E Multiprocessor , 1996, ASPLOS.

[21]  Louise E. Moser,et al.  Totem: a fault-tolerant multicast group communication system , 1996, CACM.

[22]  Bradley C. Kuszmaul,et al.  Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.

[23]  Chung-Ta King,et al.  Designing Tree-Based Barrier Synchronization on 2D Mesh Networks , 1998, IEEE Trans. Parallel Distributed Syst..

[24]  Eric A. Brewer,et al.  Harvest, yield, and scalable tolerant systems , 1999, Proceedings of the Seventh Workshop on Hot Topics in Operating Systems.

[25]  Michel Raynal,et al.  Timed consistency for shared distributed objects , 1999, PODC '99.

[26]  Amin Vahdat,et al.  Design and evaluation of a continuous consistency model for replicated services , 2000, OSDI.

[27]  Hee Yong Youn,et al.  Four-Ary Tree-Based Barrier Synchronization for 2D Meshes without Nonmember Involvement , 2001, IEEE Trans. Computers.

[28]  David E. Culler,et al.  SEDA: an architecture for well-conditioned, scalable internet services , 2001, SOSP.

[29]  Amin Vahdat,et al.  Design and evaluation of a conit-based continuous consistency model for replicated services , 2002, TOCS.

[30]  Steven Tuecke,et al.  The Physiology of the Grid An Open Grid Services Architecture for Distributed Systems Integration , 2002 .

[31]  Amin Vahdat,et al.  Bullet: high bandwidth data dissemination using an overlay mesh , 2003, SOSP '03.

[32]  Robbert van Renesse,et al.  Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining , 2003, TOCS.

[33]  David E. Culler,et al.  A blueprint for introducing disruptive technology into the Internet , 2003, CCRV.

[34]  M. Dahlin,et al.  A scalable distributed information management system , 2004, SIGCOMM '04.

[35]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[36]  Rajiv Gupta,et al.  A scalable implementation of barrier synchronization using an adaptive combining tree , 1990, International Journal of Parallel Programming.

[37]  Flaviu Cristian,et al.  Reaching agreement on processor-group membrship in synchronous distributed systems , 1991, Distributed Computing.

[38]  Hari Balakrishnan,et al.  Improving web availability for clients with MONET , 2005, NSDI.