GUESSTIMATE: a programming model for collaborative distributed systems

We present GUESSTIMATE, a new programming model for developing collaborative distributed systems. The model allows atomic, isolated operations that transform a system from one consistent state to another, and provides a shared transactional store for a collection of such operations executed by various machines in a distributed system. In addition to the "committed state", which is identical on all machines in the distributed system, GUESSTIMATE gives each machine a replicated local copy of the state (called the "guesstimated state") so that operations on shared state can be executed locally without blocking, while also guaranteeing that all machines eventually agree on the sequence of operations executed. Thus, each operation is executed multiple times: once at the time of issue, when it updates the guesstimated state of the issuing machine; once when the operation is committed (atomically) to the committed state of all machines; and several times in between as the guesstimated state converges toward the committed state. While we expect the results of these executions to be identical most of the time in the class of applications we study, an operation may succeed when it is first executed on the guesstimated state and fail when it is committed. GUESSTIMATE provides facilities that allow the programmer to deal with this potential discrepancy. This paper presents our programming model, its operational semantics, its realization as an API in C#, and our experience building collaborative distributed applications with this model.
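The interplay between guesstimated and committed state can be illustrated with a minimal single-process sketch in Python. It is not the GUESSTIMATE C# API: the names `Machine`, `issue`, `commit_round`, and `reserve` are hypothetical, there is no networking, and the global commit order (machine by machine, in issue order) is an assumption made only to keep the example deterministic. Each operation is assumed to be a deterministic function that either mutates the state and returns True, or leaves it untouched and returns False.

```python
import copy


class Machine:
    """One participant: holds a committed copy and a guesstimated copy of the state."""

    def __init__(self, initial_state):
        self.committed = copy.deepcopy(initial_state)
        self.guesstimated = copy.deepcopy(initial_state)
        self.pending = []  # operations issued locally but not yet committed

    def issue(self, op, on_commit=None):
        """Run op immediately on the guesstimated state, without blocking."""
        ok = op(self.guesstimated)
        if ok:
            # Only locally successful operations are queued for commit.
            self.pending.append((op, on_commit))
        return ok


def commit_round(machines):
    """Apply all pending operations to every committed state in one agreed order."""
    ordered = []
    for m in machines:  # assumed order: machine by machine, in issue order
        ordered.extend(m.pending)
        m.pending = []
    for op, on_commit in ordered:
        # Committed states are identical and ops are deterministic,
        # so every machine observes the same result.
        results = [op(m.committed) for m in machines]
        if on_commit is not None:
            on_commit(results[0])  # tell the issuer whether the commit succeeded
    for m in machines:
        # With nothing left pending, the guesstimate converges to the committed state.
        m.guesstimated = copy.deepcopy(m.committed)


def reserve(seat, user):
    """Hypothetical operation: claim a seat only if it is still free."""
    def op(state):
        if state.get(seat) is not None:
            return False  # fail without modifying the state
        state[seat] = user
        return True
    return op


a, b = Machine({"A1": None}), Machine({"A1": None})
outcomes = {}
local_a = a.issue(reserve("A1", "alice"), lambda ok: outcomes.update(alice=ok))
local_b = b.issue(reserve("A1", "bob"), lambda ok: outcomes.update(bob=ok))
commit_round([a, b])
```

In this run both reservations succeed locally, but only the first one in the commit order succeeds at commit time; the callback passed to `issue` is where the second issuer would learn of the discrepancy and react to it.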
