Synchronization and recovery of actions

We introduce an approach to robust computation in distributed systems. This approach is the foundation for reliability in the Clouds decentralized operating system. It is based on atomic actions operating on instances of abstract data types (objects). We present an event-based model of computation in which objects control the scheduling of responses to operation invocations. We discuss an integrated strategy for synchronization and recovery that uses relationships between the abstract states of objects to track dependencies between actions. Serializability is defined in terms of the semantics of operations. This permits non-serializable implementations to obtain high concurrency without deviating from serializable abstract behavior. We define a class of schedulers that allows objects to make autonomous scheduling decisions. We also show how non-serializable operation semantics can be used. Finally, we discuss implementation of the model, including action synchronization, object operation ordering using action-based counting semaphores, and action recovery.
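To make the idea of objects scheduling their own operations by operation semantics more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the Clouds implementation: the class `Counter`, its `COMMUTES` table, and the methods `invoke`, `commit`, and `abort` are all hypothetical names. The object admits an operation only if it commutes with every uncommitted operation held by other actions, and it recovers from an aborted action by applying inverse operations.

```python
from collections import defaultdict


class Counter:
    """Hypothetical object that makes its own scheduling decisions.

    Operations from different actions may interleave only when the
    compatibility table says they commute (an assumed, simplified rule).
    """

    # (uncommitted operation, requested operation) -> may they interleave?
    COMMUTES = {
        ("add", "add"): True,    # additions commute with each other
        ("add", "read"): False,  # a read would observe uncommitted adds
        ("read", "add"): False,  # an add would invalidate an earlier read
        ("read", "read"): True,
    }

    def __init__(self):
        self.value = 0
        self.pending = defaultdict(list)  # action id -> [(op, arg), ...]

    def _admissible(self, action, op):
        # Check the request against every operation held by OTHER actions.
        return all(
            self.COMMUTES[(held_op, op)]
            for other, ops in self.pending.items()
            if other != action
            for held_op, _ in ops
        )

    def invoke(self, action, op, arg=0):
        """Return (granted, result); a refused caller must wait or retry."""
        if not self._admissible(action, op):
            return (False, None)
        self.pending[action].append((op, arg))
        if op == "add":
            self.value += arg
            return (True, None)
        return (True, self.value)  # op == "read"

    def commit(self, action):
        # Effects are already applied; just release the action's operations.
        self.pending.pop(action, None)

    def abort(self, action):
        # Undo by inverse operations; safe because admitted adds commute.
        for op, arg in reversed(self.pending.pop(action, [])):
            if op == "add":
                self.value -= arg
```

In this sketch, two actions can add to the same counter concurrently even though the low-level interleaving is not serial, because the abstract result is the same in either order. A read from a third action is refused until the adds commit or abort.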
