Simultaneous regions: a framework for the consistent monitoring of distributed systems

A technique is presented by which state information can be organized into unified, consistent representations of the system state through the creation of simultaneous regions. This method provides a general yet efficient means of establishing the simultaneous relationship necessary for the monitoring and recognition of event occurrences. The types of events for which a computation can be monitored are described. Methods based on logical clocks and global snapshots are then presented, and the reasons why they are not appropriate for event evaluation are discussed. The technique for establishing simultaneous regions is then presented, and the behaviour of the monitoring and recognition protocol is examined in the context of specific monitoring examples. The correctness of the protocol is proved.
