Combination of clock-state and clock-rate correction in fault-tolerant distributed systems

This paper proposes integrating internal and external clock synchronization by combining a fault-tolerant distributed algorithm for clock state correction with a central algorithm for clock rate correction. Hardware and simulation experiments show that this combination improves the precision of the global time base in a distributed single-cluster system while reducing the need for high-quality oscillators. Simulation results show that the rate-correction algorithm not only contributes to internal clock synchronization in a single-cluster system but can also be used for external clock synchronization of a multi-cluster system with a reference clock. Deploying the rate-correction algorithm therefore integrates internal and external clock synchronization in one mechanism. Experimental results further show that a failure in the clock rate correction does not hinder the distributed fault-tolerant clock state synchronization algorithm, because state correction operates independently of rate correction. The paper introduces the new algorithms, presents experimental results on the precision improvements measured in a time-triggered system, and reports simulation results for single-cluster and multi-cluster configurations.
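
The interplay of the two corrections can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not name the convergence function or the rate-correction rule, so the sketch assumes a fault-tolerant average (discard the k largest and k smallest deviations, average the rest) for state correction and a hypothetical central rate master that measures each node's rate against a reference and damps it with a fixed gain. All names and parameters (`fault_tolerant_average`, `simulate`, `gain`, `period`, the drift values) are illustrative assumptions, and message jitter and faulty nodes are left out.

```python
def fault_tolerant_average(deviations, k=1):
    """Assumed convergence function: drop the k largest and k smallest
    deviations and average the remainder."""
    if len(deviations) <= 2 * k:
        raise ValueError("need more than 2*k deviation values")
    trimmed = sorted(deviations)[k:len(deviations) - k]
    return sum(trimmed) / len(trimmed)


def simulate(drifts, rounds, period, use_rate_correction, k=1, gain=0.5):
    """Return the precision (max minus min clock offset, in seconds)
    observed at the end of the last round, before state correction.

    drifts: per-node oscillator drift rates (seconds per second).
    Offsets are expressed relative to an ideal reference clock.
    """
    n = len(drifts)
    offsets = [0.0] * n      # clock state error of each node
    rate_terms = [0.0] * n   # rate correction term of each node
    precision = 0.0
    for _ in range(rounds):
        before = list(offsets)
        # Free-running interval: every clock drifts at its effective rate.
        offsets = [o + (d + r) * period
                   for o, d, r in zip(offsets, drifts, rate_terms)]
        precision = max(offsets) - min(offsets)

        if use_rate_correction:
            # Hypothetical central rate master: measure each node's rate
            # against the reference over this round and damp it.
            for i in range(n):
                observed_rate = (offsets[i] - before[i]) / period
                rate_terms[i] -= gain * observed_rate

        # Distributed state correction: each node averages the deviations
        # of all clocks from its own clock and applies the result. It uses
        # no rate information, so it works even if rate correction is off.
        snapshot = list(offsets)
        offsets = [own + fault_tolerant_average([x - own for x in snapshot], k)
                   for own in snapshot]
    return precision


if __name__ == "__main__":
    drifts = [50e-6, -30e-6, 10e-6, -80e-6]  # assumed drift rates
    print("state correction only:", simulate(drifts, 10, 0.01, False))
    print("state + rate correction:", simulate(drifts, 10, 0.01, True))
```

In this simplified setting, state correction alone keeps the precision bounded by the drift spread times the resynchronization period, while the added rate term shrinks the effective drift spread each round. Disabling the rate term (`use_rate_correction=False`) leaves the state correction loop unchanged, which mirrors the independence described in the abstract.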
