Resolving state inconsistency in distributed fault-tolerant real-time dynamic TDMA architectures

In safety-critical distributed systems, state consistency is mandatory for synchronizing distributed decisions, such as those made in dynamic time division multiple access (TDMA) schedules, in the presence of faults. We call a TDMA schedule in which networked stations make scheduling decisions at run time a dynamic TDMA schedule. Such a schedule is sensitive to transient faults, because a station can make an incorrect local decision at run time, causing state inconsistency and collisions. Faulty decisions are especially undesirable in safety-critical systems with hard real-time constraints. Hence, real-time communication schedules must be able to detect state inconsistency within a fixed amount of time. In this paper, we show through experimentation that state inconsistency is a real problem, and we propose a solution for resolving state inconsistency in TDMA schedules.
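To make the failure mode concrete, the following is a minimal sketch of how a single transient fault can desynchronize stations that each decide locally who owns the next slot. It is an illustration under stated assumptions, not the protocol or the solution proposed in the paper: the round-robin ownership rule, the integer `state` variable, and the names `Station`, `run_slot`, and `simulate` are all hypothetical.

```python
from dataclasses import dataclass

NUM_STATIONS = 3  # assumed network size, for illustration only


@dataclass
class Station:
    station_id: int
    state: int = 0  # local copy of the replicated schedule state (hypothetical)

    def wants_to_transmit(self) -> bool:
        # Local run-time decision: the slot owner is derived from local state.
        return self.state % NUM_STATIONS == self.station_id

    def advance(self) -> None:
        # Every station steps its local state once per slot.
        self.state += 1


def run_slot(stations):
    """Return the ids of all stations that transmit in the current slot."""
    senders = [s.station_id for s in stations if s.wants_to_transmit()]
    for s in stations:
        s.advance()
    return senders


def simulate(num_slots, fault_slot=None, faulty_station=0, bit=0):
    """Run the schedule; optionally flip one state bit in one station."""
    stations = [Station(i) for i in range(NUM_STATIONS)]
    log = []
    for slot in range(num_slots):
        if slot == fault_slot:
            # Transient fault: a single bit flip in one station's local state.
            stations[faulty_station].state ^= 1 << bit
        log.append(run_slot(stations))
    return log


if __name__ == "__main__":
    print(simulate(6))                # fault-free: exactly one sender per slot
    print(simulate(6, fault_slot=2))  # after the flip: collisions and idle slots
```

In the fault-free run every slot has exactly one sender. After the bit flip, the faulty station's view of the schedule never realigns with the others on its own, so slots with two senders (collisions) and slots with none keep recurring. This persistence is why bounded-time detection of state inconsistency matters.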
