PADRE: a Protocol for Asymmetric Duplex REdundancy

Safety and availability are issues of major importance in many critical systems. Simultaneously ensuring both attributes is sometimes difficult. Indeed, the introduction of redundancy to increase the overall system availability can lead to safety problems that would not otherwise exist. We present a protocol for duplex redundancy management in critical systems that aims to increase the system availability without jeopardizing its safety. An application to a fully automated train control system is described.
