A Comparison of Bus Architectures for Safety-Critical Embedded Systems

We describe and compare the architectures of four fault-tolerant, safety-critical buses in order to identify the principles common to all of them, the main differences in their design choices, and the tradeoffs made. Two of the buses come from an avionics heritage and two from automobiles, though all four strive for similar levels of reliability and assurance. The avionics buses considered are the Honeywell SAFEbus (the backplane data bus used in the Boeing 777 Airplane Information Management System) and the NASA SPIDER (an architecture being developed as a demonstrator for certification under the new DO-254 guidelines). The automobile buses considered are the TTTech Time-Triggered Architecture (TTA), recently adopted by Audi for automobile applications and by Honeywell for avionics and aircraft control functions, and FlexRay, which is being developed by a consortium of BMW, DaimlerChrysler, Motorola, and Philips.
