The Scalable Processor-Independent Design for Electromagnetic Resilience (SPIDER) is a new family of fault-tolerant architectures under development at NASA Langley Research Center (LaRC). SPIDER is a general-purpose computational platform suitable for use in ultra-reliable embedded control applications. The design scales from a small configuration supporting a single aircraft function to a large distributed configuration capable of supporting several functions simultaneously. SPIDER consists of a collection of simplex processing elements communicating via a Reliable Optical Bus (ROBUS). The ROBUS is an ultra-reliable, time-division multiple access (TDMA) broadcast bus with strictly enforced write access. It provides basic fault-tolerant services using formally verified fault-tolerance protocols, including Interactive Consistency (Byzantine Agreement), Internal Clock Synchronization, and Distributed Diagnosis. This paper presents the conceptual design of the ROBUS, including its requirements, topology, protocols, and block-level design. Verification activities, including the use of formal methods, are also discussed.
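As a rough illustration of what "time-division multiple access with strictly enforced write access" means, the following minimal Python sketch models a bus on which each time slot has exactly one owner and any out-of-turn write is dropped. It is a toy model under stated assumptions, not the ROBUS design: the class `TdmaBus`, the node names `pe0` to `pe2`, and the round-robin schedule are all hypothetical, and the sketch omits clock synchronization, redundancy, and fault handling entirely.

```python
from dataclasses import dataclass, field


@dataclass
class TdmaBus:
    """Toy TDMA broadcast bus (hypothetical): one owner per slot, round-robin."""
    schedule: list                            # schedule[i] = node allowed to write in slot i
    slot: int = 0                             # index of the current time slot
    log: list = field(default_factory=list)   # accepted broadcasts: (slot, sender, message)

    def owner(self):
        """Return the node that owns the current slot."""
        return self.schedule[self.slot % len(self.schedule)]

    def write(self, sender, message):
        """Accept the write only if the sender owns the current slot."""
        if sender != self.owner():
            return False                      # out-of-turn traffic never reaches the bus
        self.log.append((self.slot, sender, message))
        return True

    def advance(self):
        """Move on to the next time slot."""
        self.slot += 1


# Three hypothetical processing elements; pe0 tries to transmit in pe1's slot.
bus = TdmaBus(schedule=["pe0", "pe1", "pe2"])
accepted = []
for sender in ["pe0", "pe0", "pe2"]:
    accepted.append(bus.write(sender, f"data from {sender}"))
    bus.advance()

print(accepted)   # [True, False, True]
```

The second write is rejected because slot 1 belongs to `pe1`. In this sketch the check lives in the bus rather than in the sender, so a misbehaving node cannot disturb another node's slot; how the actual ROBUS enforces write access is described in the paper itself, not here.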