High Availability in the Advanced Automation System

The Advanced Automation System (AAS) is a distributed real-time system under development by IBM's Systems Integration Division for the US Federal Aviation Administration. The system is intended to replace the present en-route and terminal approach US air traffic control computer systems over the next decade. High availability of air traffic control services is an essential requirement of the system. This paper discusses the general approach to fault tolerance adopted in AAS by reviewing some of the questions raised during the system design, the alternative solutions considered, and the reasons for the design choices made.
