A general method of applying error correction to synchronous digital systems

A general method is presented for applying error correction to synchronous binary digital systems to improve reliability. It includes the familiar scheme of triplication and “vote taking” as a special case. In principle, the method permits the system to operate continuously, even when a fault is present or maintenance is being performed. An efficient maintenance routine, including rapid repair of faults, is an essential adjunct to the scheme if the potentially large increase in reliability made possible by error correction is to be realized. The percentage redundancy needed to realize the scheme decreases as the complexity of the system to which it is applied increases, but may amount to triplication of equipment even for moderately large systems. The paper describes some error-correcting codes to implement the scheme, discusses error-correcting circuits in a general way, indicates how to estimate the redundancy, and presents a formula for determining the reliability improvement obtainable with a particular maintenance routine. In a companion paper,¹ D. K. Ray-Chaudhuri develops a general theory of minimally redundant codes for this application.
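
As an illustration of the triplication special case only, the following minimal sketch shows a bitwise majority vote over three copies of a binary word. It is not the general coding scheme of the paper; the function name `majority_vote` and the example bit patterns are assumptions introduced here for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise majority of three binary words.

    Each output bit is 1 exactly when at least two of the three
    corresponding input bits are 1, so a fault confined to one copy
    is outvoted by the other two.
    """
    return (a & b) | (a & c) | (b & c)


# Hypothetical example: three copies of the same word, one with a fault.
correct = 0b1011
faulty = correct ^ 0b0100  # one copy suffers a single-bit error

voted = majority_vote(correct, faulty, correct)
assert voted == correct  # the faulty copy is outvoted
```

The vote masks any fault confined to a single copy, which is why the system can in principle keep operating while that copy is repaired. If two copies fail in the same bit position the vote is wrong, which is consistent with the emphasis above on rapid repair.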