The Join Algorithm: Ordering Messages in Replicated Systems

Abstract. The need for correct input-output behaviour and a higher level of fault masking in real-time systems has led designers to consider applying N-Modular Redundancy (NMR) to the construction of software. This approach makes redundant systems robust to failures of replicated processors, and it also permits the use of software fault-tolerance techniques such as N-version programming. To behave consistently, all nonfaulty replicated processors must process input requests in the same order. A distributed algorithm, the 'join algorithm', is proposed that allows nonfaulty processors to agree on the order in which their input requests will be processed.
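The ordering requirement can be illustrated with a small sketch. This is not the join algorithm itself, only a toy demonstration of why nonfaulty replicas must agree on request order before a voter can mask a fault. The names `apply_requests` and `majority_vote`, the two-operation request format, and the faulty output value 99 are all assumptions made for illustration.

```python
from collections import Counter


def apply_requests(requests, state=0):
    """Hypothetical deterministic replica: apply requests in the given order."""
    for op, value in requests:
        if op == "add":
            state += value
        elif op == "mul":
            state *= value
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


requests = [("add", 2), ("mul", 3)]
faulty_output = 99  # assumed arbitrary output of a single faulty replica

# Both nonfaulty replicas process the requests in the same order.
agreed = [apply_requests(requests), apply_requests(requests), faulty_output]

# The nonfaulty replicas process the same requests in different orders.
disagreed = [apply_requests(requests), apply_requests(requests[::-1]), faulty_output]

print(majority_vote(agreed))     # the voter masks the faulty replica
print(majority_vote(disagreed))  # no strict majority exists
```

With three replicas and one fault, the voter can only mask the faulty output when the two nonfaulty replicas produce identical results, which in turn requires that they apply the same requests in the same order.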