Algorithmic fault tolerance for matrix operations on triangular arrays

This paper reconsiders algorithm-based fault tolerance, a technique that uses checksum matrices to detect and correct transient or permanent hardware faults, for triangular systolic arrays. Linear error-detecting arrays are developed for both matrix product and triangular factorisation and are shown to interface neatly with triangular schemes. The overhead of the error-detecting redundancy is offset by the hardware saved when the array is folded into a triangular array rather than the standard hex-connected array. The result is shown to be area-efficient fault-tolerant arrays with improved efficiency.
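
As a rough illustration of the checksum-matrix idea for the matrix product, the following is a minimal software sketch of the classic row and column checksum scheme. It does not model the linear or triangular arrays developed in the paper. All function names are hypothetical, integer entries are assumed so that checksums compare exactly, and only a single fault in a data element (not in a checksum element) is handled.

```python
def column_checksum(a):
    """Return a copy of `a` with an extra row holding each column's sum."""
    return [row[:] for row in a] + [[sum(col) for col in zip(*a)]]


def row_checksum(b):
    """Return a copy of `b` with an extra column holding each row's sum."""
    return [row + [sum(row)] for row in b]


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def locate_error(c):
    """Return (i, j) of a single faulty data element of the full checksum
    matrix `c`, or None if every row and column checksum is consistent."""
    n, m = len(c) - 1, len(c[0]) - 1  # size of the data part of c
    bad_rows = [i for i in range(n) if sum(c[i][:m]) != c[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c[i][j] for i in range(n)) != c[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    # Assumption of this sketch: anything else is reported, not repaired.
    raise ValueError("fault pattern is not a single data-element error")


def correct_error(c, i, j):
    """Rebuild element (i, j) in place from the checksum of row i."""
    m = len(c[0]) - 1
    c[i][j] = c[i][m] - sum(c[i][k] for k in range(m) if k != j)
```

Multiplying `column_checksum(a)` by `row_checksum(b)` with `matmul` gives the product of `a` and `b` bordered by a checksum row and a checksum column. If one data element of that result is then corrupted, `locate_error` identifies it as the intersection of the inconsistent row and the inconsistent column, and `correct_error` restores it.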