A System Solution to the Memory Soft Error Problem

High-density or high-performance memory chip designs often create new reliability problems. One good example is the alpha-particle problem in high-density RAM and CCD chips: soft errors may “line up” with existing hard errors in the same code word, giving rise to double errors that are not correctable with conventionally implemented single-error-correcting, double-error-detecting (SEC-DED) codes. In this paper it is shown that an overall system approach, based on error-correcting codes and a system maintenance strategy, reduces the main memory failure rate at the system level to what it would be had the alpha-particle problem not occurred. This system solution is designed to be compatible with most existing memory designs, so the additional cost of implementing it should be minimal. The procedure described herein uses the capability of a SEC-DED code to detect the combination of one hard and one soft error; a microcode and hardware algorithm then corrects both errors. Results of analytical and simulation modeling of the method, and a comparison with other techniques, are also included.
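The abstract does not spell out the correction algorithm, so the following is only an illustrative sketch of how a detected hard-plus-soft double error could be recovered. It assumes a complement-and-retry scheme (write back the complement of the word read, read it again, and complement the result, which cancels the stuck bit and leaves only the soft error), a toy (8,4) extended Hamming code standing in for the SEC-DED code, and a simple stuck-at fault model. All names (`StuckWord`, `encode`, `decode`, `recover`) are hypothetical and not taken from the paper.

```python
from functools import reduce
from operator import xor


def encode(data):
    """Encode 4 data bits as an (8,4) extended Hamming (SEC-DED) word.
    Index 0 is the overall parity bit; indices 1..7 are Hamming positions."""
    c = [0] * 8
    c[3], c[5], c[6], c[7] = data
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    c[0] = reduce(xor, c[1:])
    return c


def decode(word):
    """Return (status, word) with status 'ok', 'single' (corrected),
    or 'double' (detected but not correctable by the code alone)."""
    syndrome = reduce(xor, (i for i in range(1, 8) if word[i]), 0)
    overall = reduce(xor, word)
    if syndrome == 0 and overall == 0:
        return "ok", list(word)
    if overall == 1:
        fixed = list(word)
        fixed[syndrome] ^= 1  # syndrome 0 means the overall parity bit itself
        return "single", fixed
    return "double", list(word)


class StuckWord:
    """Assumed fault model: one memory word whose listed bits are stuck."""

    def __init__(self, stuck):
        self.stuck = dict(stuck)  # bit index -> stuck value
        self.cells = [0] * 8

    def write(self, bits):
        self.cells = [self.stuck.get(i, b) for i, b in enumerate(bits)]

    def read(self):
        return list(self.cells)

    def soft_flip(self, i):
        self.cells[i] ^= 1  # transient upset, e.g. an alpha-particle hit


def recover(mem):
    """Read a word; on a detected double error, try complement-and-retry."""
    first = mem.read()
    status, word = decode(first)
    if status != "double":
        return status, word
    mem.write([b ^ 1 for b in first])       # write the complement back
    second = [b ^ 1 for b in mem.read()]    # re-read and re-complement
    status, word = decode(second)           # stuck bit now reads correctly
    if status == "double":
        return "uncorrectable", None
    mem.write(word)                         # scrub the soft error
    return "recovered", word


data = [1, 0, 1, 1]
codeword = encode(data)
mem = StuckWord({5: codeword[5] ^ 1})  # hard error: bit 5 stuck at wrong value
mem.write(codeword)
mem.soft_flip(2)                       # soft error lines up in the same word
status, word = recover(mem)
print(status, [word[i] for i in (3, 5, 6, 7)])
```

In this sketch the complemented word is never decoded, only stored and read back, so the scheme does not require the complement of a code word to be a code word. After the scrub, the stuck bit still produces a single error on later reads, which the SEC-DED code corrects on its own.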