Detection and diagnosis of software malfunctions

Abstract

Today, providing computer software involves greater cost and risk than providing computer equipment, because hardware is mass-produced by industry using proven technology, while software is still produced mostly through the craft of individual programmers. Software reliability is improved not only through structure and care in the design, implementation and verification of software, but also through effective use of redundancy, in the form of robust data structures and information about what constitutes the expected behaviour of the software. Quick detection of a malfunction minimizes the damage it causes and leads to rapid recovery. This paper summarizes techniques and tools used for the detection and diagnosis of malfunctions occurring in software systems.