Common Trends in Software Fault and Failure Data

The benefits of analyzing software faults and failures have been widely recognized. However, detailed studies based on empirical data are rare. In this paper, we analyze the fault and failure data from two large, real-world case studies. Specifically, we explore: 1) the localization of faults that lead to individual software failures and 2) the distribution of different types of software faults. Our results show that individual failures are often caused by multiple faults spread throughout the system. This observation is important because it does not support several heuristics and assumptions used in the past. In addition, it clearly indicates that finding and fixing the faults that lead to such failures in large, complex systems are often difficult and challenging tasks, despite the advances in software development. Our results also show that requirements faults, coding faults, and data problems are the three most common types of software faults. Furthermore, these results show that, contrary to popular belief, a significant percentage of failures are linked to late life cycle activities. Another important aspect of our work is that we conduct intra- and interproject comparisons, as well as comparisons with the findings from related studies. The consistency of several main trends across the software systems studied in this paper and in several related research efforts suggests that these trends are likely to be intrinsic characteristics of software faults and failures rather than project specific.
