Key Questions in Building Defect Prediction Models in Practice

Knowing which modules of a future version of a software system are defect-prone is a valuable planning aid for quality managers and testers. Defect prediction promises to identify these modules. However, constructing effective defect prediction models in an industrial setting raises a number of key questions. In this paper we discuss ten key questions identified in the context of establishing defect prediction in a large software development project. Seven consecutive versions of the software system were used to construct and validate defect prediction models for system test planning. The paper also presents initial empirical results from the studied project and thereby contributes answers to the identified questions.
