A Review of Metrics and Modeling Techniques in Software Fault Prediction Model Development

This paper surveys software fault prediction approaches that have been developed using different data analytic techniques reported in the software engineering literature. The study is split into three broad areas: (a) a description of the software metrics suites reported and validated in the literature; (b) a brief outline of previously published research on developing software fault prediction models with various analytic techniques, summarized using a taxonomy of those techniques; and (c) a review of the advantages of using combinations of metrics, although this area is comparatively new and needs more research effort.
