Software Fault Dataset

Machine learning and statistical techniques are used in software fault prediction to predict the presence or absence of faults in given software modules. To make these predictions, a software fault prediction model learns from software fault data, which combines information about the software system (software metrics) with a fault value for each module. An implicit requirement for effective software fault prediction is therefore the availability of fault data of reasonable quality. However, such data is difficult to obtain: software development companies are generally not keen to share their development information, or they do not maintain a software repository in the first place.
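As a minimal sketch of what such fault data might look like, the Python example below pairs each module's metrics with a fault value and learns a trivial decision rule from them. The record fields (`loc`, `complexity`), the module names, the numbers, and the midpoint-threshold rule are all illustrative assumptions, not data or a method taken from this work; a real study would use an established learner and a real dataset.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ModuleRecord:
    """One row of a hypothetical fault dataset: metrics plus a fault value."""
    name: str
    loc: int          # lines of code (illustrative metric)
    complexity: int   # cyclomatic complexity (illustrative metric)
    faulty: bool      # fault value attached to the metrics


def learn_threshold(records):
    """Return the midpoint between the mean complexity of faulty and
    fault-free modules. A deliberately simple stand-in for a learner."""
    faulty = [r.complexity for r in records if r.faulty]
    clean = [r.complexity for r in records if not r.faulty]
    return (mean(faulty) + mean(clean)) / 2


def predict(complexity, threshold):
    """Flag a module as fault-prone when its complexity exceeds the threshold."""
    return complexity > threshold


# Made-up training data, for illustration only.
training = [
    ModuleRecord("parser", 420, 18, True),
    ModuleRecord("logger", 80, 3, False),
    ModuleRecord("scheduler", 610, 25, True),
    ModuleRecord("utils", 150, 6, False),
]

threshold = learn_threshold(training)
print(threshold, predict(20, threshold), predict(5, threshold))
```

The sketch also shows why data quality matters: with mislabeled or missing fault values, the learned threshold, and any prediction derived from it, would shift accordingly.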
