CrossPare: A Tool for Benchmarking Cross-Project Defect Predictions

During the last decade, many papers on defect prediction have been published. One issue that remains largely unresolved is cross-project defect prediction, where the aim is to predict the defects of a project using data from other projects. Many approaches have been suggested and evaluated in recent years. However, due to the use of different implementations and data sets, comparing this work is a hard task. In this paper, we present the tool CrossPare. CrossPare is designed to facilitate benchmarks for cross-project defect predictions. The tool already implements many techniques proposed in the current state of the art of cross-project defect prediction. Moreover, the tool is able to load the different data sets that are commonly used for the evaluation of such techniques and supports all major performance metrics. By using CrossPare, other researchers can improve the comparability of their results and possibly also reduce their implementation effort for new cross-project defect prediction techniques by reusing features already offered by CrossPare.
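To illustrate the basic workflow that such a benchmark tool automates, the following is a minimal sketch of a single cross-project defect prediction experiment in Java using the WEKA machine learning library: a classifier is trained on the defect data of one project and evaluated on another. The file names projectA.arff and projectB.arff, the choice of logistic regression, and the placement of the defect label as the last attribute are illustrative assumptions; this is not CrossPare's actual API or configuration format.

    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.Logistic;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class CrossProjectExample {
        public static void main(String[] args) throws Exception {
            // Training data comes from one project, test data from another;
            // the file names are hypothetical placeholders.
            Instances train = DataSource.read("projectA.arff");
            Instances test = DataSource.read("projectB.arff");

            // Assumption: the defect label is the last attribute.
            train.setClassIndex(train.numAttributes() - 1);
            test.setClassIndex(test.numAttributes() - 1);

            // Train a classifier on the source project...
            Logistic classifier = new Logistic();
            classifier.buildClassifier(train);

            // ...and evaluate it on the target project.
            Evaluation eval = new Evaluation(train);
            eval.evaluateModel(classifier, test);
            System.out.println(eval.toSummaryString());
            System.out.printf("Recall: %.3f, Precision: %.3f%n",
                    eval.recall(1), eval.precision(1));
        }
    }

A full benchmark would repeat this for every combination of training and test projects and for every prediction technique under comparison, which is exactly the bookkeeping a dedicated tool removes.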
