Applying Swarm Ensemble Clustering Technique for Fault Prediction Using Software Metrics

The number of defects remaining in a system provides insight into its quality. Defect detection systems predict defects using software metrics and data mining techniques, and cluster analysis is commonly adopted to build software defect prediction models. Cluster ensembles have emerged as a prominent method for improving the robustness, stability and accuracy of clustering solutions: they combine multiple partitions generated by different clustering algorithms into a single clustering solution. In this paper, a clustering ensemble based on the Particle Swarm Optimization (PSO) algorithm is proposed to improve prediction quality. An empirical study shows that PSO can be a good choice for building software defect prediction models.
