Finding Effective Software Metrics to Classify Maintainability Using a Parallel Genetic Algorithm

Predicting the quality of a software object can be viewed as a classification problem in which software metrics are the features and expert quality rankings are the class labels. Evolutionary computational techniques such as genetic algorithms can be used to search for a subset of metrics that classifies the quality of software objects as accurately as possible. Genetic algorithms are also readily parallelizable, because the fitness of each candidate solution (how well a given set of metrics classifies the software objects) can be evaluated independently of the other candidates. A manager-worker parallel version of a genetic algorithm for finding such metrics has been implemented using MPI and tested on a Beowulf cluster, achieving a parallel efficiency of 0.94. This speed-up made it practical to use larger populations and run for more generations. Sixty-four source code metrics from a 366-class Java-based biomedical data analysis program were used, yielding a classification accuracy of 78.4%.
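The manager-worker idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier or the genetic operators, so the leave-one-out nearest-neighbour fitness, the tournament selection, the one-point crossover, the mutation rate, and the synthetic data are all assumptions made for illustration. A thread pool from the Python standard library stands in for the MPI worker processes, and every function name here is hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_toy_data(rng, n_samples=40, n_features=8, informative=(0, 3)):
    """Synthetic stand-in for (metric vector, expert quality label) pairs."""
    data = []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # only these columns carry class information
        data.append((row, label))
    return data


def fitness(mask, data):
    """Leave-one-out 1-nearest-neighbour accuracy on the selected metrics
    (an assumed classifier, chosen only to keep the sketch short)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i, (xi, yi) in enumerate(data):
        best_d, best_y = None, None
        for k, (xk, yk) in enumerate(data):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if best_d is None or d < best_d:
                best_d, best_y = d, yk
        correct += best_y == yi
    return correct / len(data)


def evolve(data, n_features, pop_size=16, generations=10, workers=4, seed=0):
    """Manager loop: breed masks serially, farm out fitness evaluations."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best_mask, best_fit = None, -1.0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            # Each individual is scored independently, so the work can be
            # handed to workers (threads here, MPI processes in the paper).
            fits = list(pool.map(lambda m: fitness(m, data), pop))
            for m, f in zip(pop, fits):
                if f > best_fit:
                    best_mask, best_fit = m[:], f

            def pick():
                # Binary tournament selection (assumed operator).
                a, b = rng.randrange(pop_size), rng.randrange(pop_size)
                return pop[a] if fits[a] >= fits[b] else pop[b]

            nxt = [best_mask[:]]  # elitism: carry the best mask forward
            while len(nxt) < pop_size:
                p1, p2 = pick(), pick()
                cut = rng.randrange(1, n_features)
                child = p1[:cut] + p2[cut:]  # one-point crossover
                child = [bit ^ (rng.random() < 0.05) for bit in child]  # bit-flip mutation
                nxt.append(child)
            pop = nxt
    return best_mask, best_fit


def parallel_efficiency(t_serial, t_parallel, n_processors):
    """Efficiency = speed-up / processors = T_serial / (p * T_parallel)."""
    return t_serial / (n_processors * t_parallel)
```

Calling `evolve(make_toy_data(random.Random(1)), 8)` returns the best bit mask found and its accuracy on the toy data. Because CPython threads share one interpreter lock, this sketch shows the structure of the manager-worker split rather than a real speed-up; distributing the same `fitness` calls across separate processes or MPI ranks is what would make an efficiency figure such as the reported 0.94 meaningful.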
