Mapping Software Metrics to Module Complexity: A Pattern Classification Approach

A desirable software engineering goal is the prediction of software module complexity (a qualitative concept) using automatically generated software metrics (quantitative measurements). This goal may be couched in the language of pattern classification; namely, given a set of metrics (a pattern) for a software module, predict the class (level of complexity) to which the module belongs. To find this mapping from metrics to complexity, we present a classification strategy, stochastic metric selection, to determine the subset of software metrics that yields the greatest predictive power with respect to module complexity. We demonstrate the effectiveness of this strategy by empirically evaluating it using a publicly available dataset of metrics compiled from a medical imaging system and comparing the prediction results against several classification system benchmarks.
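The abstract does not spell out how stochastic metric selection works, so the following is only a minimal sketch of one generic reading of the idea: repeatedly draw random subsets of metrics, score each subset by how well a simple classifier predicts the complexity class, and keep the best-scoring subset. The nearest-centroid classifier, the leave-one-out scoring, the function names, and the toy data are all assumptions made for illustration; they are not taken from the paper.

```python
import random


def nearest_centroid_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier using only `features`.

    Assumption: this simple classifier is a stand-in; the paper's classifier may differ.
    """
    correct = 0
    for i in range(len(X)):
        # Build per-class centroids from every module except the held-out one.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(features))
            for k, f in enumerate(features):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        # Squared Euclidean distance from the held-out module to a class centroid.
        def dist(label):
            return sum(
                (X[i][f] - sums[label][k] / counts[label]) ** 2
                for k, f in enumerate(features)
            )

        predicted = min(sums, key=dist)
        correct += predicted == y[i]
    return correct / len(X)


def stochastic_metric_selection(X, y, n_trials=200, seed=0):
    """Random search over metric subsets; returns (best_subset, best_accuracy).

    Hypothetical helper: a generic random-subset search, not the authors' procedure.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    best_subset, best_score = None, -1.0
    for _ in range(n_trials):
        size = rng.randint(1, n_features)  # subset size, inclusive bounds
        subset = tuple(sorted(rng.sample(range(n_features), size)))
        score = nearest_centroid_accuracy(X, y, subset)
        if score > best_score:  # keep the first subset that reaches the best score
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy data (invented): metric 0 separates the classes, metric 1 is large-scale noise.
X_toy = [
    [1.0, 50.0], [1.2, -40.0], [0.9, 30.0], [1.1, -60.0],
    [3.0, 45.0], [3.2, -55.0], [2.9, 35.0], [3.1, -35.0],
]
y_toy = ["low"] * 4 + ["high"] * 4

subset, score = stochastic_metric_selection(X_toy, y_toy)
print("selected metric indices:", subset, "leave-one-out accuracy:", score)
```

On real metric data the subset score would normally be estimated with cross-validation on held-out modules, and the number of trials would need to grow with the number of candidate metrics.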
