Entropies as measures of software information

This paper investigates the use of entropies as measures of software information content. Several entropies, including the well-known Shannon entropy, are characterized by their mathematical properties, and based on these characterizations the entropies suitable for measuring software systems are rigorously selected. When a software system is treated as an information source, function calls in procedural systems or method invocations in object-oriented systems resemble the emission of symbols from that source. The probabilities required for computing the entropies are therefore obtained from an empirical distribution of function calls or method invocations. The application of the suggested measures to procedural and object-oriented programs is illustrated with two small examples. Because a rigorous definition of information measures does not guarantee their usefulness in practice, an evaluation case study is performed. In particular, the study evaluates the intuitiveness and scalability of the measures on a real software system of about 460,000 lines of code. Besides showing the measures to be intuitive and meaningful, the case study results highlight differences between them, so the presented family of measures can satisfy different measurement requirements.
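
As a rough illustration of the approach described above, the minimal Python sketch below (not taken from the paper) computes the Shannon entropy of an empirical distribution of call targets; the call trace and the method names in it are hypothetical.

```python
from collections import Counter
from math import log2

def shannon_entropy(call_targets):
    """Shannon entropy (in bits) of the empirical distribution of call targets."""
    counts = Counter(call_targets)
    total = sum(counts.values())
    # Empirical probability of each distinct callee, estimated from the trace.
    probs = [c / total for c in counts.values()]
    return -sum(p * log2(p) for p in probs)

# Hypothetical trace of method invocations observed in an object-oriented system.
calls = ["A.foo", "A.foo", "B.bar", "C.baz", "A.foo", "B.bar"]
print(shannon_entropy(calls))  # about 1.46 bits
```

Other entropies from the family considered in the paper would replace only the summation formula; the empirical probabilities are obtained in the same way.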
