Relating Clusterization Measures and Software Quality

Empirical studies have shown that dependence clusters are both prevalent in source code and detrimental to many software-related activities, including maintenance, testing and comprehension. Based on these observations, it is worthwhile to characterize the connection between dependence clusters and software quality more precisely. Such attempts are hindered by three difficulties: assessing the quality of software, measuring the degree of clusterization of software, and finding a means to exhibit the connection (or lack of it) between the two. In this paper we present our approach to establishing a connection between software quality and clusterization. The software quality models we use comprise low- and high-level quality attributes. In addition, we defined new clusterization metrics that give a concise characterization of the clusters contained in programs. Apart from calculating correlation coefficients, we used mutual information to quantify the relationship between clusterization and quality. The results show that a connection can be demonstrated between the two, and that mutual information combined with correlation can be a better indicator for guiding deeper examinations in the area.
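To illustrate why mutual information can complement a correlation coefficient, the following is a minimal sketch, not the estimator used in the paper. It assumes Pearson correlation and a simple equal-width histogram estimate of mutual information; the function names, the bin count and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def bin_values(values, bins):
    """Map each value to the index of an equal-width bin (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys, bins=3):
    """Histogram-based estimate of I(X; Y) in bits."""
    bx, by = bin_values(xs, bins), bin_values(ys, bins)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


# Hypothetical toy data: a symmetric, non-monotonic dependence.
clusterization = [-2, -1, 0, 1, 2]
quality = [x * x for x in clusterization]

r = pearson(clusterization, quality)
mi = mutual_information(clusterization, quality)
print(f"Pearson r = {r:.3f}, mutual information = {mi:.3f} bits")
```

On this toy data the linear correlation vanishes even though the second variable is fully determined by the first, while the mutual information estimate stays positive. Histogram estimates are sensitive to the number of bins and to sample size, so the value itself should be read with caution.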
