Software Quality and Community Structure in Java Software Networks

We present a study of 600 Java software networks that aims to characterize the relationship between their defectiveness and their community metrics. We analyze the community structure of these networks, defined as their topological division into subnetworks of densely connected nodes. A high density of connections represents a higher level of cooperation between classes, so a well-defined division into communities could indicate that the software system has been designed in a modular fashion, with its functionalities well separated. We show how the community structure can be an indicator of well-written, high-quality code by retrieving the communities of the analyzed systems and ranking their division into communities through the built-in metric called modularity. We found that the software systems with the highest modularity possess the majority of bugs, and tested whether this result is related to some confounding effect. We found two power laws relating the maximum defect density with two different metric...
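To make the modularity metric concrete, the following is a minimal sketch, assuming the standard Newman-Girvan definition for an undirected, unweighted graph, Q = sum over communities c of [L_c/m - (d_c/2m)^2], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The function name `modularity`, the toy class names, and the example partition are illustrative assumptions and are not taken from the study; the authors' actual tooling and graph construction may differ.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition (undirected, unweighted).

    edges        -- list of (u, v) pairs, e.g. dependencies between classes
    community_of -- dict mapping each node to a community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    intra = defaultdict(int)   # L_c: edges with both endpoints in community c
    degree = defaultdict(int)  # d_c: sum of node degrees in community c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1

    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy network: two tightly coupled groups of classes
# (two triangles) joined by a single dependency C -- D.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]

two_modules = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_module = {node: 0 for node in "ABCDEF"}

print(round(modularity(edges, two_modules), 3))  # 0.357 (exactly 5/14)
print(round(modularity(edges, one_module), 3))   # 0.0
```

In this toy case the partition that follows the two dense groups scores 5/14, while lumping every class into one community scores 0, which is the sense in which a higher modularity indicates a better-defined division into communities.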
