Co-change Clusters: Extraction and Application on Assessing Software Modularity

The traditional modular structure defined by the package hierarchy suffers from the dominant decomposition problem, and it is widely accepted that alternative forms of modularization are necessary to increase developer productivity. In this paper, we propose an alternative way to understand and assess package modularity based on co-change clusters, which are sets of classes that are highly inter-related by co-change relations. We evaluate how co-change clusters relate to the package decomposition of four real-world systems. The results show that the projection of co-change clusters onto packages follows different patterns in each system. Therefore, we claim that modular views based on co-change clusters can improve developers’ understanding of how well modularized their systems are, considering that modularity is the ability to confine changes and evolve components in parallel.
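
To make the idea of projecting co-change clusters onto packages concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes commits are given as lists of fully qualified class names, keeps pairs of classes that changed together at least `min_support` times (a hypothetical threshold), and groups classes by connected components of the resulting graph, which stands in for whatever clustering algorithm is actually applied. The helper names and the sample commit data are illustrative assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_change_graph(commits, min_support=2):
    """Count how often each pair of classes changes in the same commit."""
    pair_counts = Counter()
    for changed in commits:
        for a, b in combinations(sorted(set(changed)), 2):
            pair_counts[(a, b)] += 1
    # Keep only pairs that co-changed at least min_support times (assumed threshold).
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def clusters(graph):
    """Group classes into connected components of the co-change graph.

    Connected components are a simple stand-in for a real clustering algorithm.
    """
    neighbours = defaultdict(set)
    for a, b in graph:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result


def packages_touched(cluster):
    """Project a cluster onto packages: the package is the name up to the last dot."""
    return {name.rsplit(".", 1)[0] for name in cluster}


# Hypothetical commit history: each commit lists the classes it changed.
commits = [
    ["ui.Button", "ui.Panel"],
    ["ui.Button", "ui.Panel", "core.Model"],
    ["core.Model", "db.Store"],
    ["core.Model", "db.Store"],
]
graph = co_change_graph(commits)
found = clusters(graph)
for cluster in found:
    print(sorted(cluster), "->", sorted(packages_touched(cluster)))
```

In this toy history, one cluster stays inside a single package while the other spans two packages. This is the kind of difference in projection pattern that a cluster-based modular view is meant to expose.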
