Recommending change clusters to support software investigation: an empirical study

During software maintenance tasks, developers often spend a considerable amount of effort investigating source code. This effort can be reduced if tools are available to help developers navigate the source code effectively. We studied to what extent developers can benefit from information contained in clusters of change sets to guide their investigation of a software system. We defined change clusters as groups of change sets that have a certain number of elements in common. Our analysis of 4200 change sets for seven different systems, covering a cumulative time span of over 17 years of development, showed that fewer than one in five tasks overlapped with change clusters. Furthermore, a detailed qualitative analysis of the results revealed that only 13% of the clusters associated with applicable change tasks were likely to be useful. We conclude that change clusters can support only a minority of change tasks, and should be recommended only if it is possible to do so at minimal cost to the developers. Copyright © 2009 John Wiley & Sons, Ltd.
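To make the definition of a change cluster concrete, the following is a minimal sketch, not the clustering technique used in the study (which the abstract does not specify). It assumes that a change set is a set of changed element names, that two change sets are related when they share at least `min_shared` elements, and that clusters are formed transitively. The function names, the `min_shared` threshold, and the sample data are all hypothetical.

```python
from itertools import combinations


def cluster_change_sets(change_sets, min_shared=2):
    """Group change sets that share at least `min_shared` elements.

    `change_sets` maps a change-set id to the set of elements it modified.
    Grouping is transitive (an assumption of this sketch): if A overlaps B
    and B overlaps C, all three end up in the same cluster.
    """
    parent = {cid: cid for cid in change_sets}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(change_sets, 2):
        if len(change_sets[a] & change_sets[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for cid in change_sets:
        groups.setdefault(find(cid), set()).add(cid)
    # A single change set on its own is not considered a cluster here.
    return [members for members in groups.values() if len(members) > 1]


def clusters_overlapping_task(task_elements, clusters, change_sets):
    """Return the clusters that touch at least one element of a task."""
    hits = []
    for members in clusters:
        touched = set().union(*(change_sets[cid] for cid in members))
        if touched & set(task_elements):
            hits.append(members)
    return hits


# Hypothetical history: four change sets over made-up file names.
history = {
    "c1": {"A.java", "B.java", "C.java"},
    "c2": {"B.java", "C.java", "D.java"},
    "c3": {"X.java", "Y.java"},
    "c4": {"C.java", "D.java", "E.java"},
}
clusters = cluster_change_sets(history, min_shared=2)
```

In this sketch a task "overlaps" a cluster when it touches any element changed by the cluster's members. A stricter overlap criterion would return fewer candidate clusters for a given task.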
