Retrieving Task-Related Clusters from Change History

During software maintenance tasks, developers often spend a significant amount of effort investigating source code. This effort can be reduced if tools are available to help developers navigate the source code effectively. For this purpose, we propose searching the change history of a software system to identify clusters of program elements related to a task. We evaluated the feasibility of this idea with an extensive historical analysis of change data. Our study assessed the extent to which change sets approximating tasks could have benefited from knowledge about clusters of past changes. A study of 3500 change sets from seven different systems, covering a cumulative time span of close to 12 years of development, shows that fewer than 12% of the changes could have benefited from change clusters. We report our observations on the factors that influence how change clusters can be used to guide program navigation.
