Correlating Social Interactions to Release History during Software Evolution

In this paper, we propose a method to reason about the nature of software changes by mining and correlating discussion archives. We employ an information retrieval approach to find correlations between the source code change history and the history of social interactions surrounding these changes. We apply our correlation method to two software systems, LSEdit and Apache Ant. The results of these exploratory case studies provide evidence of similarity between the content of free-form emails among developers and the actual modifications to the code. We identify a set of correlation patterns between the discussion and changed-code vocabularies, and discover that some releases labeled as minor should instead be classified as major. These patterns can be used to estimate the type of a change and the time needed to implement it.
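
To make the idea of correlating discussion and changed-code vocabularies concrete, the following is a minimal sketch, not the method used in the paper. It assumes term-frequency vectors compared with cosine similarity, a common information retrieval measure; the abstract does not state which measure was used. The function names (`vocabulary`, `cosine_similarity`) and the sample strings are hypothetical.

```python
import math
import re
from collections import Counter


def vocabulary(text):
    """Return term frequencies for a text (assumed tokenization, for illustration).

    camelCase identifiers are split so that code terms such as
    'layoutEngine' can match the prose terms 'layout' and 'engine'.
    """
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return Counter(re.findall(r"[a-z]+", spaced.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data: one email excerpt and two code changes.
email = "We should fix the layout engine so the graph panel redraws"
related_change = "void redrawGraphPanel() { layoutEngine.refresh(); }"
unrelated_change = "int parseDate(String s) { return calendar.lookup(s); }"

related_score = cosine_similarity(vocabulary(email), vocabulary(related_change))
unrelated_score = cosine_similarity(vocabulary(email), vocabulary(unrelated_change))
print(related_score > unrelated_score)
```

In this sketch, a higher score suggests that a discussion and a code change share vocabulary. Computing such scores per release, for the emails and the changes belonging to that release, is one plausible way to look for the kind of correlation patterns described above.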
