Automatic Matching Release Notes and Source Code by Generating Summary for Software Change

To quickly locate the source code that corresponds to a specific change described in a change history, it is necessary to establish traceability links between release notes and source code. Existing work on traceability link recovery can find the source code changes that have high textual similarity to a release note. However, these approaches rely on the text being consistent across artifacts at different abstraction levels, and on the text descriptions being complete. In this paper, we propose to leverage source code change information to improve the accuracy of release-note-to-source-code traceability recovery. To reduce the complexity of link recovery, our approach first performs change impact analysis to cluster the source code changes that serve the same purpose into a virtual class. Our approach then employs a natural language generation algorithm to generate a readable summary sentence for each virtual class. Traceability links are built between release notes and clusters of program entities by computing the linguistic similarity of the sentences. We conduct case studies on 26 releases of three popular software systems to evaluate the approach, and the results indicate that the proposed method improves the accuracy of traceability link recovery compared with other IR-based techniques.
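The final linking step described above can be illustrated with a minimal sketch. It assumes that the clustering and summary generation steps have already produced one summary sentence per virtual class, and it substitutes a plain bag-of-words cosine similarity for the linguistic similarity measure used in the paper. The function names, the identifiers such as `RN-1` and `VC-1`, the example sentences, and the threshold value are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine(vec_a, vec_b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * vec_b[term] for term, count in vec_a.items() if term in vec_b)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recover_links(release_notes, cluster_summaries, threshold=0.2):
    """Return (note_id, cluster_id, score) triples scoring at or above
    the threshold, sorted by descending score.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    summary_vecs = {
        cid: Counter(tokenize(summary)) for cid, summary in cluster_summaries.items()
    }
    links = []
    for note_id, note in release_notes.items():
        note_vec = Counter(tokenize(note))
        for cid, summary_vec in summary_vecs.items():
            score = cosine(note_vec, summary_vec)
            if score >= threshold:
                links.append((note_id, cid, score))
    return sorted(links, key=lambda link: -link[2])


# Hypothetical release note entries and generated virtual-class summaries.
notes = {
    "RN-1": "Fixed crash when parsing empty config file",
    "RN-2": "Added dark theme to settings dialog",
}
summaries = {
    "VC-1": "Change ConfigParser to handle empty file when parsing",
    "VC-2": "Add theme option to SettingsDialog",
}

links = recover_links(notes, summaries)
for note_id, cluster_id, score in links:
    print(f"{note_id} -> {cluster_id} (similarity {score:.2f})")
```

Comparing release notes against one summary sentence per cluster, rather than against every changed entity, is what keeps the number of candidate links small. A syntax-aware sentence similarity measure could replace `cosine` without changing the surrounding structure.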
