Software Repositories: A Source for Traceability Links

This paper analyzes six open source projects to assess software repositories, such as those managed by Subversion, as a source for discovering traceability links between different types of software artifacts. Our findings suggest that software repositories store a variety of artifacts that are central to open source development and use. Furthermore, a heuristic-based approach that uses sequential-pattern mining is presented. This approach analyzes commits in a version history to mine for frequently co-occurring changes to different artifacts (e.g., source code and documentation). The hypothesis is that if different types of artifacts are frequently committed together, then there is a high probability that a traceability link exists between them. Examples of traceability links mined in our preliminary experiments on KDE (K Desktop Environment) repositories are presented.
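The co-change hypothesis can be illustrated with a minimal sketch. This is not the sequential-pattern mining used in the paper; it is a simpler co-occurrence count, and it assumes each commit is available as a list of changed file paths. The extension sets, the `artifact_type` and `candidate_links` helpers, and the `min_support` threshold are hypothetical names introduced only for illustration.

```python
import os
from collections import Counter
from itertools import product

# Hypothetical extension sets for classifying artifacts (an assumption,
# not taken from the paper).
CODE_EXT = {".c", ".cpp", ".h"}
DOC_EXT = {".docbook", ".html", ".txt"}


def artifact_type(path):
    """Classify a file path as 'code', 'doc', or 'other' by its extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext in CODE_EXT:
        return "code"
    if ext in DOC_EXT:
        return "doc"
    return "other"


def candidate_links(commits, min_support=2):
    """Return (code file, doc file) pairs committed together at least
    min_support times, mapped to their co-change counts."""
    counts = Counter()
    for files in commits:
        code = sorted({f for f in files if artifact_type(f) == "code"})
        docs = sorted({f for f in files if artifact_type(f) == "doc"})
        # Every code/doc pair in the same commit counts as one co-change.
        counts.update(product(code, docs))
    return {pair: n for pair, n in counts.items() if n >= min_support}
```

Given a history in which `src/view.cpp` and `doc/index.docbook` appear together in two commits, `candidate_links(history, min_support=2)` would report that pair as a candidate traceability link. A sequential-pattern miner would additionally take the order of changes across commits into account.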
