Feature location via information retrieval based filtering of a single scenario execution trace

The paper presents a semi-automated technique for feature location in source code. The technique combines information from two sources: an execution trace on the one hand, and the comments and identifiers from the source code on the other. Users execute a single partial scenario that exercises the desired feature, and all executed methods are identified from the collected trace. The source code is indexed using Latent Semantic Indexing, an Information Retrieval method, which allows users to write queries relevant to the desired feature and to rank all the executed methods by their textual similarity to the query. Two case studies on open source software (jEdit and Eclipse) indicate that the new technique has high accuracy, comparable with previously published approaches, and that it is easy to use because it considerably simplifies the dynamic analysis.
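The two-step idea described above (keep only the methods that appear in the trace, then rank them against a textual query) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: plain term-frequency cosine similarity stands in for Latent Semantic Indexing (the SVD-based dimensionality reduction is omitted), and the method names, method texts, trace contents, and query are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split identifiers and comments into lowercase words (camelCase aware)."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [w.lower() for w in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_executed_methods(corpus, executed, query):
    """Rank only the methods seen in the trace by similarity to the query.

    NOTE: term-frequency cosine is a stand-in for LSI in this sketch.
    """
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(corpus[name]))), name)
        for name in executed
        if name in corpus
    ]
    # Highest score first; ties broken by method name for determinism.
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical corpus: method name -> identifiers and comments of that method.
corpus = {
    "Buffer.insertText": "insertText // insert text at the caret offset in the buffer",
    "Search.findNext": "findNext // find next match of the search pattern",
    "Search.replaceAll": "replaceAll // replace all matches of the search pattern",
    "View.repaint": "repaint // redraw the text area",
}

# Hypothetical trace of a single scenario: the methods that actually ran.
executed = {"Search.findNext", "View.repaint", "Buffer.insertText"}

ranked = rank_executed_methods(corpus, executed, "search find pattern")
print(ranked)  # ['Search.findNext', 'Buffer.insertText', 'View.repaint']
```

Note that `Search.replaceAll` matches the query textually but is never ranked, because it does not occur in the trace; this is the filtering effect the technique relies on.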