Comparing manual and automated feature location in conceptual models: A Controlled experiment

Abstract

Context: Maintenance activities cannot be completed without locating the set of software artifacts that realize a particular feature of a software system. Manual Feature Location (FL) is widely used in industry, but it becomes challenging (time-consuming and error-prone) in large software repositories. To reduce manual effort, automated FL techniques have been proposed. Research on FL tends to compare automated FL techniques with one another, ignoring manual FL techniques. Moreover, existing research focuses on code and neglects other artifacts such as models.

Objective: This paper compares manual FL against automated FL in models to answer questions about the performance, productivity, and satisfaction of both treatments.

Method: We ran an experiment comparing manual and automated FL with a set of 18 subjects (5 experts and 13 non-experts) in the domain of our industrial partner, BSH, a manufacturer of induction hobs for more than 15 years. We measured performance (recall, precision, and F-measure), productivity (ratio between F-measure and time spent), and satisfaction (perceived ease of use, perceived usefulness, and intention to use) for both treatments, and performed statistical tests to assess whether the observed differences are significant.

Results: Regarding performance, manual FL significantly outperforms automated FL in precision and F-measure (by up to 27.79% and 19.05%, respectively), whereas automated FL significantly outperforms manual FL in recall (by up to 32.18%). Regarding productivity, manual FL obtains 3.43%/min, which is significantly better than automated FL. Finally, there are no significant differences in satisfaction between the two treatments.

Conclusions: The findings of our work can be leveraged to advance research on improving the results of manual and automated FL techniques. For instance, automated FL in industry faces issues such as low discrimination capacity. In addition, the obtained satisfaction results have implications for the usage and possible combination of manual, automated, and guided FL techniques.
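The performance and productivity measures named in the Method can be illustrated with a minimal Python sketch. It assumes that an FL result and the ground truth are both sets of model element identifiers, and that productivity is the F-measure expressed as a percentage divided by the minutes spent (consistent with the %/min unit above). The function names, element identifiers, and the 20-minute duration are hypothetical and are not taken from the experiment.

```python
def precision_recall_f(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall, F-measure) for one feature location result.

    retrieved: model elements proposed as realizing the feature (assumed representation)
    relevant:  model elements in the ground-truth realization (assumed representation)
    """
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def productivity(f_measure: float, minutes: float) -> float:
    """F-measure as a percentage per minute spent (assumed reading of %/min)."""
    return 100 * f_measure / minutes


# Hypothetical example: four elements proposed, three in the ground truth.
retrieved = {"e1", "e2", "e3", "e4"}
relevant = {"e2", "e3", "e5"}
p, r, f = precision_recall_f(retrieved, relevant)
prod = productivity(f, minutes=20)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f} productivity={prod:.2f}%/min")
```

In this sketch, a subject who retrieves many elements tends to gain recall at the cost of precision, which mirrors the trade-off reported between automated and manual FL.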
