Semi-automatic selection of primary studies in systematic literature reviews: is it reasonable?

The systematic review (SR) is a methodology used to find and aggregate all relevant existing evidence about a specific research question of interest. One of the activities associated with the SR process is the selection of primary studies, which is a time-consuming manual task. The quality of primary study selection affects the overall quality of the SR. The goal of this paper is to propose a strategy named “Score Citation Automatic Selection” (SCAS) to automate part of the primary study selection activity. The SCAS strategy combines two different features, the content of the studies and the citation relationships between them, to make the selection activity as automated as possible. To evaluate the feasibility of our strategy, we conducted an exploratory case study comparing the accuracy of selecting primary studies manually with that of selecting them using the SCAS strategy. The case study shows that, for three SRs published in the literature and previously conducted manually, the average effort reduction was 58.2% when applying the SCAS strategy to automate part of the initial selection of primary studies, and the percentage error was 12.98%. Our case study provided confidence in our strategy and suggested that it can reduce the effort required to select primary studies without adversely affecting the overall results of the SR.
