Reliability of search in systematic reviews: Towards a quality assessment framework for the automated-search strategy

Abstract

Context: The trust in systematic literature reviews (SLRs) to provide credible recommendations is critical for establishing evidence-based software engineering (EBSE) practice. The reliability of SLR as a method is not a given and largely depends on the rigor of the attempt to identify, appraise and aggregate evidence. Previous research, by comparing SLRs on the same topic, has identified search as one of the reasons for discrepancies in the included primary studies. This affects the reliability of an SLR, as the papers identified and included in it are likely to influence its conclusions.

Objective: We aim to propose a comprehensive evaluation checklist to assess the reliability of an automated-search strategy used in an SLR.

Method: Using a literature review, we identified guidelines for designing and reporting automated-search as a primary search strategy. Using the aggregated design, reporting and evaluation guidelines, we formulated a comprehensive evaluation checklist. The value of this checklist was demonstrated by assessing the reliability of search in 27 recent SLRs.

Results: Using the proposed evaluation checklist, several additional issues (not captured by the current evaluation checklist) related to the reliability of search in recent SLRs were identified. These issues severely limit the coverage of literature by the search and also the possibility to replicate it.

Conclusion: Instead of solely relying on expensive replications to assess the reliability of SLRs, this work provides means to objectively assess the likely reliability of a search-strategy used in an SLR. It highlights the often-assumed aspect of repeatability of search when using automated-search. Furthermore, by explicitly considering repeatability and consistency as sub-characteristics of a reliable search, it provides a more comprehensive evaluation checklist than the ones currently used in EBSE.
