SYMBALS: A Systematic Review Methodology Blending Active Learning and Snowballing

Research output has grown significantly in recent years, often making it difficult to see the forest for the trees. Systematic reviews are the natural scientific tool to provide clarity in these situations. However, they are protracted processes that require expertise to execute, which is problematic in a constantly changing environment. To address these challenges, we introduce an innovative systematic review methodology: SYMBALS. SYMBALS blends the traditional method of backward snowballing with the machine learning method of active learning. We applied our methodology in a case study, demonstrating its ability to swiftly yield broad research coverage. We validated our method in a replication study, in which SYMBALS accelerated title and abstract screening by a factor of 6. Additionally, four benchmarking experiments demonstrated the ability of our methodology to outperform the state-of-the-art systematic review methodology FAST2.
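
To make the blend of active learning and backward snowballing concrete, the following is a minimal sketch, not the SYMBALS implementation. It assumes a toy token-count scorer in place of a real classifier and a "stop after a run of irrelevant papers" rule, which is also an assumption. The names `active_learning_screen`, `backward_snowball`, `oracle`, and `stop_after`, as well as the example papers, are hypothetical.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def score(text, rel_counts, irr_counts):
    # Toy relevance score (assumption): tokens seen in relevant papers
    # count for, tokens seen in irrelevant papers count against.
    return sum(rel_counts[t] - irr_counts[t] for t in set(tokenize(text)))


def active_learning_screen(pool, oracle, seed_ids, stop_after=2):
    """Screen `pool` (id -> abstract) by always asking the reviewer
    (`oracle`, id -> bool) about the highest-scoring unlabeled paper.
    Stops after `stop_after` consecutive irrelevant papers (assumed rule)."""
    labels = {pid: oracle(pid) for pid in seed_ids}
    misses = 0
    while misses < stop_after and len(labels) < len(pool):
        # Re-fit the toy model on everything labeled so far.
        rel, irr = Counter(), Counter()
        for pid, is_relevant in labels.items():
            (rel if is_relevant else irr).update(set(tokenize(pool[pid])))
        unlabeled = [pid for pid in pool if pid not in labels]
        best = max(unlabeled, key=lambda pid: score(pool[pid], rel, irr))
        labels[best] = oracle(best)
        misses = 0 if labels[best] else misses + 1
    return labels


def backward_snowball(included, references, already_seen, oracle):
    """Follow the reference lists of included papers and screen any
    paper that has not been screened yet."""
    seen = set(already_seen)
    found = set()
    frontier = list(included)
    while frontier:
        pid = frontier.pop()
        for ref in references.get(pid, []):
            if ref in seen:
                continue
            seen.add(ref)
            if oracle(ref):
                found.add(ref)
                frontier.append(ref)  # snowball from the new inclusion too
    return found


# Hypothetical example data: a search result pool and reference lists.
pool = {
    "p1": "security metrics measurement framework",
    "p2": "cyber security metrics survey",
    "p3": "protein folding simulation",
    "p4": "measurement of security risk metrics",
    "p5": "galaxy formation simulation",
    "p6": "crop yield simulation",
    "p7": "soil moisture simulation",
}
references = {"p2": ["r1", "p3"], "r1": ["r2"]}
truly_relevant = {"p1", "p2", "p4", "r1"}


def oracle(pid):
    # Stands in for the human reviewer's inclusion decision.
    return pid in truly_relevant


labels = active_learning_screen(pool, oracle, seed_ids=["p1", "p3"])
included = {pid for pid, is_relevant in labels.items() if is_relevant}
found = backward_snowball(included, references, labels.keys(), oracle)

print("screened:", sorted(labels))
print("included:", sorted(included))
print("added by snowballing:", sorted(found))
```

In this toy run the reviewer never has to screen `p7`, and snowballing surfaces `r1`, a relevant paper that the database search missed. A real pipeline would replace the scorer with a trained classifier and the oracle with human screening decisions.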
