Title, abstract and keyword searching resulted in poor recovery of articles in systematic reviews of epidemiologic practice.

OBJECTIVE Article full texts are often inaccessible via the standard search engines for biomedical literature, such as PubMed and Embase, which are commonly used for systematic reviews. Excluding full text bodies from a literature search may result in a small or selective subset of articles being included in the review, because title, abstract and keywords carry only limited information. This article compares search strategies, based on a systematic literature review of all manuscripts published in 5 top-ranked epidemiology journals between 2000 and 2017.

STUDY DESIGN AND SETTING Using a text-mining approach, we studied whether 9 methodological topics were mentioned in each text field (title, abstract, keywords, and text body). The topics were: propensity score methods, inverse probability weighting, marginal structural modelling, multiple imputation, Kaplan-Meier estimation, number needed to treat, measurement error, randomized controlled trial, and latent class analysis.

RESULTS In total, 31,641 Hypertext Markup Language (HTML) files were downloaded from the journals' websites. For all methodological topics and journals, at most 50% of articles that mentioned a topic in the text body also mentioned it in the title, abstract or keywords. For each topic, reporting in the title, abstract or keywords decreased gradually over calendar time.

CONCLUSION Literature searches based on title, abstract and keywords alone may not be sufficiently sensitive for studies of epidemiological research practice. This study also illustrates the potential value of full text literature searches, provided that full text bodies are accessible for searching.
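The field-by-field comparison described under Study Design and Setting can be illustrated with a minimal Python sketch. It is not the authors' code: the regular expressions in `TOPIC_PATTERNS`, the dictionary layout of an article, and the function names are all assumptions made for illustration, and only three of the nine topics are shown. The quantity computed is the share of articles mentioning a topic in the text body that also mention it in the title, abstract or keywords.

```python
import re

# Hypothetical search patterns for illustration only; the search terms
# actually used in the study are not given in this abstract.
TOPIC_PATTERNS = {
    "propensity score methods": re.compile(r"propensity[\s-]+scor", re.IGNORECASE),
    "multiple imputation": re.compile(r"multipl[ey][\s-]+imput", re.IGNORECASE),
    "Kaplan-Meier estimation": re.compile(r"kaplan[\s-]+meier", re.IGNORECASE),
}

# Fields that a title/abstract/keyword search can see.
FRONT_FIELDS = ("title", "abstract", "keywords")


def mentions(article, topic):
    """Return, per text field, whether one article mentions the topic.

    `article` is assumed to be a dict of plain-text strings keyed by
    'title', 'abstract', 'keywords' and 'body'; missing fields count as empty.
    """
    pattern = TOPIC_PATTERNS[topic]
    return {
        field: bool(pattern.search(article.get(field, "")))
        for field in FRONT_FIELDS + ("body",)
    }


def front_matter_recovery(articles, topic):
    """Share of body-mentioning articles that also mention the topic
    in title, abstract or keywords. Returns None if no article
    mentions the topic in its body."""
    hits = [mentions(article, topic) for article in articles]
    in_body = [h for h in hits if h["body"]]
    if not in_body:
        return None
    also_front = sum(any(h[f] for f in FRONT_FIELDS) for h in in_body)
    return also_front / len(in_body)
```

Applied to a corpus of parsed articles, `front_matter_recovery(articles, topic)` gives the kind of proportion that the Results section reports as being at most 50%. Real HTML files would first need their title, abstract, keywords and body extracted into plain text, a step this sketch leaves out.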
