A Light-Weight Text Summarization System for Fast Access to Medical Evidence

As the volume of published medical research continues to grow rapidly, staying up-to-date with the best-available research evidence on specific topics is becoming increasingly challenging for medical experts and researchers. The current COVID-19 pandemic is a good example of a topic on which research evidence is rapidly evolving. Automatic query-focused text summarization approaches may help researchers to swiftly review research evidence by presenting salient and query-relevant information from newly-published articles in a condensed manner. Typical medical text summarization approaches require domain knowledge, and the performance of such systems relies on resource-heavy, medical domain-specific knowledge sources and pre-processing methods (e.g., text classification) for deriving semantic information. Consequently, these systems are often difficult to customize, extend, or deploy quickly in low-resource settings, and they are often operationally slow. In this paper, we propose a fast and simple extractive summarization approach that can be easily deployed and run, and may thus help medical experts and researchers obtain fast access to the latest research evidence. At runtime, our system utilizes similarity measurements derived from pre-trained medical domain-specific word embeddings in addition to simple features, rather than computationally-expensive pre-processing and resource-heavy knowledge bases. Automatic evaluation using ROUGE, a summary evaluation tool, on a public dataset for evidence-based medicine shows that our system's performance, despite the simple implementation, is statistically comparable with the state-of-the-art. Extrinsic manual evaluation based on recently-released COVID-19 articles demonstrates that the summarizer's performance is close to the level of human agreement for extractive summarization, which is generally low.
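The general idea of scoring sentences by embedding similarity to a query, combined with a simple feature, can be illustrated with a minimal sketch. This is not the system described above: the toy three-dimensional vectors, the sentence-position feature, the `position_weight` parameter, and all function names are assumptions made purely for illustration. A real setup would load pre-trained medical word embeddings instead of `TOY_EMBEDDINGS`.

```python
import math

# Toy 3-dimensional vectors standing in for pre-trained medical word
# embeddings. The values are invented for illustration only.
TOY_EMBEDDINGS = {
    "aspirin": [0.9, 0.1, 0.0],
    "reduces": [0.2, 0.7, 0.1],
    "stroke": [0.8, 0.2, 0.1],
    "risk": [0.6, 0.4, 0.0],
    "study": [0.1, 0.1, 0.9],
    "funded": [0.0, 0.1, 0.8],
    "grant": [0.0, 0.0, 0.9],
}


def sentence_vector(sentence, embeddings):
    """Average the vectors of known words; return None if no word is known."""
    words = [w.strip(".,;:?!").lower() for w in sentence.split()]
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def summarize(query, sentences, embeddings, k=1, position_weight=0.1):
    """Return the k highest-scoring sentences in their original order.

    Score = cosine(query, sentence) + position_weight * 1 / (index + 1).
    The position term is a hypothetical example of a "simple feature".
    """
    query_vec = sentence_vector(query, embeddings)
    scored = []
    for idx, sentence in enumerate(sentences):
        sent_vec = sentence_vector(sentence, embeddings)
        if query_vec is None or sent_vec is None:
            similarity = 0.0
        else:
            similarity = cosine(query_vec, sent_vec)
        score = similarity + position_weight / (idx + 1)
        scored.append((score, idx, sentence))
    top = sorted(scored, key=lambda item: item[0], reverse=True)[:k]
    # Restore document order so the extract reads naturally.
    return [sentence for _, _, sentence in sorted(top, key=lambda item: item[1])]


if __name__ == "__main__":
    document = [
        "The study was funded by a grant.",
        "Aspirin reduces stroke risk.",
    ]
    print(summarize("aspirin stroke risk", document, TOY_EMBEDDINGS, k=1))
```

Because the only runtime work is vector averaging and cosine similarity, a scorer of this kind needs no knowledge base or classifier, which is the property the abstract emphasizes.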
