Appraising UMLS Coverage for Summarizing Medical Evidence

When making clinical decisions, practitioners need to rely on the most relevant evidence available. However, working through a vast body of medical evidence and coping with information overload is challenging and time-consuming. This paper proposes an effective summarizer for medical evidence that uses both UMLS and WordNet. Given a clinical query and a set of relevant abstracts, our aim is to generate a fluent, well-organized, and compact summary that answers the query. Analysis with ROUGE metrics shows that using WordNet as a general-purpose lexicon helps to capture concepts not covered by the UMLS Metathesaurus, and hence significantly improves performance. We demonstrate the effectiveness of the proposed approach through a set of experiments over a specialized evidence-based medicine (EBM) corpus that was gathered and annotated for biomedical text summarization.
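The idea of letting a general-purpose lexicon cover terms that a domain resource misses can be illustrated with a minimal sketch. This is not the paper's implementation: the two dictionaries below are toy stand-ins for UMLS and WordNet, and the concept identifiers, `map_term`, and `coverage` are hypothetical names introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Toy stand-ins for the two resources (assumption: real systems would query
# the UMLS Metathesaurus and WordNet; these entries and IDs are made up).
TOY_UMLS: Dict[str, str] = {
    "myocardial infarction": "UMLS:heart_attack",
    "heart attack": "UMLS:heart_attack",
}
TOY_WORDNET: Dict[str, str] = {
    "relief": "WN:relief",
    "alleviation": "WN:relief",
}


def map_term(term: str) -> Optional[Tuple[str, str]]:
    """Return (source, concept_id), trying the domain resource first."""
    key = term.lower().strip()
    if key in TOY_UMLS:
        return ("umls", TOY_UMLS[key])
    if key in TOY_WORDNET:  # fallback to the general-purpose lexicon
        return ("wordnet", TOY_WORDNET[key])
    return None


def coverage(terms: List[str]) -> Dict[str, int]:
    """Count how many terms each resource accounts for."""
    counts = {"umls": 0, "wordnet": 0, "unmapped": 0}
    for term in terms:
        hit = map_term(term)
        counts[hit[0] if hit else "unmapped"] += 1
    return counts
```

Calling `coverage` on the terms of a query or sentence shows how many would have remained unmapped with the domain resource alone, which is the kind of gap the fallback is meant to close.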
