Head Concepts Selection for Verbose Medical Queries Expansion

Semantic concepts and relations encoded in domain-specific ontologies and other medical semantic resources play a crucial role in deciphering terms in medical queries and documents. The use of these resources to bridge the semantic gap has been widely studied in the literature. However, several challenges hinder their widespread use in real-world applications. One of these is that each existing medical ontology, taken individually, encodes insufficient knowledge, a problem that is magnified when users express their information needs as long-winded natural language queries. In this context, many of the users' query terms are either not recognized by the ontologies in use or cause false positives to be retrieved, which degrades the quality of current medical information search approaches. In this article, we explore the combination of multiple extrinsic semantic resources in the development of a full-fledged medical information search framework to: i) highlight and expand head medical concepts in verbose medical queries (i.e., the concepts among the query terms that contribute significantly to the informativeness and intent of a given query), ii) build a semantically enhanced inverted index of documents, and iii) contribute to a heuristic weighting technique in the query-document matching process. To demonstrate the effectiveness of the proposed approach, we conducted several experiments over the CLEF e-Health 2014 dataset. The findings indicate that the proposed method, which combines several extrinsic semantic resources, is more effective than related approaches in terms of precision.
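The general idea of highlighting head concepts, expanding them from several resources, and weighting matches can be illustrated with a minimal sketch. It is not the framework evaluated in the article: the two toy dictionaries stand in for real semantic resources, the function names are hypothetical, matching is plain substring lookup, and the weights 1.0 and 0.5 are arbitrary assumptions chosen only for illustration.

```python
# Hypothetical toy resources standing in for real medical ontologies.
# Each maps a concept to synonyms known to that resource only.
RESOURCE_A = {
    "heart attack": ["myocardial infarction"],
    "hypertension": ["high blood pressure"],
}
RESOURCE_B = {
    "heart attack": ["cardiac infarction"],
    "aspirin": ["acetylsalicylic acid"],
}
RESOURCES = [RESOURCE_A, RESOURCE_B]


def select_head_concepts(query):
    """Return concepts known to at least one resource that occur in the query."""
    text = query.lower()
    heads = []
    for resource in RESOURCES:
        for concept in resource:
            if concept in text and concept not in heads:
                heads.append(concept)
    return heads


def expand(heads):
    """Collect synonyms for each head concept from every resource."""
    expanded = {}
    for concept in heads:
        terms = []
        for resource in RESOURCES:
            for synonym in resource.get(concept, []):
                if synonym not in terms:
                    terms.append(synonym)
        expanded[concept] = terms
    return expanded


def score(document, expanded, head_weight=1.0, expansion_weight=0.5):
    """Heuristic score: original head concepts count more than expansions.

    The two weights are assumed values, not taken from the article.
    """
    text = document.lower()
    total = 0.0
    for concept, synonyms in expanded.items():
        if concept in text:
            total += head_weight
        total += expansion_weight * sum(1 for s in synonyms if s in text)
    return total


query = "My father had a heart attack last year, is aspirin safe for him to take"
heads = select_head_concepts(query)
expanded = expand(heads)
doc = "Acetylsalicylic acid after myocardial infarction reduces risk."
doc_score = score(doc, expanded)
```

In this toy case neither resource alone covers both query concepts, yet their combination lets a document that uses only clinical vocabulary match a lay-language query, which is the motivation the abstract gives for combining resources.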
