Leveraging Cognitive Search Patterns to Enhance Automated Natural Language Retrieval Performance

Searching for information in large text repositories has long been plagued by the so-called document-query vocabulary gap, i.e. the semantic discordance between the content of the stored documents on the one hand and the human query on the other. Over the past two decades, a significant body of work has advanced retrieval techniques, while several studies have shed light on human search behavior. We believe that these efforts should be conjoined, in the sense that automated retrieval systems have to fully emulate human search behavior and thus consider the procedure by which users incrementally enhance their initial query. To this end, cognitive reformulation patterns that mimic user search behavior are highlighted, and enhancement terms that are statistically collocated with or lexical-semantically related to the original terms are adopted in the retrieval process. We formalize the application of these patterns by considering a conceptual representation of the query and introducing a set of operations that modify the initial query. A genetic algorithm-based weighting process places emphasis on terms according to their conceptual role-type. An experimental evaluation is conducted on real-world datasets against relevance, language, conceptual and knowledge-based models. We also show that, when compared to language and relevance models, the approach performs better in terms of mean average precision than a word embedding-based model instantiation.
