Query Intention Acquisition: A Case Study on Automatically Inferring Structured Queries

Acquiring the user's intentions is an important phase of the querying process, because identifying those intentions enables the selection of appropriate retrieval strategies. In this paper, we first outline this cross-sectional area of work in contextual Information Retrieval. We then focus on one particular type of intention: the syntactic and semantic types associated with a query term. We present a case study using the email search task of the TREC Enterprise Track. We build and analyze a data set of query intentions linked to the structure of the emails, then attempt to automatically infer structured queries. We study the effect that the ambiguity of queries and the difficulty of inferring them have on various retrieval models, both structured and unstructured. Our study reveals that predicting intentions is a hard problem because of the inherent uncertainty within the querying process. We also show that automatically inferred queries do not outperform other types of structured retrieval models, because such queries are not robust enough to handle the ambiguity and cannot be inferred reliably enough.
