What the user does not want?: query reformulation through term inclusion-exclusion

In information retrieval, keyword-based queries often fail to capture the actual information need, especially when that need is highly specific. In natural language, however, a user can state clearly what she wants (the positive part) and what she does not want (the negative parts). We propose techniques for automatically removing the negative parts of a query and for augmenting the query through judicious inclusion and exclusion of terms drawn from those negative parts. Experiments on standard datasets such as TREC, ROBUST, and WT10G show that the proposed techniques yield substantial performance gains that are often statistically significant.
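As a rough illustration of the idea, the sketch below separates a verbose query into positive and negative sentences and derives an include list and an exclude list. It is not the method evaluated in the paper: the cue phrases in `NEGATIVE_CUES`, the small `STOPWORDS` set, and the rule that a negative term is excluded only if it never occurs in the positive part are all assumptions made for this example.

```python
import re

# Hypothetical cue phrases that mark a sentence as "negative" (assumption).
NEGATIVE_CUES = ("not relevant", "irrelevant", "non-relevant", "not of interest")

# Tiny illustrative stopword list (assumption), including the cue words themselves.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "to", "are", "is", "not",
             "and", "or", "about", "that", "relevant", "irrelevant", "but"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def split_query(narrative):
    """Split a verbose query into positive and negative sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", narrative) if s.strip()]
    positive, negative = [], []
    for s in sentences:
        is_negative = any(cue in s.lower() for cue in NEGATIVE_CUES)
        (negative if is_negative else positive).append(s)
    return positive, negative


def reformulate(narrative):
    """Return (include_terms, exclude_terms) in order of first appearance."""
    positive, negative = split_query(narrative)
    include = []
    for s in positive:
        for t in tokenize(s):
            if t not in include:
                include.append(t)
    exclude = []
    for s in negative:
        for t in tokenize(s):
            # Assumed "judicious" rule: never exclude a term the user also asked for.
            if t not in include and t not in exclude:
                exclude.append(t)
    return include, exclude


include, exclude = reformulate(
    "Find reports on solar energy storage. Reports on wind turbines are not relevant."
)
print(include, exclude)
```

In this toy query the term "reports" occurs in both parts, so it stays on the include list and is kept off the exclude list; only terms unique to the negative sentence become exclusion candidates.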
