Paper information - An error driven approach to query segmentation

An error driven approach to query segmentation

Query segmentation is the task of splitting a query into a sequence of non-overlapping segments that together cover every token in the query. The majority of query segmentation methods are unsupervised. In this paper, we propose an error-driven approach to query segmentation (EDQS) that uses search logs to guide unsupervised training with system-specific errors. In EDQS, we first detect the system's errors by examining the consistency among the segmentations of similar queries. Then, a model is trained on the detected errors to select the correct segmentation of a new query from the top-n outputs of the system. Our evaluation results show that EDQS can significantly boost the performance of state-of-the-art query segmentation methods on a publicly available data set.
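As a rough illustration of the two ideas in the abstract, the following Python sketch enumerates the valid segmentations of a query and flags segmentation decisions that disagree with the majority across other queries. It is not the EDQS method itself: the abstract does not say how consistency is measured or how similar queries are chosen, so the majority vote over adjacent token pairs is an assumption made here, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def enumerate_segmentations(tokens):
    """Yield every way to split tokens into non-overlapping segments
    that together cover all tokens (2**(n-1) ways for n >= 1 tokens)."""
    n = len(tokens)
    for k in range(n):
        for breaks in combinations(range(1, n), k):
            bounds = (0, *breaks, n)
            yield [tuple(tokens[i:j]) for i, j in zip(bounds, bounds[1:])]


def is_valid_segmentation(tokens, segments):
    """Check that the segments are non-empty and concatenate back to tokens."""
    flat = [t for seg in segments for t in seg]
    return flat == list(tokens) and all(len(seg) > 0 for seg in segments)


def boundary_decisions(segments):
    """Map each adjacent token pair to True if a segment break separates it.
    Simplification: a pair repeated inside one query keeps its last decision."""
    flat = [t for seg in segments for t in seg]
    break_positions = set()
    pos = 0
    for seg in segments[:-1]:
        pos += len(seg)
        break_positions.add(pos)
    return {(flat[i - 1], flat[i]): i in break_positions
            for i in range(1, len(flat))}


def suspected_errors(segmented_queries):
    """Flag (query index, token pair) where the break decision is in the
    minority among all queries containing that pair.
    ASSUMPTION: majority voting stands in for the paper's consistency check."""
    votes = defaultdict(Counter)
    for segments in segmented_queries:
        for pair, is_break in boundary_decisions(segments).items():
            votes[pair][is_break] += 1
    flagged = []
    for idx, segments in enumerate(segmented_queries):
        for pair, is_break in boundary_decisions(segments).items():
            if votes[pair][is_break] < votes[pair][not is_break]:
                flagged.append((idx, pair))
    return flagged


# Hypothetical system outputs for three queries that share "new york".
system_output = [
    [("new", "york"), ("hotels",)],
    [("new", "york"), ("times",)],
    [("cheap",), ("new",), ("york",), ("flights",)],
]
print(suspected_errors(system_output))
```

In this toy input the third query splits "new" and "york" while the other two keep them together, so that decision is the one flagged. A fuller pipeline would then use such flagged cases as training signal for a model that re-ranks the system's top-n candidate segmentations; that step is omitted because the abstract gives no detail about the model.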

Jian Su | Wei Zhang | Chew Lim Tan | Chin-Yew Lin | Yunbo Cao
