Paper information - Using Query Performance Predictors to Improve Spoken Queries


The goal of query performance prediction is to estimate a query’s retrieval effectiveness without user feedback. Past research has investigated the usefulness of query performance predictors for the task of reducing verbose textual queries. The basic idea is to automatically find a shortened version of the original query that yields better retrieval performance. To date, such techniques have been applied to TREC topic descriptions (as surrogates for verbose queries) and to long textual queries issued to a web search engine. In this paper, we build upon an existing query reduction approach that was applied to TREC topic descriptions and evaluate its generalizability to the new task of reducing spoken query transcriptions. Our results show that we are able to outperform the original spoken query by a small but significant margin. Furthermore, we show that the terms omitted from better-performing sub-queries include extraneous terms not central to the query topic, disfluencies, and speech recognition errors.
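The query reduction idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a single pre-retrieval predictor (average IDF), a toy table of document frequencies (`DOC_FREQ`, `NUM_DOCS`), and exhaustive enumeration of sub-queries. All names and numbers are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations

# Hypothetical collection statistics: term -> number of documents containing it.
DOC_FREQ = {"um": 900, "find": 700, "the": 990,
            "cheap": 120, "flights": 60, "boston": 40}
NUM_DOCS = 1000  # assumed collection size


def idf(term):
    """Smoothed inverse document frequency; unseen terms get df = 0."""
    return math.log((NUM_DOCS + 1) / (DOC_FREQ.get(term, 0) + 1))


def avg_idf(terms):
    """Toy pre-retrieval predictor: mean IDF of the query terms."""
    return sum(idf(t) for t in terms) / len(terms)


def sub_queries(terms, min_len=2):
    """Yield every order-preserving subset with at least min_len terms,
    including the full query itself."""
    for k in range(min_len, len(terms) + 1):
        for idx in combinations(range(len(terms)), k):
            yield [terms[i] for i in idx]


def best_sub_query(terms, predictor=avg_idf, min_len=2):
    """Return the candidate sub-query with the highest predicted quality."""
    return max(sub_queries(terms, min_len), key=predictor)


# A made-up spoken query transcription with a disfluency ("um").
spoken = "um find the cheap flights boston".split()
best = best_sub_query(spoken)
print(best)
```

The number of candidate sub-queries grows exponentially with query length, so exhaustive enumeration only suits short queries. Average IDF alone also favors rare terms, which could include misrecognized words, so a single predictor like this is best read as a stand-in for whatever set of predictors a real system would use.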

Fernando Diaz | Jaime Arguello | Sandeep Avula
