Boosting Web Retrieval through Query Operations

We explore the use of phrase and proximity terms in web retrieval, which differs from traditional ad-hoc retrieval in both document structure and query characteristics. We show that, for this type of task, using both phrase and proximity terms is highly beneficial for early precision as well as for overall retrieval effectiveness. We also analyze why phrase and proximity terms are far more effective for web retrieval than for ad-hoc retrieval.
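
To make the two kinds of query terms concrete, the following is a minimal sketch of how a phrase term and a proximity term might be matched against a tokenized document. It is not the paper's implementation: the abstract does not define the operators, so the definitions here (a phrase as adjacent, in-order terms; proximity as all terms falling inside a fixed-size, unordered token window) are assumptions, and the function names `positions`, `phrase_match`, and `proximity_match` are hypothetical.

```python
from itertools import product


def positions(tokens, term):
    """Return the indices at which `term` occurs in `tokens`."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def phrase_match(tokens, terms):
    """True if `terms` occur adjacently and in order (assumed phrase semantics)."""
    n = len(terms)
    return any(tokens[i:i + n] == terms for i in range(len(tokens) - n + 1))


def proximity_match(tokens, terms, window):
    """True if every term occurs, in any order, inside a span of `window` tokens
    (assumed proximity semantics)."""
    occurrences = [positions(tokens, t) for t in terms]
    if not all(occurrences):
        return False  # at least one term is missing from the document
    # Check every combination of one occurrence per term.
    return any(max(combo) - min(combo) < window for combo in product(*occurrences))


doc = "query operations can boost retrieval on the web".split()
print(phrase_match(doc, ["web", "retrieval"]))        # False: not adjacent
print(proximity_match(doc, ["web", "retrieval"], 5))  # True: 3 positions apart
```

Under these assumptions, the query "web retrieval" fails as an exact phrase on the sample document but succeeds as a proximity term, which illustrates why the two operators can select different documents for the same short query.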
