Hard Queries can be Addressed with Query Splitting Plus Stepping Stones and Pathways

A key finding of the Reliable Information Access Workshop of 2003 was that in collections like those used for TREC 6-8, there are a number of hard queries for which no current search engine can return a high-quality set of results. Our Stepping Stones and Pathways (SSP) approach may offer an effective solution to such hard queries, and may also support exploration of collections whose content is unfamiliar to a person with broad interests and/or complex information needs. Our initial, promising tests of SSP had users prepare two separate short queries to launch processing. However, since a single information need is the more typical starting point, we have extended the SSP research by exploring query splitting, especially as it might apply to handling hard queries. This paper summarizes our recent results and identifies some of the future work needed.
