Navigation-aided retrieval

Users searching for information in hypermedia environments often perform querying followed by manual navigation. Yet the conventional text/hypertext retrieval paradigm does not explicitly take post-query navigation into account. This paper proposes a new retrieval paradigm, called navigation-aided retrieval (NAR), which treats both querying and navigation as first-class activities. In the NAR paradigm, querying is seen as a means to identify starting points for navigation, and navigation is guided by information supplied in the query. NAR is a generalization of the conventional probabilistic information retrieval paradigm, which implicitly assumes that no navigation takes place. This paper presents a formal model for navigation-aided retrieval and reports empirical results that point to the real-world applicability of the model. The experiments were performed over a large Web corpus provided by TREC, using human judgments on a new rating scale developed for navigation-aided retrieval. For ambiguous queries, the new retrieval model identifies good starting points for post-query navigation. For less ambiguous queries that need not be paired with navigation, the output closely matches that of a conventional retrieval system.
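
The abstract does not state the formal model, so the following is only a toy sketch of the general idea, not the paper's method. It assumes a hypothetical scoring rule: a page's value as a starting point is the sum of the query relevance of every page reachable within `max_hops` links, discounted by `decay` per hop. The names `links`, `relevance`, `max_hops`, and `decay` are illustrative assumptions. With `max_hops=0` the score collapses to plain per-document relevance, which mirrors the claim that conventional retrieval is the special case with no navigation.

```python
from collections import deque


def hop_distances(links, start, max_hops):
    """Breadth-first search: fewest link hops from start to each page within max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if dist[page] == max_hops:
            continue  # do not expand beyond the navigation horizon
        for nxt in links.get(page, []):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def start_point_score(links, relevance, start, max_hops=2, decay=0.5):
    """Hypothetical score: relevance reachable from start, discounted per hop."""
    dist = hop_distances(links, start, max_hops)
    return sum(relevance.get(page, 0.0) * decay ** d for page, d in dist.items())


def rank_start_points(links, relevance, max_hops=2, decay=0.5):
    """Rank all pages by start-point score, breaking ties by page name."""
    pages = set(relevance) | set(links)
    scored = {p: start_point_score(links, relevance, p, max_hops, decay) for p in pages}
    return sorted(scored, key=lambda p: (-scored[p], p))


# Made-up example: a weakly relevant hub page linking to three relevant pages.
links = {"hub": ["a", "b", "c"]}
relevance = {"hub": 0.1, "a": 0.6, "b": 0.5, "c": 0.4}

print(rank_start_points(links, relevance, max_hops=0))  # no navigation assumed
print(rank_start_points(links, relevance, max_hops=1))  # one click of navigation
```

In this made-up example the hub page ranks last when no navigation is assumed but first once a single click is allowed, which is the kind of behavior the abstract describes for ambiguous queries.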
