Enhancing retrieval with hyperlinks: A general model based on propositional argumentation systems

Fast, effective, and adaptable techniques are needed to automatically organize and retrieve information on the ever-growing World Wide Web. To that end, different strategies have been suggested for taking hypertext links into account. For example, hyperlinks have been used to (1) enhance document representation, (2) improve document ranking by propagating document scores, (3) provide an indicator of popularity, and (4) find hubs and authorities for a given topic. Although the TREC experiments have not demonstrated the usefulness of hyperlinks for retrieval, the hypertext structure is nevertheless an essential aspect of the Web and, as such, should not be ignored. The development of abstract models of the IR task was a key factor in the improvement of search engines. At present, however, conceptual tools for modeling the hypertext retrieval task are lacking, which makes it difficult to compare, improve, and reason about the existing techniques. This article proposes a general model for using hyperlinks, based on Probabilistic Argumentation Systems, in which each of the above-mentioned techniques can be stated. The model makes it possible to uncover some inconsistencies in these techniques and to take a higher-level, systematic approach to using hyperlinks for retrieval.
