Topical and Structural Linkage in Wikipedia

We explore statistical properties of links within Wikipedia. We demonstrate that a simple algorithm can predict many of the links that would normally be added to a new article, without considering the topic of the article itself. We then explore a variant of topic-oriented PageRank and show that it identifies topical links within existing articles effectively when evaluated against manual judgments of their topical relevance. Based on these results, we suggest that linkages within Wikipedia arise from a combination of structural requirements and topical relationships.
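To make the idea of topic-oriented PageRank concrete, the following is a minimal sketch of the general technique, not the specific variant evaluated in this work. It assumes the link graph is given as a dictionary from page title to outgoing link targets, and that teleportation is restricted to a set of topic pages; the function name `topic_pagerank`, its parameters, and the toy graph are all hypothetical.

```python
def topic_pagerank(links, topic_pages, damping=0.85, iterations=50):
    """Power iteration for PageRank with teleportation restricted to topic_pages.

    links: dict mapping a page to a list of pages it links to (assumed format).
    topic_pages: non-empty collection of pages that receive all teleport mass.
    """
    topic = set(topic_pages)
    nodes = set(links) | topic
    for targets in links.values():
        nodes.update(targets)

    # Teleport vector: uniform over the topic pages, zero everywhere else.
    teleport = {n: (1.0 / len(topic) if n in topic else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = links.get(n, [])
            if out:
                # Split this page's damped rank evenly among its link targets.
                share = damping * rank[n] / len(out)
                for t in out:
                    nxt[t] += share
            else:
                # Dangling page: return its damped rank to the topic pages.
                for t in topic:
                    nxt[t] += damping * rank[n] * teleport[t]
        rank = nxt
    return rank


# Toy graph (illustrative only): "Banana" is unreachable from the topic page.
toy_links = {
    "Ottawa": ["Canada", "2007"],
    "Canada": ["Ottawa"],
    "2007": ["2008"],
    "2008": ["2007"],
    "Banana": ["2007"],
}
toy_rank = topic_pagerank(toy_links, {"Ottawa"})
for page, score in sorted(toy_rank.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.4f}")
```

In this toy graph, a page that cannot be reached from the topic page receives no rank, while a heavily interlinked structural page such as a year page can still accumulate rank. One plausible way to separate the two effects, again an assumption rather than the method reported here, is to compare topic-restricted scores against scores from an ordinary, uniformly teleporting PageRank.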
