Average-Clicks: A New Measure of Distance on the World Wide Web

The pages and hyperlinks of the World Wide Web may be viewed as the nodes and edges of a directed graph. In this paper, we propose a new definition of the distance between two pages, called average-clicks. It is based on the probability of clicking a link during random surfing. We compare the average-clicks measure with the classical measure, the number of clicks between two pages, and show that average-clicks fits users' intuitions of distance better.
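The idea of a click-probability-based distance can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below rests on assumptions: a random surfer follows each of a page's out-links with equal probability, scaled by a damping factor; the length of a link is the negative logarithm of that probability; and the distance between two pages is the length of the shortest path. The names `link_length` and `average_clicks`, the damping value 0.85, and the logarithm base 7 are illustrative choices, not values taken from this text.

```python
import heapq
import math


def link_length(out_degree, damping=0.85, base=7.0):
    """Assumed length of one link: -log_base(damping / out_degree).

    A link on a page with many out-links is less likely to be clicked,
    so it counts as a longer step than a link on a sparse page.
    """
    return -math.log(damping / out_degree, base)


def average_clicks(graph, source, damping=0.85, base=7.0):
    """Shortest-path distances from `source` using link_length as edge weight.

    `graph` maps each page to the list of pages it links to (directed edges).
    Uses Dijkstra's algorithm; all weights are positive because
    damping / out_degree is always below 1.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, page = heapq.heappop(heap)
        if d > dist.get(page, math.inf):
            continue  # stale heap entry
        links = graph.get(page, [])
        if not links:
            continue  # page has no out-links
        step = link_length(len(links), damping, base)
        for target in links:
            candidate = d + step
            if candidate < dist.get(target, math.inf):
                dist[target] = candidate
                heapq.heappush(heap, (candidate, target))
    return dist


# Hypothetical toy graph: "t" is 2 clicks from "a" through a 50-link hub,
# or 3 clicks from "a" through two pages that each have a single link.
graph = {
    "a": ["hub", "b"],
    "hub": ["t"] + [f"x{i}" for i in range(49)],
    "b": ["c"],
    "c": ["t"],
}
dist = average_clicks(graph, "a")
print(dist["t"])
```

In this toy graph the plain click count prefers the route through the hub (2 clicks versus 3), while the probability-based distance prefers the longer route through the sparse pages, because finding one link among fifty is a less likely click than following two single-link pages.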
