Efficient and Robust Fully Distributed Power Method with an Application to Link Analysis

Methods for link analysis are a key component of search engines for hyperlinked document networks. Documents are assigned an importance score based on the graph structure of the hyperlinks among them. At the heart of link analysis protocols lies the problem of calculating the principal eigenvector of a suitable matrix defined on the hyperlink graph. In this paper we introduce a fully distributed method, inspired by the power method, for calculating the principal eigenvector of generic matrices, focusing on link analysis as an application. We give theoretical results that support the correctness of the approach and present experimental validation based on subsets of the WWW. Unlike other proposals, our protocol matches the sequential power method in speed and accuracy for generic matrices, even in extremely hostile failure scenarios. This allows for better scalability, fault tolerance and load balancing. Most importantly, it is a significant step towards a flexible, cheap and fully peer-to-peer search method in networks of hyperlinked documents.

1. Authors are listed in alphabetical order. This work was partially supported by the Future and Emerging Technologies unit of the European Commission through Projects BISON (IST-2001-38923) and DELIS (IST-2002-001907).

2. Telenor R&D, N-1331 Fornebu, Norway

3. University of Bologna, Mura Anteo Zamboni 7, I-40126 Bologna, Italy, and MTA RGAI, SZTE, Szeged, Hungary
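For orientation, the sequential power method that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration only, not the distributed protocol the paper introduces. The function name `power_method`, its parameters, the L1 normalization, and the three-page example graph are all assumptions made for this sketch.

```python
def power_method(matrix, iterations=200, tol=1e-12):
    """Approximate the principal eigenvector of a square matrix.

    `matrix` is a list of rows. The vector is rescaled to unit L1 norm
    after every multiplication (an assumption of this sketch).
    """
    n = len(matrix)
    x = [1.0 / n] * n  # uniform start vector
    for _ in range(iterations):
        # One step of the iteration: y = A x
        y = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(v) for v in y)
        if norm == 0:
            raise ValueError("iteration collapsed to the zero vector")
        y = [v / norm for v in y]
        # Stop once successive iterates agree to within tol
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Hypothetical 3-page link graph: page 0 links to pages 1 and 2,
# page 1 links to page 2, page 2 links to page 0.
# Entry [i][j] is 1/outdegree(j) if page j links to page i, else 0.
links = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
ranks = power_method(links)
```

For this small graph the iteration settles at scores of approximately 0.4, 0.2 and 0.4 for pages 0, 1 and 2, so pages 0 and 2 are ranked equally and above page 1.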
