The influence of search engines on preferential attachment

There is much current interest in the evolution of social networks, especially the Web graph, over time. "Preferential attachment" and the "copying model" are well-known models that explain the observed degree distribution of the Web graph reasonably closely. We claim that the presence of highly popular search engines like Google substantially mediates the act of hyperlink creation by limiting the author's attention to a small set of "celebrity" URLs. Page authors (who are also Web surfers) frequently (with probability p) locate pages using a search engine, and then link to popular pages among those they visit. We initiate an analysis of this more realistic process and show that, with high probability (whp), the celebrity nodes eventually accumulate a constant fraction of all links created, and that the degrees of the other nodes still follow a power-law distribution, but with a steeper exponent: $\Pr(\text{degree} = k) \propto k^{-(1 + 2/(1-p))}$ whp. Our analysis adds evidence to the recent concern that search engines present new Web pages with a steep, self-sustaining barrier to entry into well-connected, entrenched Web communities.
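The link-creation process described above can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions that are not taken from the paper: each new page creates exactly one link, the celebrity set is fixed in advance, a search-mediated link picks a celebrity uniformly at random, and all other links follow plain preferential attachment (probability proportional to in-degree plus one). The names `simulate_links`, `n_celebrities`, and `celebrity_share` are hypothetical.

```python
import random


def simulate_links(n_pages, p, n_celebrities=5, seed=0):
    """Toy search-mediated attachment process (illustrative assumptions only).

    Pages 0..n_celebrities-1 are the fixed "celebrity" pages. Each later page
    creates one link: with probability p to a uniformly chosen celebrity
    (search-mediated), otherwise by preferential attachment over all pages.
    Returns the list of in-degrees, indexed by page.
    """
    rng = random.Random(seed)
    indegree = [0] * n_celebrities
    # Each page appears once, plus once per in-link it has received, so a
    # uniform draw from this list is proportional to (in-degree + 1).
    endpoints = list(range(n_celebrities))

    for new_page in range(n_celebrities, n_pages):
        if rng.random() < p:
            target = rng.randrange(n_celebrities)  # search-mediated link
        else:
            target = rng.choice(endpoints)  # preferential attachment
        indegree[target] += 1
        endpoints.append(target)
        # Register the new page so that it can receive links later.
        indegree.append(0)
        endpoints.append(new_page)
    return indegree


def celebrity_share(indegree, n_celebrities):
    """Fraction of all links that point at the celebrity pages."""
    total = sum(indegree)
    return sum(indegree[:n_celebrities]) / total if total else 0.0


if __name__ == "__main__":
    degrees = simulate_links(n_pages=20000, p=0.5)
    print("celebrity share of links:", celebrity_share(degrees, 5))
```

In this toy version the celebrity pages receive at least the search-mediated links, so their share of all links stays bounded away from zero as the graph grows, which mirrors the constant-fraction claim qualitatively. It does not reproduce the paper's analysis or its exact exponent.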
