Large scale properties of the Webgraph

Abstract. In this paper we present an experimental study of the properties of web graphs. We study a large 2001 crawl of 200M pages and about 1.4 billion edges, made available by the WebBase project at Stanford [17]. We report our experimental findings on the topological properties of these graphs, including the number of bipartite cores and the distribution of degree, PageRank values, and strongly connected components.

[1] J. Mogul, et al. Computer networks and ISDN systems, 1995.

[2] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.

[3] Albert-László Barabási, et al. Internet: Diameter of the World-Wide Web, 1999, Nature.

[4] Taher H. Haveliwala. Efficient Computation of PageRank, 1999.

[5] Albert, et al. Emergence of scaling in random networks, 1999, Science.

[6] Ravi Kumar, et al. Trawling the Web for Emerging Cyber-Communities, 1999, Comput. Networks.

[7] Jon Kleinberg, et al. Authoritative sources in a hyperlinked environment, 1999, SODA '98.

[8] Mark Newman, et al. Models of the Small World, 2000.

[9] Andrei Z. Broder, et al. Graph structure in the Web, 2000, Comput. Networks.

[10] Alan M. Frieze, et al. A General Model of Undirected Web Graphs, 2001, ESA.

[11] R. Pastor-Satorras, et al. Dynamical and correlation properties of the internet, 2001, Physical Review Letters.

[12] Mihaela Enachescu, et al. Variations on Random Graph Models for the Web, 2001.

[13] Ulrich Meyer, et al. Heuristics for semi-external depth first search on directed graphs, 2002, SPAA '02.

[14] Ravi Kumar, et al. Self-similarity in the web, 2001, TOIT.

[15] David M. Pennock, et al. Winners don't take all: Characterizing the competition for links on the web, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[16] Guido Caldarelli, et al. A Multi-Layer Model for the Web Graph, 2002, WebDyn@WWW.

[17] Michael Mitzenmacher, et al. A Brief History of Generative Models for Power Law and Lognormal Distributions, 2004, Internet Math.

[18] Ulrich Meyer, et al. Algorithms and Experiments for the Webgraph, 2003, ESA.

[19] Béla Bollobás, et al. Robustness and Vulnerability of Scale-Free Random Graphs, 2004, Internet Math.

[20] Eli Upfal, et al. Using PageRank to Characterize Web Structure, 2002, Internet Math.