Looking at the Blogosphere Topology through Different Lenses

The blogosphere is a vast and dynamic complex network. Any examination of the structure of such a network depends on which blogs are sampled and on the time frame of the sample. By comparing two large blog datasets, we demonstrate that samples may differ significantly in their coverage yet still show consistent aggregate network properties. We further compare the structure of a blog dataset with and without spam blogs, which account for a majority of the links in one sample. We also show that properties such as degree distributions and clustering coefficients depend on the
