Mining shared social media links to support clustering of blog articles

When blog articles are monitored to track a particular person or product, automatically identifying topic clusters is of great interest. Clustering by textual content is a popular way to do this. In this paper, we investigate how links between individual blog articles can support this clustering by adding another dimension of information. Starting from the existing component structure of these link networks, we focus on extending it with links derived from shared social media resources. We show that the component structure extended in this way is highly useful for supporting textual clustering algorithms and may serve as the basis for a new type of hybrid clustering algorithm in the future.
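
The idea of extending a component structure with shared-resource links can be illustrated with a minimal Python sketch. It is not the paper's implementation: the article identifiers, the resource identifiers such as `video:x`, and both helper functions are hypothetical, and it assumes that two articles are linked whenever they embed at least one common social media resource. The sketch computes connected components from direct article-to-article links alone, then again after adding the shared-resource links.

```python
from collections import defaultdict
from itertools import combinations


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


def shared_resource_edges(resources_by_article):
    """Link every pair of articles that embed at least one common resource."""
    articles_by_resource = defaultdict(set)
    for article, resources in resources_by_article.items():
        for resource in resources:
            articles_by_resource[resource].add(article)

    edges = set()
    for group in articles_by_resource.values():
        edges.update(combinations(sorted(group), 2))
    return edges


# Hypothetical toy data: five articles, one direct link, two shared resources.
articles = ["a1", "a2", "a3", "a4", "a5"]
direct_links = {("a1", "a2")}
resources_by_article = {
    "a1": {"video:x"},
    "a2": set(),
    "a3": {"video:x", "photo:y"},
    "a4": {"photo:y"},
    "a5": set(),
}

before = connected_components(articles, direct_links)
after = connected_components(
    articles, direct_links | shared_resource_edges(resources_by_article)
)
print(len(before), "components from direct links only")
print(len(after), "components after adding shared-resource links")
```

In this toy data, the shared video and photo merge three previously separate components into one, while the article with no links or shared resources stays isolated. Component membership of this kind could then be passed to a textual clustering step as an additional signal, for example as a feature or a constraint.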
