Weblogs have become a leading form of self-publication on the web. A personal weblog is often taken to represent its author, so the links between weblogs can naturally be read as social interactions. Against this background, finding a community around a given weblog (i.e., identifying a set of weblogs that forms a natural group together with the starting point, for content or social reasons) is a very natural task. Traditional community-finding methods focus almost exclusively on topology analysis. In this paper we present a novel method for discovering weblog communities that incorporates both topology analysis and content analysis. We evaluate our method in a small-scale user study, analyze the contributions of the various components of our approach, and compare it against a state-of-the-art topology-based community-finding algorithm.