Blog Community Discovery and Evolution Based on Mutual Awareness Expansion

Some information needs involve costly decisions and cannot be satisfied efficiently by conventional Web search engines. Community-centric search offers an alternative: it surfaces multiple viewpoints to support decision making. We propose to discover thematic communities and model their temporal dynamics based on mutual awareness, where awareness arises from observable blogger actions and the expansion of mutual awareness leads to community formation. Given a query, we construct a directed action graph that is time-dependent and weighted with respect to the query. We model mutual awareness expansion as a random walk process and extract communities based on that model. We propose an interaction-space representation to quantify community dynamics: each community is represented as a vector in the interaction space, and its evolution is determined by a novel interaction correlation method. Experiments on a real-world blog dataset show promising results for community detection and insightful results for community evolution.
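
The random-walk view of mutual awareness can be illustrated with a small sketch. It is not the paper's formulation: the function names, the `decay` and `steps` parameters, the self-loop for bloggers with no outgoing actions, and the use of `min` to make awareness mutual are all assumptions chosen for illustration. Edge weights stand in for the query- and time-dependent weights, which are taken here as given inputs.

```python
def transition_matrix(actions, nodes):
    """Row-normalise weighted directed actions into random-walk probabilities."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    weights = [[0.0] * n for _ in range(n)]
    for src, dst, w in actions:
        weights[index[src]][index[dst]] += w
    P = []
    for i, row in enumerate(weights):
        total = sum(row)
        if total > 0:
            P.append([w / total for w in row])
        else:
            # Assumption: a blogger with no outgoing actions keeps the walker in place.
            stay = [0.0] * n
            stay[i] = 1.0
            P.append(stay)
    return P


def matmul(A, B):
    """Plain square-matrix product, to keep the sketch dependency-free."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def awareness(P, steps=3, decay=0.5):
    """Decayed sum of the first `steps` powers of P (hypothetical awareness score)."""
    n = len(P)
    total = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in P]
    factor = decay
    for _ in range(steps):
        for i in range(n):
            for j in range(n):
                total[i][j] += factor * power[i][j]
        power = matmul(power, P)
        factor *= decay
    return total


def mutual_awareness(A):
    """Awareness counts as mutual only to the extent that it flows both ways."""
    n = len(A)
    return [[min(A[i][j], A[j][i]) for j in range(n)] for i in range(n)]


# Toy action graph: (source blogger, target blogger, query-dependent weight).
nodes = ["a", "b", "c", "d"]
actions = [("a", "b", 1.0), ("b", "a", 1.0), ("b", "c", 1.0),
           ("c", "d", 1.0), ("d", "c", 1.0)]
P = transition_matrix(actions, nodes)
M = mutual_awareness(awareness(P))
```

In this toy graph, `a` and `b` act on each other and so receive a positive mutual score, while `b` and `c` do not, because no walk leads from `c` back to `b`. Grouping bloggers with high pairwise scores would then give candidate communities; the clustering step itself is left out of the sketch.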
