CrossSense - Sensemaking in a Folksonomy with Cross-modal Clustering over Content and User Activities

Folksonomies are of increasing importance: many different platforms have emerged, and millions of people use them. We consider the case of a user who enters such a social platform and wants to get an overview of a particular domain. The folksonomy provides abundant information for that task in the form of documents, the tags assigned to them, and the users who contribute documents and tags. We propose a process that identifies a small number of thematically "interesting objects" with respect to subject domains. Our novel algorithm, CrossSense, builds clusters composed of objects of different types upon a data tensor. It then selects pivot objects, i.e., objects that are characteristic of one cluster and are associated with many objects of different types from the clusters. Next, CrossSense collects all the folksonomy content that is associated with a pivot object, i.e., the object's world. We rank the pivot objects and present the top-ranked ones to the user. We have experimented with Bibsonomy data against a baseline that selects the most popular users, documents and tags, accompanied by the objects most frequently co-occurring with them. Our experiments show that our pivot objects exhibit more homogeneity and constitute a smaller set of entities to be inspected by the user.
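To make the notions of an object's world and of the popularity baseline more concrete, the following is a minimal sketch in Python. It is not the CrossSense algorithm: it performs no tensor clustering and no pivot ranking. It assumes a folksonomy stored as (user, tag, document) triples; the data and the names `ASSIGNMENTS`, `MODES`, `object_world` and `most_popular` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Toy folksonomy: each tag assignment is a (user, tag, document) triple.
# All names and data here are hypothetical, not taken from Bibsonomy.
ASSIGNMENTS = [
    ("alice", "clustering", "doc1"),
    ("alice", "tensor", "doc1"),
    ("bob", "clustering", "doc2"),
    ("bob", "tagging", "doc3"),
    ("carol", "clustering", "doc1"),
]

# Position of each object type (mode) inside a triple.
MODES = {"user": 0, "tag": 1, "document": 2}


def object_world(assignments, mode, obj):
    """Collect all objects co-occurring with `obj`, grouped by the other modes."""
    idx = MODES[mode]
    world = {name: set() for name in MODES if name != mode}
    for triple in assignments:
        if triple[idx] == obj:
            for name, pos in MODES.items():
                if name != mode:
                    world[name].add(triple[pos])
    return world


def most_popular(assignments, mode, k=1):
    """Baseline-style selection: the k most frequent objects of one mode."""
    counts = Counter(triple[MODES[mode]] for triple in assignments)
    return [obj for obj, _ in counts.most_common(k)]


if __name__ == "__main__":
    top_tag = most_popular(ASSIGNMENTS, "tag")[0]
    print(top_tag, object_world(ASSIGNMENTS, "tag", top_tag))
```

In this sketch the most frequent tag plays the role of a baseline-selected object, and its world is simply the set of users and documents it co-occurs with. A pivot object in the sense of the abstract would instead be chosen from cross-modal clusters, which the sketch does not attempt.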
