Spectral Clustering in Social-Tagging Systems

Social tagging is an increasingly popular phenomenon with a substantial impact on how we perceive and understand the Web. For the many Web resources that are not self-descriptive, such as images, tagging is the sole way of associating them with concepts explicitly expressed in text. Consequently, users are encouraged to assign tags to Web resources, and tag recommenders are being developed to stimulate the consistent re-use of existing tags. However, a tag inevitably still expresses the personal perspective of the user who assigned it to the resource. This personal perspective should be taken into account when assessing the similarity of resources with the help of tags. In this paper, we focus on similarity-based clustering of tagged items, which can support several applications in social-tagging systems, such as information retrieval, recommendation, user profiling, and topic discovery. We show that it is necessary to capture and exploit the multiple values of similarity reflected in the tags assigned to the same item by different users. We model the items, the tags assigned to them, and the users who assigned those tags in a multigraph structure. To discover clusters of similar items, we extend spectral clustering, an approach successfully used for clustering complex data, into a method that captures multiple values of similarity between any two items. Our experiments with two real social-tagging data sets show that the new method outperforms conventional spectral clustering, which ignores the existence of multiple values of similarity among the items.
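
To make the notion of multiple similarity values concrete, the following minimal sketch builds a small item multigraph from (user, tag, item) assignments and contrasts it with a single collapsed similarity per item pair. It is an illustration under stated assumptions, not the method of the paper: the sample assignments, the function names, and the choice of Jaccard similarity over tag sets are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (user, tag, item) assignments; not taken from the paper's data sets.
ASSIGNMENTS = [
    ("alice", "beach", "photo1"),
    ("alice", "beach", "photo2"),
    ("alice", "sunset", "photo2"),
    ("bob", "holiday", "photo1"),
    ("bob", "work", "photo2"),
]


def jaccard(a, b):
    """Jaccard similarity of two tag sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def multigraph_edges(assignments):
    """Return {(item_a, item_b): {user: similarity}}.

    Each user who tagged both items contributes a separate edge, so an
    item pair can carry several similarity values at once.
    """
    tags_by_user_item = defaultdict(lambda: defaultdict(set))
    for user, tag, item in assignments:
        tags_by_user_item[user][item].add(tag)

    edges = defaultdict(dict)
    for user, item_tags in tags_by_user_item.items():
        for a, b in combinations(sorted(item_tags), 2):
            edges[(a, b)][user] = jaccard(item_tags[a], item_tags[b])
    return dict(edges)


def collapsed_similarity(assignments):
    """Return {(item_a, item_b): similarity}, ignoring who assigned each tag."""
    tags_by_item = defaultdict(set)
    for _user, tag, item in assignments:
        tags_by_item[item].add(tag)
    return {
        (a, b): jaccard(tags_by_item[a], tags_by_item[b])
        for a, b in combinations(sorted(tags_by_item), 2)
    }


if __name__ == "__main__":
    print(multigraph_edges(ASSIGNMENTS))
    print(collapsed_similarity(ASSIGNMENTS))
```

In this toy example the two users disagree about how similar the two photos are, and the collapsed view replaces both opinions with one value that matches neither. A conventional spectral clustering would operate on a single similarity matrix such as the collapsed one, whereas the multigraph keeps one value per user.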
