A semantic-based approach for searching and browsing tag spaces

In this paper, we propose the Semantic Tag Clustering Search (STCS) framework for enhancing the user experience when interacting with tagging systems. The framework consists of three parts. The first part deals with syntactic variations: it finds clusters of tags that are syntactic variations of each other and assigns a label to each cluster. The second part addresses the lack of semantics in tagging systems by recognizing contexts and constructing semantic clusters of tags. The third and final part uses the clusters obtained from the first two parts to improve the search and exploration of tag spaces. To remove syntactic variations, we use the normalized Levenshtein distance and the cosine similarity measure based on tag co-occurrences. To create semantic clusters, we employ two non-hierarchical and two hierarchical clustering techniques. To evaluate the semantic clusters, we develop a Web application called XploreFlickr.com for searching and browsing through Flickr resources.
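As an illustration of the two measures named above, the following is a minimal Python sketch, not the implementation used in STCS. It assumes that the Levenshtein distance is normalized by the length of the longer tag and that each tag is represented by a sparse vector of co-occurrence counts; the abstract specifies neither choice. The function names and the example data are hypothetical.

```python
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Distance in [0, 1]; dividing by the longer length is an assumption."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def cosine_similarity(u: dict, v: dict) -> float:
    """Cosine of the angle between two sparse co-occurrence vectors."""
    dot = sum(weight * v.get(tag, 0) for tag, weight in u.items())
    norm_u = sqrt(sum(weight * weight for weight in u.values()))
    norm_v = sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical co-occurrence counts for two spelling variants of one tag.
cooccurrence = {
    "newyork": {"manhattan": 12, "skyline": 7, "taxi": 3},
    "new-york": {"manhattan": 9, "skyline": 5, "bridge": 2},
}

distance = normalized_levenshtein("newyork", "new-york")
similarity = cosine_similarity(cooccurrence["newyork"], cooccurrence["new-york"])
print(f"normalized Levenshtein distance: {distance:.3f}")
print(f"co-occurrence cosine similarity: {similarity:.3f}")
```

A low string distance together with a high co-occurrence similarity suggests that two tags are syntactic variants of each other. How the two scores are combined, and which thresholds are applied, is not stated in the abstract.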
