Constructing Folksonomies by Integrating Structured Metadata with Relational Clustering

Many social Web sites allow users to annotate content with descriptive metadata, such as tags, and more recently also to organize content hierarchically. These types of structured metadata provide valuable evidence for learning how a community organizes knowledge. For instance, we can aggregate many personal hierarchies into a common taxonomy, also known as a folksonomy, that will help users visualize and browse social content and organize their own content. However, learning from social metadata presents several challenges: sparseness, ambiguity, noise, and inconsistency. We describe an approach to folksonomy learning based on relational clustering that addresses these challenges by exploiting the structured metadata contained in personal hierarchies. Our approach clusters similar hierarchies using their structure and tag statistics, then incrementally weaves them into a deeper, bushier tree. We study folksonomy learning using social metadata extracted from the photo-sharing site Flickr. We evaluate the learned folksonomy quantitatively by automatically comparing it to a reference taxonomy. Our empirical results suggest that the proposed framework improves on existing folksonomy learning methods.
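
The idea of weaving shallow personal hierarchies into a deeper tree can be illustrated with a toy sketch. This is not the relational clustering method the abstract refers to; it is a simple greedy stand-in, and everything in it is an assumption made for illustration: the function names `jaccard` and `merge_hierarchies`, the input layout, the rule that two nodes denote the same concept when their names match and their tag sets overlap, and the threshold of 0.3.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap between two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_hierarchies(saplings, threshold=0.3):
    """Greedily merge shallow personal hierarchies into shared concepts.

    saplings: list of (root_name, root_tags, [(child_name, child_tags), ...]).
    Assumption: two nodes are the same concept when their names match and
    their tag sets have Jaccard overlap of at least `threshold`.
    Returns (concepts, children): a list of (name, tags) indexed by concept
    id, and a dict mapping a concept id to the set of its child concept ids.
    """
    concepts = []

    def resolve(name, tags):
        # Reuse an existing concept if it is similar enough, else create one.
        for cid, (cname, ctags) in enumerate(concepts):
            if cname == name and jaccard(ctags, tags) >= threshold:
                concepts[cid] = (cname, ctags | tags)  # pool tag evidence
                return cid
        concepts.append((name, set(tags)))
        return len(concepts) - 1

    children = defaultdict(set)
    for root_name, root_tags, leaves in saplings:
        root = resolve(root_name, root_tags)
        for leaf_name, leaf_tags in leaves:
            leaf = resolve(leaf_name, leaf_tags)
            if leaf != root:
                children[root].add(leaf)
    return concepts, dict(children)


# Hypothetical data: three users' one-level hierarchies.
saplings = [
    ("animal", {"animal", "wildlife", "nature"},
     [("bird", {"bird", "wildlife", "nature"}), ("cat", {"cat", "pet"})]),
    ("bird", {"bird", "nature", "wings"},
     [("owl", {"owl", "bird", "night"})]),
    ("bird", {"bird", "badminton", "sport"},
     [("racket", {"racket", "sport"})]),
]

concepts, children = merge_hierarchies(saplings)
for parent, kids in sorted(children.items()):
    print(concepts[parent][0], "->", sorted(concepts[k][0] for k in kids))
```

In this toy data, the second user's "bird" root merges with the first user's "bird" leaf, so the combined tree gains a level (animal, bird, owl), while the badminton-related "bird" stays a separate concept because its tags barely overlap. The greedy pass depends on input order and ignores structural evidence beyond names and tags, which is one reason a clustering formulation is attractive.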
