Semantic Relation Extraction from Socially-Generated Tags: A Methodology for Metadata Generation

The growing predominance of social semantics in the form of tagging presents the metadata community with both opportunities and challenges in leveraging this new form of information content representation and retrieval. One key challenge is the absence of contextual information associated with these tags. This paper presents an experiment that uses Flickr tags as an example of drawing on social semantics sources to enrich subject metadata. The procedure comprised four steps: (1) collecting a sample of Flickr tags, (2) calculating co-occurrences between tags through mutual information, (3) tracing contextual information for tag pairs via Google search results, and (4) applying natural language processing and machine learning techniques to extract semantic relations between tags. The experiment yielded a collection of context sentences drawn from the Google search results, which was then processed by natural language processing and machine learning algorithms. This new approach achieved a reasonably good rate of accuracy in assigning semantic relations to tag pairs. The paper also explores the implications of this approach for using social semantics to enrich subject metadata.
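
As an illustration of step (2), the following is a minimal sketch of scoring tag co-occurrence with pointwise mutual information (PMI). It is not the procedure used in the paper: the abstract says only that mutual information was used, so the choice of PMI, the per-photo probability estimates, the function name `tag_pair_pmi`, and the toy tag lists are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def tag_pair_pmi(photos):
    """Return {(tag_a, tag_b): PMI} for every tag pair that co-occurs at least once.

    `photos` is a list of tag lists, one list per photo (hypothetical input format).
    Probabilities are estimated as the fraction of photos carrying a tag or a pair.
    """
    n = len(photos)
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in photos:
        unique = sorted(set(tags))  # ignore repeated tags on the same photo
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))  # pairs in alphabetical order

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        # PMI = log2( p(a, b) / (p(a) * p(b)) ), with p(x) = count(x) / n
        scores[(a, b)] = math.log2((n_ab * n) / (tag_counts[a] * tag_counts[b]))
    return scores


# Toy data, invented for this example only.
photos = [
    ["beach", "sunset", "ocean"],
    ["beach", "ocean"],
    ["sunset", "sky"],
    ["city", "night"],
]
scores = tag_pair_pmi(photos)

# Print pairs from most to least strongly associated.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Under this scoring, tag pairs with high values would be the candidates passed on to step (3) for context retrieval. On a real tag sample, rare tags inflate PMI, so a minimum co-occurrence count would likely be needed; that threshold is likewise an assumption rather than something the abstract specifies.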
