Exploring the Value of Folksonomies for Creating Semantic Metadata

Finding good keywords to describe resources is an ongoing problem: typically such words are selected manually from a thesaurus of terms, or they are created using automatic keyword extraction techniques. Folksonomies are an increasingly well-populated source of unstructured tags describing web resources. This paper explores the value of folksonomy tags as a potential source of keyword metadata by examining the relationship between folksonomies, community-produced annotations, and keywords extracted by machines. The experiment was carried out in two ways: subjectively, by asking two human indexers to evaluate the quality of the keywords generated by both systems; and automatically, by measuring the percentage of overlap between the folksonomy tag set and the machine-generated keyword set. The results show that the folksonomy tags agree more closely with the human-generated keywords than the automatically generated keywords do. The results also show that the trained indexers preferred the semantics of folksonomy tags to those of the automatically extracted keywords. These results can be considered evidence of a strong relationship between folksonomies and the human indexer’s mindset, demonstrating that the folksonomies used in the del.icio.us bookmarking service are a potential source for generating semantic metadata to annotate web resources.
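The automatic part of the evaluation, the percentage of overlap between two term sets, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes the overlap is the size of the intersection divided by the size of the union (a Jaccard-style ratio), computed after lowercasing and trimming each term. The function names and the sample terms are hypothetical and are not taken from the paper.

```python
def normalize(terms):
    """Lowercase and trim each term so trivial variants compare equal."""
    return {t.strip().lower() for t in terms if t.strip()}


def overlap_percentage(tags, keywords):
    """Percentage overlap of two term collections.

    Assumption: overlap = |A ∩ B| / |A ∪ B| * 100. The paper may
    define the measure differently.
    """
    a, b = normalize(tags), normalize(keywords)
    union = a | b
    if not union:
        return 0.0  # both sets empty: define the overlap as zero
    return 100.0 * len(a & b) / len(union)


# Hypothetical sample data for a single bookmarked page.
folksonomy_tags = ["CSS", "webdesign", "tutorial", "html"]
machine_keywords = ["css", "html", "style sheet", "browser"]

print(f"{overlap_percentage(folksonomy_tags, machine_keywords):.1f}% overlap")
```

Dividing by the union penalizes both missing and extra terms. If only coverage of one set matters, dividing by the size of that set instead would be a reasonable alternative.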
