Sense induction in folksonomies: a review

Folksonomies, often known as tagging systems, such as those used on the popular Delicious and Flickr websites, rely on a very simple Knowledge Organisation System. Users have therefore been quick to adopt them and to create extensive annotations on the Web. However, because the folksonomy model is so simple, the semantics of the tags is not explicit and can only be inferred from their context of use. This is a barrier to the automatic use of such Knowledge Organisation Systems by computers, and new techniques have been developed to extract the semantics of the tags. In this article we discuss the drawbacks of some of these approaches and propose a generalization of the different approaches to detecting new senses of terms in a folksonomy. Another weak point of the current state of the art is the lack of a formal evaluation methodology; we therefore propose a novel evaluation framework, introducing a dataset and an evaluation methodology that enable the comparison of results between different approaches to sense induction in folksonomies. Finally, we discuss the performance of different approaches to the tasks of homonymous/polysemous tag detection and synonym identification.
