Learning Word Embeddings from Tagging Data: A Methodological Comparison

The semantics hidden in natural language are an essential building block for the shared language understanding needed in areas such as NLP and the Semantic Web. Such information is hidden, for example, in lightweight knowledge representations such as tagging systems and folksonomies. While extracting relatedness from tagging systems shows promising results, the extracted information is often encoded in high-dimensional vector representations, which makes relatedness learning and word sense discovery computationally infeasible. In the last few years, methods that produce low-dimensional vector representations, so-called word embeddings, have been shown to capture rich structural and semantic features and have been used in many settings. Up to this point, there has been no in-depth exploration of the applicability of word embedding algorithms to tagging data. In this work, we explore different embedding algorithms with regard to their applicability to tagging data and the semantic quality of the word embeddings they produce. For this, we use data from three different tagging systems and evaluate the vector representations on several human intuition datasets. To the best of our knowledge, we are the first to generate embeddings from tagging data. Our results encourage the use of word embeddings based on tagging data, as they capture semantic relations between tags better than high-dimensional representations and make learning with tag representations feasible.
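The evaluation against human intuition datasets mentioned above can be illustrated with a minimal sketch. The abstract does not state which similarity measure or correlation coefficient is used; cosine similarity and Spearman rank correlation are assumptions here, chosen because they are common for this kind of evaluation. All function names, tag vectors, and human scores below are hypothetical toy values, not data from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def evaluate(embeddings, gold_pairs):
    """Correlate model similarities with human scores.

    Pairs containing a tag without an embedding are skipped.
    """
    model_scores, human_scores = [], []
    for tag_a, tag_b, human_score in gold_pairs:
        if tag_a in embeddings and tag_b in embeddings:
            model_scores.append(cosine(embeddings[tag_a], embeddings[tag_b]))
            human_scores.append(human_score)
    return spearman(model_scores, human_scores)


# Hypothetical 2-dimensional tag embeddings and human relatedness scores.
toy_embeddings = {
    "python": [0.9, 0.1],
    "programming": [0.8, 0.3],
    "recipe": [0.1, 0.9],
    "cooking": [0.2, 0.8],
}
toy_gold = [
    ("python", "programming", 9.0),
    ("recipe", "cooking", 8.5),
    ("python", "cooking", 1.5),
    ("programming", "recipe", 2.0),
]

if __name__ == "__main__":
    print(round(evaluate(toy_embeddings, toy_gold), 3))
```

A higher correlation indicates that the ordering of tag pairs by embedding similarity agrees more closely with the ordering given by human judges. In practice a library routine such as `scipy.stats.spearmanr` could replace the hand-written rank correlation.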
