You Shall Know the Most Frequent Sense by the Company it Keeps

Identification of the most frequent sense (MFS) of a polysemous word is an important semantic task. We introduce two concepts that can benefit MFS detection: companions, which are the words that most frequently co-occur with the target word, and the most frequent translation of the target word in a bitext. We present two novel methods that incorporate these concepts, and show that they advance the state of the art on MFS detection.
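To make the two concepts concrete, the following is a minimal sketch of how companions and the most frequent translation could be counted from toy data. It is not the method of the paper: the function names, the fixed context window, the stopword filter, and the assumption that the bitext is already word-aligned into (source, target) pairs are all illustrative choices.

```python
from collections import Counter

def companions(sentences, target, k=3, window=3, stopwords=frozenset()):
    """Return the k words that most often co-occur with `target`.

    Co-occurrence here means appearing within `window` tokens of the target
    in a tokenized sentence (an assumed definition, for illustration only).
    """
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] not in stopwords:
                    counts[tokens[j]] += 1
    return [word for word, _ in counts.most_common(k)]

def most_frequent_translation(aligned_pairs, target):
    """Return the translation most often aligned with `target`, or None.

    `aligned_pairs` is assumed to be an iterable of (source_word, target_word)
    pairs produced by some word aligner; the aligner itself is out of scope.
    """
    counts = Counter(tgt for src, tgt in aligned_pairs if src == target)
    return counts.most_common(1)[0][0] if counts else None

# Toy data (hypothetical, not from the paper).
sentences = [
    ["the", "bank", "approved", "the", "loan"],
    ["she", "opened", "a", "bank", "account"],
    ["the", "river", "bank", "was", "muddy"],
    ["the", "bank", "raised", "its", "loan", "rates"],
]
stop = frozenset({"the", "a", "its", "was", "she"})
aligned_pairs = [
    ("bank", "banque"),
    ("bank", "banque"),
    ("bank", "rive"),
    ("loan", "emprunt"),
]

top_companion = companions(sentences, "bank", k=1, stopwords=stop)
top_translation = most_frequent_translation(aligned_pairs, "bank")
```

In this toy corpus the top companion and the top translation both point toward the financial sense of "bank", which is the kind of signal an MFS detector could combine with a sense inventory.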
