Merging Verb Senses of Hindi WordNet using Word Embeddings

In this paper, we present an approach for merging fine-grained verb senses of Hindi WordNet. Senses are merged based on a gloss similarity score. We explore the use of word embeddings for computing gloss similarity and compare them with various WordNet-based gloss similarity measures. Our results indicate that word embeddings yield a significant improvement over WordNet-based measures, and consequently we observe an increase in accuracy when merging fine-grained senses. The gold standard data constructed for our experiments is made available.
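
The idea of merging senses by embedding-based gloss similarity can be illustrated with a minimal sketch. The abstract does not say how word vectors are composed into a gloss vector or how the merge decision is made, so everything below is an assumption for illustration only: glosses are represented by the average of their word vectors, similarity is cosine similarity, and senses whose similarity reaches a hypothetical threshold are merged transitively. The toy embeddings and English placeholder tokens are invented and stand in for vectors trained on a Hindi corpus.

```python
import math
from itertools import combinations

# Toy 3-dimensional embeddings (hypothetical values, not from the paper).
# In practice these would be vectors trained on a large Hindi corpus.
EMBEDDINGS = {
    "move": [0.9, 0.1, 0.0],
    "go": [0.8, 0.2, 0.1],
    "place": [0.7, 0.3, 0.0],
    "eat": [0.0, 0.9, 0.2],
    "food": [0.1, 0.8, 0.3],
    "consume": [0.0, 0.85, 0.25],
}


def gloss_vector(gloss):
    """Average the vectors of the in-vocabulary tokens of a gloss (assumed composition)."""
    tokens = [t for t in gloss.lower().split() if t in EMBEDDINGS]
    if not tokens:
        return None
    dim = len(EMBEDDINGS[tokens[0]])
    return [sum(EMBEDDINGS[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def gloss_similarity(gloss_a, gloss_b):
    """Similarity of two glosses; 0.0 if either has no known words."""
    vec_a, vec_b = gloss_vector(gloss_a), gloss_vector(gloss_b)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine(vec_a, vec_b)


def merge_senses(glosses, threshold=0.9):
    """Group sense indices whose gloss similarity meets the (assumed) threshold.

    Merging is transitive, implemented with a small union-find structure.
    """
    parent = list(range(len(glosses)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(glosses)), 2):
        if gloss_similarity(glosses[i], glosses[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(glosses)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Four placeholder glosses for the senses of one hypothetical verb.
glosses = ["move go place", "go move", "eat food", "consume food"]
merged = merge_senses(glosses)
print(merged)
```

With these toy vectors, the first two glosses fall into one cluster and the last two into another. The threshold of 0.9 is arbitrary; in a real setting it would be tuned against gold standard merge decisions.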
