A new approach for unsupervised word sense disambiguation in Hindi language using graph connectivity measures

Word sense disambiguation (WSD) is an important task in computational linguistics because many language understanding applications depend on it. In this paper, we propose a graph-based unsupervised WSD method for Hindi text that disambiguates all ambiguous words in a sentence simultaneously. We first construct a semantic graph for each interpretation of the given sentence by establishing semantic relations, drawn from Hindi WordNet, between pairs of words in the sentence. We then compute the cost of the spanning tree of each semantic graph and select the interpretation whose spanning tree has the minimum cost as the result. Unlike previous approaches, which focus only on nouns, our approach considers all open-class words.
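
The selection step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that an interpretation is one sense assignment per word, that each semantic graph is complete with edge weights given by a semantic distance, and that the spanning tree in question is a minimum spanning tree. The sense labels, the `TOY_DISTANCES` table, the default distance of 5, and the function names are all hypothetical; in the paper the relations come from Hindi WordNet.

```python
from itertools import product

def mst_cost(nodes, weight):
    """Total weight of a minimum spanning tree of the complete graph
    over `nodes`, using Prim's algorithm. Assumes node ids are unique."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0.0
    # Cheapest known edge from the growing tree to each outside node.
    best = {n: weight(nodes[0], n) for n in nodes[1:]}
    total = 0.0
    while best:
        nxt = min(best, key=best.get)   # closest node not yet in the tree
        total += best.pop(nxt)
        for n in best:                  # relax edges through the new node
            w = weight(nxt, n)
            if w < best[n]:
                best[n] = w
    return total

def disambiguate(senses_per_word, distance):
    """Enumerate every interpretation (one sense per word) and return the
    one whose semantic graph has the cheapest spanning tree."""
    best_interp, best_cost = None, float("inf")
    for interp in product(*senses_per_word.values()):
        cost = mst_cost(interp, distance)
        if cost < best_cost:
            best_interp, best_cost = interp, cost
    return dict(zip(senses_per_word, best_interp)), best_cost

# Hypothetical toy data (English placeholders, invented distances).
TOY_SENSES = {
    "bank": ["bank#finance", "bank#river"],
    "deposit": ["deposit#payment", "deposit#sediment"],
    "money": ["money#currency"],
}
TOY_DISTANCES = {
    frozenset({"bank#finance", "deposit#payment"}): 1,
    frozenset({"bank#finance", "money#currency"}): 1,
    frozenset({"deposit#payment", "money#currency"}): 1,
    frozenset({"bank#river", "deposit#sediment"}): 1,
}

def toy_distance(a, b):
    # Assumption: sense pairs with no listed relation get a large distance.
    return TOY_DISTANCES.get(frozenset({a, b}), 5)

if __name__ == "__main__":
    print(disambiguate(TOY_SENSES, toy_distance))
```

Because every interpretation is enumerated, the number of graphs grows with the product of the sense counts of the words, so this sketch is only practical for short sentences.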
