A statistical approach on Persian word sense disambiguation

This article studies different aspects of a new approach to resolving lexical ambiguities using statistical information gathered from a monolingual corpus. The proposed approach addresses the problem of target word selection in a machine translation system. The method is an unsupervised graph-based approach that uses a bilingual dictionary to find all possible translations of each ambiguous word in the source sentence (English) and then chooses the most appropriate alternative based on statistical information gathered from target language (Persian) corpora. In addition, two new methods for measuring semantic similarity based on source and target language corpora are introduced. The experiments show that unsupervised graph-based word sense disambiguation (WSD) using the proposed semantic similarity measures in the dependency graph significantly outperforms all other methods evaluated on WSD for translating English words to Persian.
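The selection step described above can be illustrated with a minimal sketch. It is not the method of the article: the abstract does not specify the two proposed similarity measures, so positive pointwise mutual information (PMI) over sentence-level co-occurrence is swapped in as an assumed stand-in, and candidates are scored by weighted degree rather than by any particular centrality algorithm. The sketch also links every pair of source words, whereas the article builds its graph from dependency relations. The function names, the romanized Persian tokens, the toy dictionary, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count, per sentence, which words and which unordered word pairs occur."""
    word_counts, pair_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = set(sentence.split())
        word_counts.update(tokens)
        pair_counts.update(frozenset(p) for p in combinations(sorted(tokens), 2))
    return word_counts, pair_counts, len(sentences)


def pmi(a, b, word_counts, pair_counts, n):
    """Positive PMI (assumed similarity measure); 0.0 if the pair never co-occurs."""
    joint = pair_counts[frozenset((a, b))]
    if joint == 0:
        return 0.0
    return max(0.0, math.log2(joint * n / (word_counts[a] * word_counts[b])))


def disambiguate(source_words, dictionary, sentences):
    """For each source word, pick the candidate translation with the highest
    summed similarity to the candidates of all other source words."""
    word_counts, pair_counts, n = cooccurrence_counts(sentences)
    chosen = {}
    for i, word in enumerate(source_words):
        scores = {}
        for cand in dictionary[word]:
            # Weighted degree: sum of edge weights to candidates of other words.
            scores[cand] = sum(
                pmi(cand, other, word_counts, pair_counts, n)
                for j, other_word in enumerate(source_words) if j != i
                for other in dictionary[other_word]
            )
        chosen[word] = max(scores, key=scores.get)
    return chosen


# Hypothetical toy data: romanized stand-ins for Persian tokens.
toy_dictionary = {"bank": ["baank", "saahel"], "money": ["pul"]}
toy_corpus = [
    "baank pul",
    "baank pul hesab",
    "saahel rudkhaneh",
    "saahel darya",
    "pul hesab",
]

result = disambiguate(["bank", "money"], toy_dictionary, toy_corpus)
print(result)
```

In this toy corpus the financial candidate for "bank" co-occurs with the only candidate for "money", while the "shore" candidate never does, so the financial candidate is selected. Replacing `pmi` with another similarity function changes only the edge weights, not the selection logic.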
