Paper information - Probabilistic and Neural Network Based POS Tagging of Ambiguous Nepali text: A Comparative Study

Part-of-speech (POS) tagging is the task of assigning a part-of-speech tag to each word of a text, and there are various approaches to it. This article presents a comprehensive study and comparison of two techniques for POS tagging of Nepali text: one based on a Hidden Markov Model (HMM) and one based on a General Regression Neural Network (GRNN). The two taggers resolve the ambiguity that arises in POS tagging of Nepali text through different approaches. The taggers are evaluated on the corpora developed and provided by TDIL (Technology Development for Indian Languages). Apart from the corpora, the Python and Java programming languages and the NLTK library have been used for the implementation. Both taggers achieve an accuracy of 100 percent for known words (with no ambiguity); for ambiguous words the accuracy is 58.29 percent (HMM) and 60.45 percent (GRNN), and for non-ambiguous unknown words it is 85.36 percent (GRNN).
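To make the HMM side of the comparison concrete, the following is a minimal sketch of a first-order HMM tagger decoded with the Viterbi algorithm. It is not the authors' implementation: the toy English corpus, the add-one smoothing, and all function names (`train_hmm`, `log_prob`, `viterbi`) are assumptions made here for illustration, and the TDIL Nepali corpus and NLTK are not used.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (English placeholder data, not the TDIL Nepali corpus).
# The word "run" is ambiguous: it appears as both NOUN and VERB.
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
]


def train_hmm(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    tags = sorted(emit)
    vocab = {w for counts in emit.values() for w in counts}
    return trans, emit, tags, vocab


def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (the smoothing is an assumption)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))


def viterbi(words, trans, emit, tags, vocab):
    """Return the most probable tag sequence for `words`."""
    if not words:
        return []
    n_words = len(vocab) + 1  # one extra outcome reserved for unknown words
    n_tags = len(tags)
    # best[i][t]: best log score of any path ending in tag t at position i
    best = [{t: log_prob(trans["<s>"], t, n_tags)
                + log_prob(emit[t], words[0], n_words) for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + log_prob(trans[p], t, n_tags)) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + log_prob(emit[t], words[i], n_words)
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


model = train_hmm(TRAIN)
print(viterbi(["the", "run", "ends"], *model))  # ['DET', 'NOUN', 'VERB']
print(viterbi(["dogs", "run"], *model))         # ['NOUN', 'VERB']
```

In this toy setting the ambiguous word "run" receives a different tag depending on its neighbours, which is the kind of ambiguity resolution the abstract refers to. A real tagger would be trained on the annotated corpus and would need a more careful treatment of unknown words than add-one smoothing.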

Archit Yajnik | Ashish Pradhan
