Bengali parts-of-speech tagging using Global Linear Model

This paper describes an automatic part-of-speech (POS) tagger for Bengali sentences based on a Global Linear Model (GLM), which learns to represent the whole sentence through a feature vector called the global feature. The tagger is trained using the averaged perceptron algorithm. Its performance is compared with Bengali POS taggers based on Conditional Random Fields (CRF), Support Vector Machines (SVM), Hidden Markov Models (HMM), and Maximum Entropy (ME). Experimental results show that the GLM-based Bengali POS tagger achieves an accuracy of 93.12%.
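To make the idea of a global feature vector trained with an averaged perceptron more concrete, the following is a minimal sketch rather than the paper's implementation. The feature templates (word identity, two-character suffix, previous-tag transition), the first-order Viterbi decoder, the function names, and the romanized toy sentences are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
from collections import defaultdict

START = "<s>"  # hypothetical start-of-sentence tag


def local_features(words, i, prev_tag, tag):
    """Illustrative feature templates for position i (assumed, not from the paper)."""
    return [
        ("word", words[i], tag),
        ("suffix", words[i][-2:], tag),
        ("trans", prev_tag, tag),
    ]


def global_features(words, tags):
    """Global feature vector: local feature counts summed over the whole sentence."""
    phi = defaultdict(int)
    prev = START
    for i, tag in enumerate(tags):
        for f in local_features(words, i, prev, tag):
            phi[f] += 1
        prev = tag
    return phi


def viterbi(words, tagset, w):
    """Return the highest-scoring tag sequence for a non-empty sentence."""
    def score(i, prev, tag):
        return sum(w.get(f, 0.0) for f in local_features(words, i, prev, tag))

    # best[i][tag] = (best score of a path ending in tag at i, previous tag)
    best = [{t: (score(0, START, t), None) for t in tagset}]
    for i in range(1, len(words)):
        best.append({
            t: max((best[i - 1][p][0] + score(i, p, t), p) for p in tagset)
            for t in tagset
        })
    tag = max(tagset, key=lambda t: best[-1][t][0])
    tags = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]  # follow back-pointers
        tags.append(tag)
    return tags[::-1]


def train(data, tagset, epochs=5):
    """Averaged perceptron: return the mean of the weights over all steps."""
    w = defaultdict(float)
    total = defaultdict(float)
    steps = 0
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tagset, w)
            if pred != gold:
                # Move weights toward the gold sequence, away from the prediction.
                for f, c in global_features(words, gold).items():
                    w[f] += c
                for f, c in global_features(words, pred).items():
                    w[f] -= c
            # Naive averaging: accumulate the current weights after every example.
            for f, v in w.items():
                total[f] += v
            steps += 1
    return {f: v / steps for f, v in total.items()}


# Toy romanized sentences with made-up coarse tags, for illustration only.
TAGSET = ["NOUN", "PRON", "VERB"]
toy_data = [
    (["ami", "bhat", "khai"], ["PRON", "NOUN", "VERB"]),
    (["tumi", "boi", "poro"], ["PRON", "NOUN", "VERB"]),
]
weights = train(toy_data, TAGSET)
for words, gold in toy_data:
    print(words, viterbi(words, TAGSET, weights))
```

Because the sentence score is a dot product between the weights and a sum of local features, decoding can be done exactly with dynamic programming. Averaging the weights over all training steps is the usual way to reduce the perceptron's sensitivity to the last few updates.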
