Evaluation of Punjabi Named Entity Recognition using Context Word Feature

Named Entity Recognition (NER) is the task of identifying and classifying named entities in a given text. This paper evaluates Named Entity Recognition for the Punjabi language using the context word feature. The words preceding and succeeding the target word are very helpful in determining its category. In this work, context word features with word windows of 7, 5 and 3 have been used. Experiments have been performed using different training and test sets. The evaluation uses a Named Entity tagset of 14 tags, namely PERSON, ORGANIZATION, LOCATION, FACILITY, EVENT, RELATIONSHIP, TIME, DATE, DESIGNATION, TITLE-PERSON, NUMBER, MEASURE, ABBREVIATION and ARTIFACT. Word windows 7 and 5 gave better results than word window 3. Although the F-scores and precision values of word window 7 are slightly higher than those of word window 5, the recall of word window 7 was found to be lower than that of word window 5.

General Terms: Natural Language Processing, Information Extraction, Named Entity Recognition
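To make the context word feature concrete, the following is a minimal Python sketch of how such a feature could be extracted for one target word. It rests on assumptions not stated in the text: that a word window of size w is centred on the target word, so that window 7 covers three words on each side, window 5 two, and window 3 one; and that positions falling outside the sentence are filled with a padding token. The function name `context_window_features`, the `PAD` token, and the feature key format are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical padding token for positions outside the sentence (an assumption).
PAD = "<PAD>"


def context_window_features(tokens: List[str], index: int, window: int) -> Dict[str, str]:
    """Return the words in a window centred on tokens[index].

    Assumes an odd window size w covering (w - 1) / 2 words on each side
    of the target word, e.g. window 7 = 3 preceding + target + 3 succeeding.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    if not 0 <= index < len(tokens):
        raise IndexError("index must point at a token in the sentence")

    half = window // 2
    features: Dict[str, str] = {}
    for offset in range(-half, half + 1):
        pos = index + offset
        # Use the padding token when the window runs past either sentence boundary.
        word = tokens[pos] if 0 <= pos < len(tokens) else PAD
        features[f"w[{offset:+d}]"] = word
    return features


if __name__ == "__main__":
    # Placeholder tokens; a real run would use a tokenized Punjabi sentence.
    sentence = ["t0", "t1", "t2", "t3", "t4", "t5", "t6"]
    for size in (3, 5, 7):
        print(size, context_window_features(sentence, 3, size))
```

Each resulting dictionary could be passed as one training instance to a sequence labeller or classifier together with the target word's tag. A larger window supplies more context per word but also more sparse feature values, which is one plausible reason precision and recall can move in different directions as the window grows.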
