Named entity recognition in Assamese: A hybrid approach

Most named entity recognition (NER) systems have been developed using one of two approaches, rule-based or machine learning (ML), each with its own strengths and weaknesses. In this paper, we propose a hybrid NER approach that combines the rule-based and ML approaches to improve overall system performance for a resource-poor language such as Assamese. The proposed hybrid approach recognizes four types of named entities (NEs): Person, Location, Organization, and Miscellaneous. The empirical results indicate that the hybrid approach outperforms both the rule-based approach and the ML approach when each is applied independently. The hybrid Assamese NER system obtains an F-measure of 85%-90%.
