Survey of Named Entity Recognition Systems with respect to Indian and Foreign Languages

Named Entity Recognition (NER) is a subtask of Information Extraction that identifies named entities and classifies them into classes such as person, location, and organization. NER can be used to preprocess text and convert it into a structured form that is useful for Information Retrieval, Machine Translation, Question Answering, and Text Summarization. This paper surveys NER research on various Indian and non-Indian languages. It reports the study and observations on the approaches, techniques, and features required to implement NER for various languages, especially Indian languages.

General Terms: NER (Named Entity Recognition), HMM (Hidden Markov Model), CRF (Conditional Random Fields), SVM (Support Vector Machine)
