Hybrid Named Entity Recognition - Application to Arabic Language

Most Named Entity Recognition (NER) systems follow either a rule-based approach or a machine learning approach. In this paper, we present our attempt at developing a hybrid NER system that combines the two in order to obtain the advantages of both and overcome their respective problems [1]. The system recognizes eight types of named entities: Location, Person, Organization, Date, Time, Price, Measurement and Percent. Experimental results on the ANERcorp dataset indicate that our hybrid approach outperforms both the rule-based approach and the machine learning approach when each is applied on its own. Moreover, our hybrid approach outperforms the state of the art in Arabic NER.
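The abstract does not say how the two components are combined, so the following is only a minimal sketch of one common arrangement, assuming the rule-based decision is passed to the learner as a feature. The gazetteer, the regular expression, the transliterated sample tokens and the frequency-count learner are all hypothetical stand-ins chosen to keep the example self-contained; they are not the authors' rules, data or classifier.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy resources; not taken from the paper.
LOCATIONS = {"Cairo", "Dubai"}
PERCENT_RE = re.compile(r"^\d+(\.\d+)?%$")


def rule_tag(token):
    """Rule-based component: gazetteer lookup plus one regular expression."""
    if token in LOCATIONS:
        return "Location"
    if PERCENT_RE.match(token):
        return "Percent"
    return "O"


def features(tokens, i):
    """Features for token i; the rule-based decision is itself a feature."""
    prev_tok = tokens[i - 1] if i > 0 else "<s>"
    return (rule_tag(tokens[i]), prev_tok)


def train(sentences):
    """Stand-in learner: count labels per feature tuple.

    A real system would use a proper classifier here; this is an assumption
    made only to keep the sketch dependency-free.
    """
    counts = defaultdict(Counter)
    for tokens, labels in sentences:
        for i, label in enumerate(labels):
            counts[features(tokens, i)][label] += 1
    return counts


def predict(model, tokens):
    """Label each token, falling back to the rule tag for unseen features."""
    out = []
    for i in range(len(tokens)):
        seen = model.get(features(tokens, i))
        out.append(seen.most_common(1)[0][0] if seen else rule_tag(tokens[i]))
    return out


# Toy training data (hypothetical, transliterated for readability).
model = train([(["Mr.", "Ahmed", "visited", "Cairo"],
                ["O", "Person", "O", "Location"])])
print(predict(model, ["Mr.", "Khaled", "visited", "Dubai"]))
```

In this toy setup the learned part labels "Khaled" as Person from its context, which the rules alone would miss, while the rule feature lets the model label the unseen location "Dubai". That is the kind of complementarity a hybrid design aims for.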

[1] Richard M. Schwartz et al. An Algorithm that Learns What's in a Name, 1999, Machine Learning.

[2] Walter Daelemans et al. A formal framework for evaluation of information extraction, 2004.

[3] Mona T. Diab et al. Arabic Named Entity Recognition: An SVM-based approach, 2008.

[4] Ali Mamat et al. Named Entity Recognition Using a New Fuzzy Support Vector Machine, 2008.

[5] Nizar Habash et al. On Arabic Transliteration, 2007.

[6] Khaled Shaalan et al. A Pipeline Arabic Named Entity Recognition using a Hybrid Approach, 2012, COLING.

[7] Yassine Benajiba et al. Arabic Named Entity Recognition using Optimized Feature Sets, 2008, EMNLP.

[8] Andrew McCallum et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[9] Nizar Habash et al. Improving NER in Arabic Using a Morphological Tagger, 2008, LREC.

[10] A. Mamat et al. A New Fuzzy Support Vector Machine Method for Named Entity Recognition, 2008, 2008 International Conference on Computer Science and Information Technology.

[11] Khaled Shaalan et al. A Survey of Arabic Named Entity Recognition and Classification, 2014, CL.

[12] Nizar Habash et al. Introduction to Arabic Natural Language Processing, 2010, Introduction to Arabic Natural Language Processing.

[13] Ralph Grishman et al. A Maximum Entropy Approach to Named Entity Recognition, 1999.

[14] Khaled Shaalan et al. A Novel Hybrid Approach to Arabic Named Entity Recognition, 2014.

[15] Vladimir N. Vapnik et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[16] Yassine Benajiba et al. ANERsys: An Arabic Named Entity Recognition System Based on Maximum Entropy, 2009, CICLing.

[17] Khaled Shaalan et al. NERA: Named Entity Recognition for Arabic, 2009, J. Assoc. Inf. Sci. Technol.

[18] Wei Li et al. Early results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons, 2003, CoNLL.

[19] Ibrahim A. Al-Kharashi et al. Arabic morphological analysis techniques: A comprehensive survey, 2004, J. Assoc. Inf. Sci. Technol.