Named Entity Recognition Using Hybrid Machine Learning Approach

This paper presents a hybrid machine learning method for named entity recognition (NER). A system built on this method achieves reasonable performance with minimal training data and gazetteers. The approach differs from previous machine learning-based systems in that it applies a maximum entropy model (MEM) and a hidden Markov model (HMM) in succession. We report the performance of the proposed NER system on the British National Corpus (BNC). In the recognition process, we first use the MEM to identify named entities in the corpus, assigning temporary tags as references. This MEM pass can be regarded as a training process for the HMM, which we then use for the final tagging. We show that, with enough training data and an appropriate error correction mechanism, this approach achieves higher precision and recall than a single statistical model. We conclude with experimental results that indicate the flexibility of our system across different domains.
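
The two-pass idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the first pass here is a hypothetical rule-based stand-in for the MEM (gazetteer lookup plus capitalization), the tag set is reduced to `O` and `ENT`, and the error correction mechanism is not modeled. The names `first_pass_tag`, `estimate_hmm`, and `viterbi` are all assumptions made for illustration. The sketch shows only the data flow: provisional tags from the first pass are treated as labels for estimating HMM parameters, and Viterbi decoding then produces the final tags.

```python
import math
from collections import Counter, defaultdict

STATES = ["O", "ENT"]  # simplified tag set, assumed for illustration


def first_pass_tag(tokens, gazetteer):
    # Hypothetical stand-in for the MEM pass (NOT a maximum entropy model):
    # a token is provisionally tagged ENT if it is in a small gazetteer,
    # or if it is capitalized and not sentence-initial.
    tags = []
    for i, tok in enumerate(tokens):
        is_ent = tok in gazetteer or (i > 0 and tok[:1].isupper())
        tags.append("ENT" if is_ent else "O")
    return tags


def estimate_hmm(sentences, provisional, alpha=1.0):
    # Count-based HMM estimates with add-alpha smoothing, using the
    # provisional tags as if they were labels. Sentences must be non-empty.
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    vocab = {w for s in sentences for w in s}
    for words, tags in zip(sentences, provisional):
        start[tags[0]] += 1
        for prev, cur in zip(tags, tags[1:]):
            trans[prev][cur] += 1
        for w, t in zip(words, tags):
            emit[t][w] += 1

    def smooth(counter, key, n_outcomes):
        total = sum(counter.values())
        return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))

    n_words = len(vocab) + 1  # one extra slot for unseen words
    return {
        "start": lambda s: smooth(start, s, len(STATES)),
        "trans": lambda p, s: smooth(trans[p], s, len(STATES)),
        "emit": lambda s, w: smooth(emit[s], w, n_words),
    }


def viterbi(words, hmm):
    # Standard Viterbi decoding in log space over STATES.
    score = {s: hmm["start"](s) + hmm["emit"](s, words[0]) for s in STATES}
    backpointers = []
    for w in words[1:]:
        new_score, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: score[p] + hmm["trans"](p, s))
            new_score[s] = score[best] + hmm["trans"](best, s) + hmm["emit"](s, w)
            ptr[s] = best
        score = new_score
        backpointers.append(ptr)
    path = [max(STATES, key=lambda s: score[s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy data, invented for this sketch (not drawn from the BNC).
sentences = [
    "Yesterday Alice flew to London".split(),
    "the meeting in Paris was short".split(),
]
gazetteer = {"London"}

provisional = [first_pass_tag(s, gazetteer) for s in sentences]  # pass 1
hmm = estimate_hmm(sentences, provisional)                       # train on pass-1 tags
final = [viterbi(s, hmm) for s in sentences]                     # pass 2
```

In a fuller system the first pass would be a trained maximum entropy classifier over contextual features, and the HMM would use the complete entity tag set; the sketch keeps both trivial so that the hand-off between the two models stays visible.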
