Chinese Named Entity Recognition with the Improved Smoothed Conditional Random Fields

Conditional Random Fields (CRFs) are state-of-the-art sequence classifiers and have recently been widely used for natural language processing tasks that can be viewed as sequence labeling problems, such as POS tagging and named entity recognition (NER). However, CRFs are prone to overfitting as the number of features grows. For the NER task the feature set is very large, especially for Chinese, because of the language's complex characteristics. Existing approaches to avoiding overfitting include regularization and feature selection. The main shortcoming of these approaches is that they ignore the so-called unsupported features, that is, features that appear in the test set but have zero count in the training set. Without information about these features, the generalization of CRFs suffers. This paper describes a model called the Improved Smoothed CRF, which captures information about unsupported features by means of smoothing features. It provides an effective and practical way to improve the generalization performance of CRFs. Experiments on Chinese NER demonstrate the effectiveness of the method.
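To make the notion of unsupported features concrete, the following minimal sketch counts features on a toy training set and a toy test set and lists the test features whose training count is zero. The feature templates (`char=...`, `prev=...`), the helper names, and the example sentences are all assumptions made for illustration; they are not taken from the paper, and the sketch does not implement the Improved Smoothed CRF itself.

```python
from collections import Counter

def extract_features(chars, labels):
    """Yield hypothetical (observation, label) feature strings for one sequence."""
    for i, (ch, lab) in enumerate(zip(chars, labels)):
        # Unigram template: current character paired with its label.
        yield f"char={ch}|label={lab}"
        # Bigram template: previous and current character paired with the label.
        prev = chars[i - 1] if i > 0 else "<BOS>"
        yield f"prev={prev},char={ch}|label={lab}"

def count_features(dataset):
    """Count how often each feature string fires over a labeled dataset."""
    counts = Counter()
    for chars, labels in dataset:
        counts.update(extract_features(chars, labels))
    return counts

# Toy data (illustrative only): character sequences with BIO-style labels.
train = [("北京大学", ["B-ORG", "I-ORG", "I-ORG", "I-ORG"])]
test = [
    ("北京市", ["B-LOC", "I-LOC", "I-LOC"]),
    ("北大", ["B-ORG", "I-ORG"]),
]

train_counts = count_features(train)
test_counts = count_features(test)

# Unsupported features, in the sense used above: they fire on the
# test set but have zero count in the training set.
unsupported = sorted(f for f in test_counts if train_counts[f] == 0)

for feature in unsupported:
    print(feature)
```

A model trained only on features observed in `train` has no weight for any feature in `unsupported`, which is the gap that the smoothing features described above are meant to address.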
