A CRF-Based Stacking Model with Meta-features for Named Entity Recognition

Named Entity Recognition (NER) is a challenging task in Natural Language Processing. Recently, machine-learning-based methods have been widely used for NER and outperform traditional methods based on handcrafted rules. Stacking, which combines a set of classifiers into a single classifier, is an alternative approach that has not been well explored for NER. In this paper, we propose a stacking model for NER that extends the original stacking model in both its model and its features. We use Conditional Random Fields (CRF) as the level-1 classifier, and we apply meta-features that capture global and local aspects of the level-0 classifiers and tokens. In our experiments, the model achieves state-of-the-art performance on the CoNLL 2003 shared task.
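To make the stacking idea concrete, the sketch below shows one way the outputs of several level-0 taggers could be turned into per-token feature dictionaries for a level-1 sequence model such as a CRF. The abstract does not list the actual meta-features, so everything here is an illustrative assumption: the function name `level1_features`, the tagger names, and the three meta-features (`majority`, `agreement`, `is_capitalized`) are hypothetical stand-ins, not the features used in the paper.

```python
from collections import Counter
from typing import Dict, List


def level1_features(
    tokens: List[str],
    level0_preds: Dict[str, List[str]],
) -> List[Dict[str, object]]:
    """Build one feature dict per token from level-0 tagger predictions.

    `level0_preds` maps a (hypothetical) tagger name to its predicted
    label sequence, which must have the same length as `tokens`.
    """
    features = []
    for i, token in enumerate(tokens):
        # Label that each level-0 tagger assigned to this token.
        labels = {name: preds[i] for name, preds in level0_preds.items()}
        votes = Counter(labels.values())
        majority, count = votes.most_common(1)[0]

        # Raw level-0 predictions, one feature per tagger.
        feats: Dict[str, object] = {
            f"pred[{name}]": label for name, label in labels.items()
        }
        # Assumed meta-features: how strongly the taggers agree,
        # plus one simple property of the token itself.
        feats["majority"] = majority
        feats["agreement"] = count / len(labels)
        feats["is_capitalized"] = token[:1].isupper()
        features.append(feats)
    return features


tokens = ["John", "lives", "in", "Paris"]
level0_preds = {
    "tagger_a": ["B-PER", "O", "O", "B-LOC"],
    "tagger_b": ["B-PER", "O", "O", "B-ORG"],
    "tagger_c": ["B-PER", "O", "O", "B-LOC"],
}
sentence_features = level1_features(tokens, level0_preds)
```

In a full pipeline, one such list of dictionaries per sentence, together with the gold label sequence, would be handed to a CRF trainer of your choice. To avoid leaking training labels into the level-1 model, the level-0 predictions are usually produced by cross-validation on the training set.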
