Multilingual Named Entity Recognition using Hybrid Neural Networks

Named entity recognition (NER) is an important subtask of information extraction. Most high-performing NER systems model the task as a sequence labelling or structured prediction problem (Lafferty et al., 2001; Finkel et al., 2005; Ratinov and Roth, 2009). In recent years, neural network based models have achieved remarkable success in a wide range of natural language processing tasks, outperforming classical models in terms of accuracy (Collobert et al., 2011; Graves, 2012). The aim of this study is to systematically evaluate the performance of different neural networks for NER in various languages. We also investigate the impact of additional features and configurations. Three baseline models (a feedforward network, a standard Bi-LSTM and a window-based Bi-LSTM) are extensively tested with different feature and hyper-parameter settings on three data sets. The experimental results indicate that the neural network based models are generally robust and achieve reasonable accuracy across different languages. In addition, the window-based Bi-LSTM is more robust than the standard Bi-LSTM when less information is available for the task. The effectiveness of the features depends on both the architecture of the model and the data set, with one exception: all the models benefit greatly from pre-trained word embeddings and the Conditional Random Fields (CRF) based interface. Overall, our best performing models are competitive with state-of-the-art NER systems.
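The abstract does not spell out how the window-based models receive their input. As a rough illustration only, the sketch below assumes that "window-based" means each token is tagged from a fixed-size context window centred on it, with padding at the sentence boundaries. The function name `context_windows`, the `half_width` parameter and the `<PAD>` symbol are hypothetical and are not taken from the paper.

```python
from typing import List

PAD = "<PAD>"  # hypothetical padding symbol, not from the paper


def context_windows(tokens: List[str], half_width: int = 2) -> List[List[str]]:
    """Return one fixed-size context window per token.

    Each window holds the token itself plus `half_width` neighbours on
    either side; positions beyond the sentence edges are filled with PAD.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    padded = [PAD] * half_width + list(tokens) + [PAD] * half_width
    size = 2 * half_width + 1
    # Window i starts at index i in the padded sequence, so the original
    # token i sits exactly in the middle of its window.
    return [padded[i:i + size] for i in range(len(tokens))]


if __name__ == "__main__":
    for window in context_windows(["Anna", "lives", "in", "Uppsala"], half_width=1):
        print(window)
```

Under this assumption, a tagger sees only a bounded amount of context for each decision, whereas a standard Bi-LSTM reads the whole sentence. How the study actually defines its windows may differ from this sketch.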

[1] Christian Biemann, et al. GermEval 2014 Named Entity Recognition Shared Task, 2014.

[2] Eric Nichols, et al. Named Entity Recognition with Bidirectional LSTM-CNNs, 2015, TACL.

[3] Yassine Benajiba, et al. ANERsys: An Arabic Named Entity Recognition System Based on Maximum Entropy, 2009, CICLing.

[4] Wei Xu, et al. Bidirectional LSTM-CRF Models for Sequence Tagging, 2015, ArXiv.

[5] Christian Biemann, et al. NoSta-D Named Entity Annotation for German: Guidelines and Dataset, 2014, LREC.

[6] Christopher D. Manning, et al. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling, 2005, ACL.

[7] Dan Roth, et al. Design Challenges and Misconceptions in Named Entity Recognition, 2009, CoNLL.

[8] Iryna Gurevych, et al. GermEval-2014: Nested Named Entity Recognition with Neural Networks, 2014.

[9] Erik F. Tjong Kim Sang, et al. Introduction to the CoNLL-2003 shared task, 2003.

[10] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, J. Mach. Learn. Res.

[11] Tong Zhang, et al. Named Entity Recognition through Classifier Combination, 2003, CoNLL.

[12] D. Signorini, et al. Neural networks, 1995, The Lancet.

[13] German Rigau, et al. Robust multilingual Named Entity Recognition with shallow semi-supervised features, 2016, Artif. Intell.

[14] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[15] Christian Hänig, et al. Modular Classifier Ensemble Architecture for Named Entity Recognition on Low Resource Systems, 2014.

[16] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.

[17] Yassine Benajiba, et al. Arabic Named Entity Recognition using Conditional Random Fields, 2008.