Entity Candidate Network for Whole-Aware Named Entity Recognition

Named Entity Recognition (NER) is a crucial upstream task in Natural Language Processing (NLP). Traditional tag scheme approaches offer a single recognition that does not meet the needs of many downstream tasks such as coreference resolution. Meanwhile, Tag scheme approaches ignore the continuity of entities. Inspired by one-stage object detection models in computer vision (CV), this paper proposes a new no-tag scheme, the Whole-Aware Detection, which makes NER an object detection task. Meanwhile, this paper presents a novel model, Entity Candidate Network (ECNet), and a specific convolution network, Adaptive Context Convolution Network (ACCN), to fuse multi-scale contexts and encode entity information at each position. ECNet identifies the full span of a named entity and its type at each position based on Entity Loss. Furthermore, ECNet is regulable between the highest precision and the highest recall, while the tag scheme approaches are not. Experimental results on the CoNLL 2003 English dataset and the WNUT 2017 dataset show that ECNet outperforms other previous state-of-the-art methods.

[1]  Fabio A. González,et al.  Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media , 2018, NAACL.

[2]  Michael I. Jordan,et al.  Advances in Neural Information Processing Systems 30 , 1995 .

[3]  Mark Cieliebak,et al.  Transfer Learning and Sentence Level Features for Named Entity Recognition on Tweets , 2017, NUT@EMNLP.

[4]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[5]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[6]  Nitish Srivastava,et al.  Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[7]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Luke S. Zettlemoyer,et al.  AllenNLP: A Deep Semantic Natural Language Processing Platform , 2018, ArXiv.

[9]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[10]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[11]  Guillaume Lample,et al.  Neural Architectures for Named Entity Recognition , 2016, NAACL.

[12]  Yann Dauphin,et al.  Language Modeling with Gated Convolutional Networks , 2016, ICML.

[13]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[14]  Eric Nichols,et al.  Named Entity Recognition with Bidirectional LSTM-CNNs , 2015, TACL.

[15]  Kai Zhao,et al.  Res2Net: A New Multi-Scale Backbone Architecture , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[18]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[19]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[20]  Jian Yang,et al.  Selective Kernel Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Hui Chen,et al.  GRN: Gated Relation Network to Enhance Convolutional Neural Network for Named Entity Recognition , 2019, AAAI.

[22]  Sepp Hochreiter,et al.  Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.

[23]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Leon Derczynski,et al.  Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition , 2017, NUT@EMNLP.

[25]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[26]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[27]  Philip S. Yu,et al.  Multi-Grained Entity Proposal Network for Named Entity Recognition , 2018 .

[28]  Geoffrey E. Hinton,et al.  Layer Normalization , 2016, ArXiv.

[29]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[30]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[31]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Yaojie Lu,et al.  Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks , 2019, ACL.