A segment enhanced span-based model for nested named entity recognition

Abstract: Named entity recognition (NER) is a fundamental problem in natural language processing. Nested entities, in particular, are common in real-life textual data. However, current span-based methods for nested NER are computationally expensive, lack explicit boundary supervision, and generate many negative samples for span classification, all of which hurt their overall performance. In this paper, we propose a Segment Enhanced Span-based model for nested NER (SESNER). The proposed model treats nested NER as a segment covering problem. First, it models entities as segments, detects the segment endpoints, and identifies the positional relationship between neighboring endpoints. Then, it detects the outermost segments and generates the candidate entity spans nested within each of them for span classification. The model has three advantages: it strengthens boundary supervision when learning span representations by detecting segment endpoints, it reduces the number of negative samples without discarding the long entities that most span-based methods ignore, and it improves runtime performance. We also propose a novel augmented training mechanism that further improves performance by extending the training dataset with examples the model previously predicted incorrectly. Experimental results show that SESNER achieves promising performance with near-linear time complexity on the benchmark datasets.
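The candidate-generation step described in the abstract can be illustrated with a small sketch. It is not the SESNER implementation: the abstract gives no such details, so the function names, the inclusive `(start, end)` token-index convention, and the toy inputs are all assumptions. The sketch takes endpoints and outermost segments as already detected and only shows how restricting candidates to detected endpoints inside an outermost segment shrinks the candidate set compared with exhaustive span enumeration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end), inclusive token indices (assumed convention)


def candidate_spans(starts: Set[int], ends: Set[int],
                    outermost: List[Span]) -> List[Span]:
    """Enumerate spans that begin at a detected start, finish at a detected
    end, and lie entirely inside one outermost segment (hypothetical helper)."""
    candidates = []
    for seg_start, seg_end in outermost:
        # Keep only the endpoints that fall inside this outermost segment.
        seg_starts = sorted(s for s in starts if seg_start <= s <= seg_end)
        seg_ends = sorted(e for e in ends if seg_start <= e <= seg_end)
        for s in seg_starts:
            for e in seg_ends:
                if s <= e:
                    candidates.append((s, e))
    return candidates


def exhaustive_span_count(n_tokens: int) -> int:
    """Number of spans an exhaustive enumerator would consider."""
    return n_tokens * (n_tokens + 1) // 2


# Toy sentence (assumed): "The president of the United States visited Paris"
# Entities: tokens 0-5, tokens 4-5 (nested), token 7.
starts = {0, 4, 7}
ends = {5, 7}
outermost = [(0, 5), (7, 7)]

spans = candidate_spans(starts, ends, outermost)
print(spans, "vs", exhaustive_span_count(8), "exhaustive spans")
```

In this toy case only three candidates reach the span classifier, while exhaustive enumeration over eight tokens would produce 36. The long outer entity is still kept, because no maximum span length is imposed.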
