Rethinking Boundaries: End-To-End Recognition of Discontinuous Mentions with Pointer Networks

Most research on irregular (e.g., nested or discontinuous) named entity recognition (NER) has focused on nested entities, while discontinuous entities have received limited attention. Existing work on discontinuous NER, however, either suffers from decoding ambiguity or makes predictions using only token-level local features. In this work, we present an innovative model for discontinuous NER based on pointer networks, where at each decoding frame the pointer simultaneously decides whether the current token is part of an entity mention and where the next constituent token is. Our model has three major merits compared with previous work: (1) The pointer mechanism is memory-augmented, which enhances mention boundary detection and the interaction between the current decision and previously recognized mentions. (2) The encoder-decoder architecture can linearize the complexity of structure prediction, and thus reduce search costs. (3) The model makes every decision using global information, i.e., by consulting the entire input, the encoder output and the previous decoder outputs. Experimental results on the CADEC and ShARe13 datasets show that our model outperforms flat and hypergraph models as well as a state-of-the-art transition-based model for discontinuous NER. Further in-depth analysis demonstrates that our model performs well in recognizing various kinds of entities, including flat, overlapping and discontinuous ones. More crucially, our model is effective at boundary detection, which is central to NER.
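To make the pointer idea concrete, the following is a minimal sketch of a generic pointer step and of how a chain of pointers could assemble a discontinuous mention. It is not the model described above: the dot-product scoring, the function names `pointer_step` and `follow_pointers`, and the use of position 0 as a stop sentinel are all assumptions made for illustration, and the memory-augmented mechanism is not modelled.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_step(decoder_state, encoder_states):
    """Return a probability distribution over input positions.

    Assumption: plain dot-product attention between one decoder state
    and every encoder state; the actual scoring function may differ.
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    return softmax(scores)


def follow_pointers(start, next_pointer, end=0):
    """Collect the token indices of one mention by following pointers.

    next_pointer maps a token index to the index it points to.
    `end` is a hypothetical sentinel position meaning "mention finished".
    The `seen` set guards against cycles in malformed pointer chains.
    """
    mention, seen, i = [], set(), start
    while i != end and i not in seen:
        mention.append(i)
        seen.add(i)
        i = next_pointer[i]
    return mention


# Illustrative input: position 0 is the assumed sentinel.
tokens = ["<end>", "muscle", "pain", "and", "fatigue"]
# A pointer chain that skips "pain and": muscle -> fatigue -> <end>.
chain = {1: 4, 4: 0}
indices = follow_pointers(1, chain)
mention_text = " ".join(tokens[i] for i in indices)
```

Here the chain yields the non-adjacent tokens at positions 1 and 4, which is the kind of discontinuous mention a flat tagger cannot express directly. In a trained model each entry of `chain` would come from taking the most probable position returned by something like `pointer_step`.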
