Incorporating Boundary and Category Feature for Nested Named Entity Recognition

In natural language processing (NLP), entities are frequently nested inside other entities. Most existing named entity recognition (NER) models focus on flat entities and ignore nested ones. In this paper, we propose a neural model for nested named entity recognition. Our model employs a multi-label boundary detection module to detect entity boundaries, which avoids the boundary detection conflicts that arise in the boundary-aware model. In addition, our model combines a boundary detection module with a category detection module so that entity boundaries and entity categories are detected simultaneously, which avoids the error propagation problem of current pipeline models. Furthermore, we train the two modules with multitask learning to capture the underlying association between entity boundary information and entity category information. As a result, our model achieves better entity extraction performance. In evaluations on two nested NER datasets and one flat NER dataset, our model outperforms previous state-of-the-art models on both nested and flat NER.
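The abstract does not give implementation details, so the following is only a minimal sketch of one plausible reading of the two ideas it names: multi-label boundary targets and a jointly trained loss. Every name here (`boundary_targets`, `binary_cross_entropy`, `joint_loss`, the weight `alpha`) and the toy sentence with its spans are assumptions made for illustration, not the authors' code or data. The sketch shows how giving each token an independent "start" and "end" label lets one token be a boundary of several nested entities at once, and how a boundary loss and a category loss could be combined into a single training objective.

```python
import math


def boundary_targets(num_tokens, spans):
    """Build per-token [is_start, is_end] targets (hypothetical encoding).

    spans: iterable of (start, end, category) with inclusive token indices.
    The two labels are independent, so a token can be both a start and an end.
    """
    targets = [[0, 0] for _ in range(num_tokens)]
    for start, end, _category in spans:
        targets[start][0] = 1  # token opens at least one entity
        targets[end][1] = 1    # token closes at least one entity
    return targets


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over all (token, label) pairs."""
    total, count = 0.0, 0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def joint_loss(boundary_loss, category_loss, alpha=0.5):
    """Weighted sum of the two task losses; alpha is an assumed hyperparameter."""
    return alpha * boundary_loss + (1.0 - alpha) * category_loss


# Toy example (not taken from any dataset): two nested spans sharing a start.
tokens = ["IL-2", "gene", "promoter"]
spans = [(0, 0, "protein"), (0, 1, "DNA")]
targets = boundary_targets(len(tokens), spans)
# Token 0 is the start of both spans and the end of the inner one.
print(targets)
```

In a single-label scheme, token 0 would have to pick between "start" and "end"; with independent labels it carries both. In a real model the two losses would be computed from network outputs and backpropagated together, so that shared layers receive gradients from both tasks.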
