Effective Named Entity Recognition with Boundary-aware Bidirectional Neural Networks

Named Entity Recognition (NER) is a fundamental problem in Natural Language Processing and has received much research attention. Although current neural NER approaches achieve state-of-the-art performance, their architectures still suffer from one or more of the following three problems: (1) boundary tag sparsity, (2) lack of global decoding information, and (3) boundary error propagation. In this paper, we propose a novel Boundary-aware Bidirectional Neural Networks (Ba-BNN) model to tackle these problems. To address boundary tag sparsity, the Ba-BNN model is built on the structure of pointer networks. To supply global decoding information, a boundary-aware binary classifier produces output that is fed to the decoders as input. The model uses two decoders that process the information in opposite directions (left-to-right and right-to-left); the final hidden states of the left-to-right decoder are obtained by incorporating the hidden states of the right-to-left decoder during decoding. In addition, we propose a boundary retraining strategy to reduce the boundary error propagation that pointer networks introduce in boundary detection and entity classification. We conducted extensive experiments on three NER benchmark datasets. The results show that the proposed Ba-BNN model outperforms current state-of-the-art models.
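
To make the pointer-network idea concrete, the following is a minimal sketch of the generic additive pointer attention step, in which a decoder state is scored against every encoder position and the highest-scoring position is taken as a segment boundary. It is not the Ba-BNN implementation: the function name `pointer_distribution`, the weight matrices `W1` and `W2`, the vector `v`, the dimensions, and the random values are all assumptions chosen for illustration.

```python
import math
import random


def pointer_distribution(encoder_states, decoder_state, W1, W2, v):
    """Return a probability distribution over encoder positions.

    Scores follow the generic pointer-network form
    u_j = v . tanh(W1 h_j + W2 d), normalised with a softmax.
    All parameters here are hypothetical placeholders.
    """
    def matvec(matrix, vector):
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    d_proj = matvec(W2, decoder_state)
    scores = []
    for h in encoder_states:
        h_proj = matvec(W1, h)
        scores.append(
            sum(vk * math.tanh(a + b) for vk, a, b in zip(v, h_proj, d_proj))
        )

    # Numerically stable softmax over the token positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy setup: 6 tokens with 4-dimensional hidden states (random, illustrative).
random.seed(0)
n_tokens, hidden = 6, 4
rand_vec = lambda n: [random.uniform(-1.0, 1.0) for _ in range(n)]
H = [rand_vec(hidden) for _ in range(n_tokens)]
d = rand_vec(hidden)
W1 = [rand_vec(hidden) for _ in range(hidden)]
W2 = [rand_vec(hidden) for _ in range(hidden)]
v = rand_vec(hidden)

probs = pointer_distribution(H, d, W1, W2, v)
boundary_index = max(range(n_tokens), key=lambda j: probs[j])
```

Because the decoder points at one position per segment rather than tagging every token, each training example contributes a boundary decision per segment, which is one way to read the abstract's claim about boundary tag sparsity. A right-to-left decoder could reuse the same function on the reversed token sequence.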
