Fast Neural Chinese Named Entity Recognition with Multi-head Self-attention

Named entity recognition (NER) is an important task in natural language processing. It is an essential step for many downstream tasks, such as relation extraction and entity linking, which in turn are important for building and applying knowledge graphs. Existing neural NER methods are usually based on the LSTM-CRF framework and its variants. However, because LSTM networks are computationally expensive, the efficiency of these LSTM-CRF based NER methods is usually unsatisfactory. In this paper, we propose a fast neural NER model for Chinese texts. Our approach is based on a CNN-SelfAttention-CRF architecture: a convolutional neural network (CNN) learns contextual character representations from local contexts, a multi-head self-attention network learns contextual character representations from global contexts, and a conditional random field (CRF) jointly decodes the labels of the characters in a sentence. Since both the CNN and the self-attention network can be computed in parallel, our approach can be more efficient than LSTM-CRF based methods. Extensive experiments on two benchmark datasets show that our approach is more efficient than existing neural NER methods and achieves comparable or even better performance on Chinese NER.
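The three stages named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the paper's implementation: the weights are random and untrained, and the layer sizes, the window width of 3, the ReLU activation, the way the stages are chained, and all function names (`conv1d`, `multi_head_self_attention`, `viterbi`, `tag_sentence`) are assumptions chosen for illustration. It shows only how a convolution captures local context, how self-attention lets every character attend to every other character, and how a CRF-style Viterbi pass decodes the labels jointly.

```python
import math
import random


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Random (untrained) weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def conv1d(x, kernel, width=3):
    """Local context: each position sees a zero-padded window of `width` characters."""
    dim = len(x[0])
    pad = [[0.0] * dim] * (width // 2)
    padded = pad + x + pad
    # Concatenate the vectors in each window into one flat input row.
    windows = [sum(padded[i:i + width], []) for i in range(len(x))]
    return [[max(0.0, v) for v in row] for row in matmul(windows, kernel)]  # ReLU


def multi_head_self_attention(h, heads, rng):
    """Global context: every position attends to every position, once per head."""
    dim = len(h[0])
    d_k = dim // heads
    outputs = [[] for _ in h]
    for _ in range(heads):
        q, k, v = (matmul(h, rand_matrix(dim, d_k, rng)) for _ in range(3))
        scores = matmul(q, [list(col) for col in zip(*k)])  # q times k transposed
        weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
        for out, head_row in zip(outputs, matmul(weights, v)):
            out.extend(head_row)  # concatenate the heads
    return outputs


def viterbi(emissions, transitions):
    """CRF-style joint decoding: best label path under emission + transition scores."""
    n_labels = len(transitions)
    score = list(emissions[0])
    back = []
    for emit in emissions[1:]:
        prev = [max(range(n_labels), key=lambda p: score[p] + transitions[p][c])
                for c in range(n_labels)]
        score = [score[prev[c]] + transitions[prev[c]][c] + emit[c]
                 for c in range(n_labels)]
        back.append(prev)
    path = [max(range(n_labels), key=lambda c: score[c])]
    for prev in reversed(back):
        path.append(prev[path[-1]])
    return path[::-1]


def tag_sentence(char_ids, n_labels=3, dim=8, heads=2, seed=0):
    """Chain the stages: embeddings -> CNN -> self-attention -> label scores -> Viterbi."""
    rng = random.Random(seed)
    embed = rand_matrix(max(char_ids) + 1, dim, rng)
    x = [embed[i] for i in char_ids]
    local = conv1d(x, rand_matrix(3 * dim, dim, rng))
    global_ctx = multi_head_self_attention(local, heads, rng)
    emissions = matmul(global_ctx, rand_matrix(dim, n_labels, rng))
    transitions = rand_matrix(n_labels, n_labels, rng)
    return viterbi(emissions, transitions)


if __name__ == "__main__":
    # Hypothetical character ids for a five-character sentence.
    print(tag_sentence([3, 1, 4, 1, 5]))
```

In this sketch, the convolution and the attention steps compute each position's output without depending on the outputs of earlier positions, which is the property the abstract credits for the speed advantage over LSTM-based models; only the final Viterbi pass runs left to right.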
