BiGCNN: Bidirectional Gated Convolutional Neural Network for Chinese Named Entity Recognition

Recent advances in Chinese named entity recognition (NER) are mostly based on recurrent neural networks (RNNs). Because RNNs are hard to parallelize, some works apply convolutional neural networks (CNNs) to NER instead. However, existing CNN-based models do not explicitly distinguish the preceding context from the subsequent context, so they struggle with cases that are sensitive to where the context lies. Moreover, they pay equal attention to everything within a convolution kernel, although not all of that information is useful for semantic understanding. In this paper, we propose a novel CNN-based model, the Bidirectional Gated Convolutional Neural Network (BiGCNN), which differentiates entity-related information in the preceding and subsequent contexts and adaptively filters the convolution information. By incorporating automatic segmentation and glyph information, BiGCNN outperforms state-of-the-art models on four Chinese NER datasets. Additionally, benefiting from parallel processing, the proposed method trains and tests more efficiently, e.g., 12.04 times faster than RNN-based models, while achieving better performance.
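The abstract does not give the layer equations, so the following is only a minimal sketch of the general idea, assuming a GLU-style sigmoid gate and two one-sided convolutions, one reading the preceding context and one reading the subsequent context. All names (`directional_gated_conv`, `bigated_conv`, the kernel layout, the single output channel per direction) are hypothetical and chosen for illustration; they are not the model's actual formulation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def directional_gated_conv(seq, w_feat, w_gate, direction):
    """One-sided gated convolution with a single output channel.

    seq:       list of T character vectors, each a list of d floats
    w_feat:    feature kernel, shape [k][d]
    w_gate:    gate kernel, shape [k][d]
    direction: "left" reads positions t-k+1..t (preceding context),
               "right" reads positions t..t+k-1 (subsequent context).
    Positions outside the sequence are treated as zero padding.
    """
    k, d, T = len(w_feat), len(seq[0]), len(seq)
    out = []
    for t in range(T):
        feat, gate = 0.0, 0.0
        for j in range(k):
            pos = t - j if direction == "left" else t + j
            if 0 <= pos < T:
                for c in range(d):
                    feat += w_feat[j][c] * seq[pos][c]
                    gate += w_gate[j][c] * seq[pos][c]
        # Assumed GLU-style gating: the sigmoid gate scales the feature,
        # so uninformative context can be suppressed adaptively.
        out.append(feat * sigmoid(gate))
    return out


def bigated_conv(seq, left_params, right_params):
    """Keep the two directions as separate channels per position."""
    left = directional_gated_conv(seq, left_params[0], left_params[1], "left")
    right = directional_gated_conv(seq, right_params[0], right_params[1], "right")
    return [[l, r] for l, r in zip(left, right)]


if __name__ == "__main__":
    # Toy input: 4 characters with 2-dimensional embeddings, kernel width 2.
    seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
    w = [[0.1, 0.2], [0.3, -0.1]]
    g = [[0.0, 0.0], [0.0, 0.0]]
    for position, channels in enumerate(bigated_conv(seq, (w, g), (w, g))):
        print(position, channels)
```

Every position is computed independently of the others, which is the property that lets a convolutional tagger run in parallel across the sequence, unlike an RNN that must step through it in order.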
