Porous Lattice Transformer Encoder for Chinese NER

Incorporating lexicons into character-level Chinese NER via lattices has proven effective for exploiting rich word-boundary information. Previous work extended RNNs to consume lattice inputs and achieved great success. However, owing to the DAG structure and the inherently unidirectional, sequential nature of RNNs, these methods preclude batched computation and sufficient semantic interaction. In this paper, we propose PLTE, an extension of the Transformer encoder tailored for Chinese NER, which models all characters and matched lexical words in parallel with batch processing. PLTE augments self-attention with positional relation representations to incorporate the lattice structure, and introduces a porous mechanism that strengthens localness modeling while retaining the capacity to capture rich long-term dependencies. Experimental results show that PLTE runs up to 11.4 times faster than state-of-the-art methods while achieving better performance. We also demonstrate that using BERT representations further boosts performance substantially and brings out the best in PLTE.
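The core idea, attention logits biased by pairwise lattice relations plus a "porous" mask that keeps attention local except through a shared pivot node, can be sketched in minimal NumPy form. This is an illustrative simplification, not the paper's implementation: relation representations are reduced to scalar biases, multi-head projections are omitted, and the function name, `window` parameter, and pivot convention are assumptions for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def porous_relation_attention(X, rel_ids, R, window=2, pivot=0):
    """Self-attention with additive relation biases and a porous mask.

    X:       (n, d) node representations (characters + matched words).
    rel_ids: (n, n) integer matrix giving the relation type between each
             pair of lattice nodes (e.g. contains / intersects / apart).
    R:       (num_relations,) learned scalar bias per relation type
             (a stand-in for the paper's relation representations).
    window:  nodes attend only to neighbours within this distance,
             plus a shared pivot node that stays globally visible.
    """
    n, d = X.shape
    # Scaled dot-product scores, shifted by the lattice-relation bias.
    logits = X @ X.T / np.sqrt(d) + R[rel_ids]
    # Porous mask: a local band plus one globally connected pivot node,
    # so distant nodes still interact indirectly through the pivot.
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) <= window
    mask[:, pivot] = True
    mask[pivot, :] = True
    logits = np.where(mask, logits, -1e9)
    return softmax(logits) @ X
```

Because the mask is applied to the logits rather than the graph itself, all nodes are still processed in one batched matrix product, which is the source of the speedup over sequential lattice RNNs.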
