FLAT: Chinese NER Using Flat-Lattice Transformer

The character-word lattice structure has recently proven effective for Chinese named entity recognition (NER) because it incorporates word information. However, since the lattice structure is complex and dynamic, most existing lattice-based models cannot fully exploit the parallel computation of GPUs and usually suffer from low inference speed. In this paper, we propose FLAT: Flat-LAttice Transformer for Chinese NER, which converts the lattice structure into a flat structure consisting of spans. Each span corresponds to a character or latent word together with its position in the original lattice. With the power of the Transformer and a well-designed position encoding, FLAT can fully leverage the lattice information and has excellent parallelization ability. Experiments on four datasets show that FLAT outperforms other lexicon-based models in both performance and efficiency.
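To make the span construction concrete, the sketch below shows how a character sequence and a word lexicon could be flattened into spans carrying head and tail positions, with characters spanning a single position and matched words spanning several. The function name flatten_lattice, the brute-force substring lookup, and the example lexicon are illustrative assumptions for this sketch, not the authors' implementation.

```python
# Minimal sketch of the flat-lattice conversion: every character becomes a
# span with head == tail, and every lexicon word found in the sentence
# becomes a span covering its head..tail character positions.
# (flatten_lattice and the toy lexicon below are hypothetical.)

def flatten_lattice(sentence, lexicon):
    """Return (token, head, tail) spans for characters and matched words."""
    spans = [(ch, i, i) for i, ch in enumerate(sentence)]  # character spans
    n = len(sentence)
    for head in range(n):
        for tail in range(head + 1, n):  # candidate words of length >= 2
            word = sentence[head:tail + 1]
            if word in lexicon:
                spans.append((word, head, tail))
    return spans

if __name__ == "__main__":
    lexicon = {"重庆", "人和药店", "药店"}
    print(flatten_lattice("重庆人和药店", lexicon))
    # [('重', 0, 0), ('庆', 1, 1), ('人', 2, 2), ('和', 3, 3), ('药', 4, 4),
    #  ('店', 5, 5), ('重庆', 0, 1), ('人和药店', 2, 5), ('药店', 4, 5)]
```

In the model itself, the resulting flat sequence of spans is fed to a Transformer encoder, with the position encoding derived from the head and tail indices so that the original lattice relations are preserved.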
