Joint Self-Attention and Multi-Embeddings for Chinese Named Entity Recognition

Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP), but it remains especially challenging in Chinese due to the particularity and complexity of the Chinese language. Traditional Chinese Named Entity Recognition (Chinese NER) methods require cumbersome feature engineering and domain-specific knowledge to achieve high performance. In this paper, we propose a simple yet effective neural network framework for Chinese NER, named A-NER. A-NER is the first Bidirectional Gated Recurrent Unit - Conditional Random Field (BiGRU-CRF) model that combines a self-attention mechanism with multi-embedding technology. It extracts richer linguistic information about characters at different granularities (e.g., radical, character, word) and captures correlations between characters in the sequence. Moreover, A-NER does not rely on any external resources or hand-crafted features. Experimental results show that our model outperforms (or approaches) existing state-of-the-art methods on datasets from different domains.
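To make the described pipeline concrete, the following is a minimal PyTorch sketch of the architecture as the abstract outlines it: radical, character, and word embeddings are concatenated, fed through a BiGRU, refined by self-attention, and projected to per-tag emission scores. The class name `ANERSketch`, the embedding and hidden dimensions, the number of attention heads, and the omission of the CRF decoding layer are all illustrative assumptions, not the authors' exact configuration.

```python
# A minimal sketch of the A-NER-style pipeline (assumed sizes and layer names).
import torch
import torch.nn as nn

class ANERSketch(nn.Module):
    """Multi-embeddings -> BiGRU -> self-attention -> per-tag emission scores.

    The linear-chain CRF is omitted here; the emission scores returned by
    `forward` would normally be passed to a CRF layer for decoding.
    """

    def __init__(self, n_radicals, n_chars, n_words, n_tags,
                 rad_dim=32, char_dim=100, word_dim=100,
                 hidden_dim=128, n_heads=4):
        super().__init__()
        # One embedding table per granularity (radical, character, word).
        self.rad_emb = nn.Embedding(n_radicals, rad_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.word_emb = nn.Embedding(n_words, word_dim)
        # BiGRU over the concatenated multi-granularity embeddings.
        self.bigru = nn.GRU(rad_dim + char_dim + word_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        # Self-attention over the BiGRU outputs to relate characters in the sequence.
        self.attn = nn.MultiheadAttention(2 * hidden_dim, n_heads, batch_first=True)
        # Projection to emission scores over the tag set.
        self.emit = nn.Linear(2 * hidden_dim, n_tags)

    def forward(self, rad_ids, char_ids, word_ids):
        # Each *_ids tensor has shape (batch, seq_len).
        x = torch.cat([self.rad_emb(rad_ids),
                       self.char_emb(char_ids),
                       self.word_emb(word_ids)], dim=-1)
        h, _ = self.bigru(x)        # (batch, seq_len, 2 * hidden_dim)
        a, _ = self.attn(h, h, h)   # self-attention with Q = K = V = h
        return self.emit(a)         # (batch, seq_len, n_tags)
```

In this sketch the three embeddings are simply concatenated before the BiGRU; other fusion strategies (e.g., gating or weighted sums) would fit the same interface.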
