SC-NER: A Sequence-to-Sequence Model with Sentence Classification for Named Entity Recognition

Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP). Recently, the sequence-to-sequence (seq2seq) model has been widely applied to NLP tasks. Unlike most NLP tasks, about 60% of the sentences in the NER task contain no entities, and traditional seq2seq methods cannot handle this issue effectively. To address this problem, we propose a novel seq2seq model, named SC-NER, for the NER task. We place a sentence classifier between the encoder and the decoder; in particular, the classifier takes the last hidden state of the encoder as its input. Moreover, we present a restricted beam search to further improve the performance of the proposed SC-NER. To evaluate the proposed model, we construct a corpus of patent documents in the communications field and conduct experiments on it. Experimental results show that our SC-NER model achieves better performance than the baseline methods.
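
The abstract only outlines the architecture, so the following is a minimal, hypothetical sketch of the idea in PyTorch: a GRU encoder, a sentence classifier fed with the encoder's last hidden state, and a tag decoder. The GRU cells, hidden sizes, tag-set size, and the linear classifier head are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn

class SCNER(nn.Module):
    """Sketch of a seq2seq NER model with a sentence classifier (assumed design)."""

    def __init__(self, vocab_size, tag_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Sentence classifier: takes the encoder's last hidden state and
        # predicts whether the sentence contains any entity (binary decision).
        self.classifier = nn.Linear(hid_dim, 2)
        self.tag_embed = nn.Embedding(tag_size, emb_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tag_size)

    def forward(self, tokens, prev_tags):
        # tokens: (batch, seq_len) word ids; prev_tags: (batch, seq_len) shifted tag ids
        enc_out, enc_hidden = self.encoder(self.embed(tokens))
        # enc_hidden[-1] is the encoder's last hidden state, fed to the classifier
        has_entity_logits = self.classifier(enc_hidden[-1])
        # Decoder is initialized with the encoder state and emits per-token tag logits
        dec_out, _ = self.decoder(self.tag_embed(prev_tags), enc_hidden)
        tag_logits = self.out(dec_out)
        return has_entity_logits, tag_logits

model = SCNER(vocab_size=10000, tag_size=9)
tokens = torch.randint(0, 10000, (4, 12))
prev_tags = torch.randint(0, 9, (4, 12))
cls_logits, tag_logits = model(tokens, prev_tags)
print(cls_logits.shape, tag_logits.shape)  # torch.Size([4, 2]) torch.Size([4, 12, 9])
```

At inference time, the classifier's prediction could be used to skip or constrain decoding for sentences predicted to contain no entities; this is presumably where the restricted beam search comes in, although the abstract does not describe the exact restriction scheme.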
