Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning

Named entity recognition (NER) and other information extraction tasks frequently use linguistic features such as part-of-speech tags or chunkings. For languages where word boundaries are not readily identified in text, word segmentation is a key first step in generating features for an NER system. While using word boundary tags as features is helpful, the signals that aid in identifying these boundaries may provide richer information for an NER system. Recent state-of-the-art word segmentation systems use neural models to learn representations for predicting word boundaries. We show that these same representations, jointly trained with an NER system, yield significant improvements in NER for Chinese social media. In our experiments, jointly training NER and word segmentation with an LSTM-CRF model yields nearly 5% absolute improvement over previously published results.
