A Deep Learning Solution to Named Entity Recognition

Identifying named entities is vital for many Natural Language Processing (NLP) applications. Much of the earlier work for identifying named entities focused on using handcrafted features and knowledge resources (feature engineering). This is a barrier for resource-scarce languages as many resources are not readily available. Recently, Deep Learning techniques have been proposed for various NLP tasks requiring little/no hand-crafted features and knowledge resources, instead the features are learned from the data. Many proposed deep learning solutions for Named Entity Recognition (NER) still rely on feature engineering as opposed to feature learning. However, it is not clear whether the deep learning system or the engineered features are responsible for the positive results reported. This is in contrast with the goal of deep learning systems i.e., to learn the features from the data itself. In this study, we show that a feature learned deep learning system is a viable solution to NER task. We test our deep learning systems on CoNLL English and Spanish NER datasets. Our system is able to give comparable results with the existing state-of-the-art feature engineered systems for English. We report the best performance of 89.27 F-Score for English when comparing with systems which do not use any handcrafted features or knowledge resources. Evaluation of our trained system on out-of-domain data indicate that the results are promising with the reported results. Our system when tested on Spanish NER achieves the best reported F-Score of 82.59 indicating its applicability to other languages.

[1]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..

[2]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[3]  Kuldip K. Paliwal,et al.  Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[4]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[5]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition , 2002, CoNLL.

[6]  Geoffrey Zweig,et al.  Linguistic Regularities in Continuous Space Word Representations , 2013, NAACL.

[7]  Veronika Vincze,et al.  Multiword Expressions and Named Entities in the Wiki50 Corpus , 2011, RANLP.

[8]  Yulia Tsvetkov,et al.  Sparse Overcomplete Word Vector Representations , 2015, ACL.

[9]  Andrew Y. Ng,et al.  Semantic Compositionality through Recursive Matrix-Vector Spaces , 2012, EMNLP.

[10]  Tong Zhang,et al.  Named Entity Recognition through Classifier Combination , 2003, CoNLL.

[11]  Cícero Nogueira dos Santos,et al.  Learning Character-level Representations for Part-of-Speech Tagging , 2014, ICML.

[12]  Andrew McCallum,et al.  Lexicon Infused Phrase Embeddings for Named Entity Resolution , 2014, CoNLL.

[13]  Yoshua Bengio,et al.  Word Representations: A Simple and General Method for Semi-Supervised Learning , 2010, ACL.

[14]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[15]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[16]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[17]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[18]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[19]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[20]  Victor Guimar Boosting Named Entity Recognition with Neural Character Embeddings , 2015 .

[21]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[22]  Jeffrey Pennington,et al.  Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection , 2011, NIPS.