Named Entity Recognition With Parallel Recurrent Neural Networks

We present a new architecture for named entity recognition. Our model runs multiple independent bidirectional LSTM units over the same input and promotes diversity among them with an inter-model regularization term. By distributing computation across multiple smaller LSTMs, we obtain a significant reduction in the total number of parameters. Our architecture achieves state-of-the-art performance on the CoNLL 2003 NER dataset.
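The parameter saving from splitting one large LSTM into several smaller ones can be illustrated with simple arithmetic. The sketch below is not taken from the paper: it assumes the standard LSTM layout (four gates, one bias vector per gate), counts only the recurrent layer, and uses hypothetical sizes (input dimension 100, total hidden size 256, 8 parallel units) chosen purely for illustration.

```python
def lstm_params(input_dim: int, hidden_dim: int) -> int:
    """Parameters of one LSTM direction: 4 gates, each with input weights,
    recurrent weights, and a single bias vector (an assumed layout)."""
    return 4 * (hidden_dim * (input_dim + hidden_dim) + hidden_dim)


def bilstm_params(input_dim: int, hidden_dim: int) -> int:
    """A bidirectional LSTM holds one forward and one backward LSTM."""
    return 2 * lstm_params(input_dim, hidden_dim)


def parallel_bilstm_params(input_dim: int, total_hidden: int, n_units: int) -> int:
    """n_units independent BiLSTMs reading the same input, with the hidden
    size split evenly so the concatenated output width stays the same."""
    if total_hidden % n_units != 0:
        raise ValueError("total_hidden must be divisible by n_units")
    return n_units * bilstm_params(input_dim, total_hidden // n_units)


# Hypothetical sizes, for illustration only.
INPUT_DIM = 100
TOTAL_HIDDEN = 256
N_UNITS = 8

single = bilstm_params(INPUT_DIM, TOTAL_HIDDEN)
parallel = parallel_bilstm_params(INPUT_DIM, TOTAL_HIDDEN, N_UNITS)

print(f"single BiLSTM:       {single} parameters")
print(f"{N_UNITS} parallel BiLSTMs:  {parallel} parameters")
print(f"ratio:               {parallel / single:.2f}")
```

The saving comes from the recurrent weights: one unit of hidden size H needs H × H recurrent weights per gate, whereas k units of size H/k need only H × H / k in total. The input weights and biases are unchanged by the split.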
