Named Entity Recognition Architecture Combining Contextual and Global Features

Named entity recognition (NER) is an information extraction task that aims to locate named entities (e.g., organizations, locations) within a document and classify them into predefined categories. Correctly identifying these phrases plays a significant role in simplifying information access. However, NER remains difficult because named entities (NEs) take many surface forms and are context dependent. While local context is well captured by contextual representations, global relations between entities are often poorly represented by such models. In this paper, we propose combining contextual features from XLNet with global features from a Graph Convolutional Network (GCN) to enhance NER performance. Experiments on the widely used CoNLL-2003 dataset show the benefits of our strategy, with results competitive with the state of the art (SOTA).
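To make the proposed combination concrete, the following is a minimal sketch (not the authors' released implementation) of how XLNet contextual embeddings can be concatenated with global features produced by a simple GCN layer before a token-level classification head. The graph construction, the `SimpleGCNLayer` module, the `gcn_dim` size, and the identity adjacency used in the usage example are illustrative assumptions; a CRF output layer, commonly used in NER, is omitted for brevity.

```python
# Sketch: XLNet contextual features + GCN global features for NER.
# Assumes PyTorch and Hugging Face transformers; graph construction is a placeholder.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, XLNetModel


class SimpleGCNLayer(nn.Module):
    """One graph-convolution layer: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        # h: (seq_len, in_dim); adj: (seq_len, seq_len) normalized adjacency with self-loops
        return torch.relu(adj @ self.linear(h))


class XLNetGCNTagger(nn.Module):
    """Concatenates XLNet contextual features with GCN global features
    before a per-token classifier (CRF decoding omitted in this sketch)."""
    def __init__(self, num_labels, gcn_dim=128):
        super().__init__()
        self.xlnet = XLNetModel.from_pretrained("xlnet-base-cased")
        hidden = self.xlnet.config.hidden_size
        self.gcn = SimpleGCNLayer(hidden, gcn_dim)
        self.classifier = nn.Linear(hidden + gcn_dim, num_labels)

    def forward(self, input_ids, attention_mask, adj):
        # Contextual features from XLNet: (1, seq_len, hidden)
        ctx = self.xlnet(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        # Global features from the GCN over the token graph (batch size 1 here)
        glob = self.gcn(ctx.squeeze(0), adj).unsqueeze(0)
        return self.classifier(torch.cat([ctx, glob], dim=-1))


if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("xlnet-base-cased")
    enc = tok("EU rejects German call to boycott British lamb", return_tensors="pt")
    n = enc["input_ids"].shape[1]
    adj = torch.eye(n)                    # placeholder graph: self-loops only
    model = XLNetGCNTagger(num_labels=9)  # CoNLL-2003 BIO tag set has 9 labels
    logits = model(enc["input_ids"], enc["attention_mask"], adj)
    print(logits.shape)                   # (1, n, 9)
```

In practice the adjacency matrix would encode document-level relations (e.g., co-occurrence or dependency edges) rather than the identity used above, which is where the global signal comes from.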
