SpanNER: Named Entity Re-/Recognition as Span Prediction

Recent years have seen a paradigm shift in Named Entity Recognition (NER) systems from sequence labeling to span prediction. Despite its preliminary effectiveness, the span prediction model's architectural bias has not been fully understood. In this paper, we first investigate the strengths and weaknesses of the span prediction model for named entity recognition compared with the sequence labeling framework, and how to further improve it, which motivates us to combine the complementary advantages of systems based on different paradigms. We then show that span prediction can simultaneously serve as a system combiner that re-recognizes named entities from different systems' outputs. We experimentally implement 154 systems on 11 datasets covering three languages. Comprehensive results show the effectiveness of span prediction models both as base NER systems and as system combiners. We make all code and datasets available at https://github.com/neulab/spanner, as well as an online system demo at http://spanner.sh. Our model has also been deployed on the ExplainaBoard (Liu et al., 2021) platform, which allows users to flexibly combine top-scoring systems in an interactive way: http://explainaboard.nlpedia.ai/leaderboard/task-ner/.
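To make the span-level view concrete, the following is a minimal Python sketch, not the authors' implementation. It enumerates candidate spans, converts sequence-labeling (BIO) outputs into labeled spans, and combines several systems' outputs at the span level. The combiner here is plain majority voting, a simple stand-in: the paper's combiner instead uses a span prediction model to re-recognize entities. All function names (`enumerate_spans`, `bio_to_spans`, `vote_spans`) and the `max_len` parameter are hypothetical, introduced only for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end inclusive


def enumerate_spans(n_tokens: int, max_len: int) -> List[Span]:
    """List every candidate span of at most max_len tokens."""
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i, min(i + max_len, n_tokens))
    ]


def bio_to_spans(tags: List[str]) -> Dict[Span, str]:
    """Convert a BIO tag sequence into a {span: label} mapping."""
    spans: Dict[Span, str] = {}
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes the last entity
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans[(start, i - 1)] = label
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def vote_spans(
    system_outputs: List[Dict[Span, str]], n_tokens: int, max_len: int
) -> Dict[Span, str]:
    """Majority vote per candidate span (assumed baseline, not the paper's combiner).

    A system that did not predict a span counts as a vote for "O".
    Overlapping winners are not resolved in this sketch.
    """
    combined: Dict[Span, str] = {}
    for span in enumerate_spans(n_tokens, max_len):
        votes = Counter(out.get(span, "O") for out in system_outputs)
        label, _ = votes.most_common(1)[0]
        if label != "O":
            combined[span] = label
    return combined
```

For example, given three hypothetical base systems that tag a four-token sentence, `vote_spans([bio_to_spans(t) for t in tag_sequences], n_tokens=4, max_len=2)` keeps only the spans that most systems agree on. Working over spans rather than tokens means each entity is accepted or rejected as a whole, so the combined output never contains an ill-formed tag sequence.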

[1]  Axel-Cyrille Ngonga Ngomo,et al.  Ensemble Learning for Named Entity Recognition , 2014, SEMWEB.

[2]  Tianqi Chen,et al.  XGBoost: A Scalable Tree Boosting System , 2016, KDD.

[3]  Xuanjing Huang,et al.  Larger-Context Tagging: When and Why Does It Work? , 2021, NAACL.

[4]  Juntao Yu,et al.  Named Entity Recognition as Dependency Parsing , 2020, ACL.

[5]  Phil Blunsom,et al.  A Convolutional Neural Network for Modelling Sentences , 2014, ACL.

[6]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[7]  Eric Nichols,et al.  Named Entity Recognition with Bidirectional LSTM-CNNs , 2015, TACL.

[8]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[9]  Taro Watanabe,et al.  Machine Translation System Combination by Confusion Forest , 2011, ACL.

[10]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[11]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res.

[12]  Jing Cai,et al.  A Weighted Voting Classifier Based on Differential Evolution , 2014.

[13]  Liang Huang,et al.  Forest Reranking: Discriminative Parsing with Non-Local Features , 2008, ACL.

[14]  Kevin Duh,et al.  Generalized Minimum Bayes Risk System Combination , 2011, IJCNLP.

[15]  Marine Carpuat,et al.  A Stacked, Voted, Stacked Model for Named Entity Recognition , 2003, CoNLL.

[16]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[17]  Xiao Huang,et al.  TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition , 2020, ACL.

[18]  Xuanjing Huang,et al.  Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study , 2020, AAAI.

[19]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[20]  Kentaro Inui,et al.  Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition , 2020, ACL.

[21]  Tong Zhang,et al.  Named Entity Recognition through Classifier Combination , 2003, CoNLL.

[22]  Francisco Casacuberta,et al.  Minimum Bayes-risk System Combination , 2011, ACL.

[23]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[24]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[25]  Hai Zhao,et al.  Hierarchical Contextualized Representation for Named Entity Recognition , 2019, AAAI.

[26]  Dan Klein,et al.  A Joint Model for Entity Analysis: Coreference, Typing, and Linking , 2014, TACL.

[27]  Jinlan Fu,et al.  Is Chinese Word Segmentation a Solved Task? Rethinking Neural Chinese Word Segmentation , 2020, EMNLP.

[28]  Roland Vollgraf,et al.  Contextual String Embeddings for Sequence Labeling , 2018, COLING.

[29]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[30]  Jinlan Fu,et al.  ExplainaBoard: An Explainable Leaderboard for NLP , 2021, ACL.

[31]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[32]  Michael Collins,et al.  Discriminative Reranking for Natural Language Parsing , 2000, CL.

[33]  Dan Jurafsky,et al.  Utility is in the Eye of the User: A Critique of NLP Leaderboards , 2020, EMNLP.

[34]  Roland Vollgraf,et al.  Pooled Contextualized Embeddings for Named Entity Recognition , 2019, NAACL.

[35]  Ian H. Witten,et al.  Stacked Generalizations: When Does It Work? , 1997, IJCAI.

[36]  Jiwei Li,et al.  A Unified MRC Framework for Named Entity Recognition , 2019, ACL.

[37]  Graham Neubig,et al.  Generalizing Natural Language Analysis through Span-relation Representations , 2020, ACL.

[38]  Guillaume Lample,et al.  Neural Architectures for Named Entity Recognition , 2016, NAACL.

[39]  Jinlan Fu,et al.  Interpretable Multi-dataset Evaluation for Named Entity Recognition , 2020, EMNLP.

[40]  Philip S. Yu,et al.  Multi-grained Named Entity Recognition , 2019, ACL.

[41]  Mirella Lapata,et al.  Text Summarization with Pretrained Encoders , 2019, EMNLP.

[42]  Hui Chen,et al.  GRN: Gated Relation Network to Enhance Convolutional Neural Network for Named Entity Recognition , 2019, AAAI.

[43]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition , 2002, CoNLL.

[44]  Andreas Christmann,et al.  Support vector machines , 2008, Data Mining and Knowledge Discovery Handbook.

[45]  Yuji Matsumoto,et al.  Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation , 2016, NAACL.

[46]  Asif Ekbal,et al.  Combining multiple classifiers using vote based classifier ensemble technique for named entity recognition , 2013, Data Knowl. Eng..

[47]  Asif Ekbal,et al.  Weighted Vote-Based Classifier Ensemble for Named Entity Recognition: A Genetic Algorithm-Based Approach , 2011, TALIP.

[48]  Leon Derczynski,et al.  Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition , 2017, NUT@EMNLP.

[49]  Huanbo Luan,et al.  Modeling Voting for System Combination in Machine Translation , 2020, IJCAI.

[50]  Alan Ritter,et al.  Results of the WNUT16 Named Entity Recognition Shared Task , 2016, NUT@COLING.

[51]  Jiajun Zhang,et al.  Neural System Combination for Machine Translation , 2017, ACL.

[52]  Philippe Langlais,et al.  Robust Lexical Features for Improved Neural Network Named-Entity Recognition , 2018, COLING.

[53]  Dan Klein,et al.  A Minimal Span-Based Neural Constituency Parser , 2017, ACL.

[54]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[55]  Hiroyuki Shindo,et al.  LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention , 2020, EMNLP.