Unified Named Entity Recognition as Word-Word Relation Classification

Named entity recognition (NER) involves three major types: flat, overlapped (aka nested), and discontinuous NER, which have mostly been studied separately. Recently, interest has grown in unified NER, which tackles all three tasks concurrently with a single model. The current best-performing methods are mainly span-based and sequence-to-sequence models; unfortunately, the former focus merely on boundary identification, while the latter may suffer from exposure bias. In this work, we present a novel alternative that models unified NER as word-word relation classification, namely W²NER. The architecture resolves the key bottleneck of unified NER by effectively modeling the neighboring relations between entity words through Next-Neighboring-Word (NNW) and Tail-Head-Word-* (THW-*) relations. Based on the W²NER scheme, we develop a neural framework in which unified NER is modeled as a 2D grid of word pairs. We then propose multi-granularity 2D convolutions to better refine the grid representations. Finally, a co-predictor is used to jointly reason over the word-word relations. We conduct extensive experiments on 14 widely used benchmark datasets for flat, overlapped, and discontinuous NER (8 English and 6 Chinese), where our model surpasses all current top-performing baselines, advancing the state of the art of unified NER.
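The decoding implied by the NNW and THW-* relations can be sketched as follows. This is a minimal illustration under our own simplified encoding (the function name, edge representation, and tiny example are hypothetical, not the authors' implementation): given NNW edges (word j directly follows word i inside some entity) and THW-* triples (tail word, head word, entity type), each entity is recovered as a path from head to tail over NNW edges.

```python
# Minimal, illustrative decoder for the word-word relation scheme
# described above. Hypothetical sketch, not the authors' code:
# relation names follow the text, but the data encoding is our own.

from collections import defaultdict

def decode_entities(nnw_edges, thw_edges):
    """Recover entities from predicted word-word relations.

    nnw_edges: set of (i, j) pairs -- word j directly follows word i
               inside some entity (Next-Neighboring-Word).
    thw_edges: list of (tail, head, type) triples -- the path from
               `head` to `tail` forms an entity of `type` (THW-*).
    """
    succ = defaultdict(list)
    for i, j in nnw_edges:
        succ[i].append(j)

    entities = []

    def dfs(path, tail, etype):
        cur = path[-1]
        if cur == tail:                      # reached the tail word
            entities.append((tuple(path), etype))
            return
        for nxt in succ[cur]:
            if nxt not in path:              # avoid revisiting words
                dfs(path + [nxt], tail, etype)

    for tail, head, etype in thw_edges:
        if head == tail:                     # single-word entity
            entities.append(((head,), etype))
        else:
            dfs([head], tail, etype)
    return entities

# Discontinuous example: words = ["severe", "swelling", "joint", "pain"],
# with one ADR entity over words 0, 2, 3 ("severe joint pain").
nnw = {(0, 2), (2, 3)}                       # 0 -> 2 -> 3
thw = [(3, 0, "ADR")]                        # tail=3, head=0
print(decode_entities(nnw, thw))             # [((0, 2, 3), 'ADR')]
```

Flat and overlapped entities decode the same way: a flat entity is simply a contiguous NNW path, while nested entities share words across separate THW-* triples, which is what lets a single scheme cover all three NER types.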
