Enhanced Language Representation with Label Knowledge for Span Extraction

Span extraction, aiming to extract text spans (such as words or phrases) from plain texts, is a fundamental process in Information Extraction. Recent works introduce the label knowledge to enhance the text representation by formalizing the span extraction task into a question answering problem (QA Formalization), which achieves state-of-the-art performance. However, QA Formalization does not fully exploit the label knowledge and suffers from low efficiency in training/inference. To address those problems, we introduce a new paradigm to integrate label knowledge and further propose a novel model to explicitly and efficiently integrate label knowledge into text representations. Specifically, it encodes texts and label annotations independently and then integrates label knowledge into text representation with an elaborate-designed semantics fusion module. We conduct extensive experiments on three typical span extraction tasks: flat NER, nested NER, and event detection. The empirical results show that 1) our method achieves state-of-the-art performance on four benchmarks, and 2) reduces training time and inference time by 76% and 77% on average, respectively, compared with the QA Formalization paradigm. Our code and data are available at https://github.com/ Akeepers/LEAR.

[1]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[2]  Nianwen Xue,et al.  CoNLL-2011 Shared Task: Modeling Unrestricted Coreference in OntoNotes , 2011, CoNLL Shared Task.

[3]  Jan Hajic,et al.  Neural Architectures for Nested NER through Linearization , 2019, ACL.

[4]  Doug Downey,et al.  Unsupervised named-entity extraction from the Web: An experimental study , 2005, Artif. Intell..

[5]  Dongsheng Li,et al.  Exploring Pre-trained Language Models for Event Extraction and Generation , 2019, ACL.

[6]  Zhepei Wei,et al.  A Novel Cascade Binary Tagging Framework for Relational Triple Extraction , 2019, ACL.

[7]  Yaojie Lu,et al.  Cost-sensitive Regularization for Label Confusion-aware Event Detection , 2019, ACL.

[8]  Satoshi Sekine,et al.  Definition, Dictionaries and Tagger for Extended Named Entity Hierarchy , 2004, LREC.

[9]  Eduard Hovy,et al.  Nested Named Entity Recognition via Second-best Sequence Learning and Decoding , 2019, Transactions of the Association for Computational Linguistics.

[10]  Jun Zhao,et al.  Exploiting Argument Information to Improve Event Detection via Supervised Attention Mechanisms , 2017, ACL.

[11]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[12]  Xiang Zhang,et al.  Automatically Labeled Data Generation for Large Scale Event Extraction , 2017, ACL.

[13]  Jun Zhao,et al.  Collective Event Detection via a Hierarchical and Bias Tagging Networks with Gated Multi-level Attention Mechanisms , 2018, EMNLP.

[14]  Christopher D. Manning,et al.  Nested Named Entity Recognition , 2009, EMNLP.

[15]  Mark A. Przybocki,et al.  The Automatic Content Extraction (ACE) Program – Tasks, Data, and Evaluation , 2004, LREC.

[16]  Omer Levy,et al.  Zero-Shot Relation Extraction via Reading Comprehension , 2017, CoNLL.

[17]  Andrew McCallum,et al.  Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data , 2004, J. Mach. Learn. Res..

[18]  Andrew McCallum,et al.  Fast and Accurate Entity Recognition with Iterated Dilated Convolutions , 2017, EMNLP.

[19]  Juan-Zi Li,et al.  Improving Event Detection via Open-domain Trigger Knowledge , 2020, ACL.

[20]  Jiwei Li,et al.  A Unified MRC Framework for Named Entity Recognition , 2019, ACL.

[21]  Juntao Yu,et al.  Named Entity Recognition as Dependency Parsing , 2020, ACL.

[22]  Xiao Liu,et al.  Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation , 2018, EMNLP.

[23]  Quan Wang,et al.  Event Extraction as Multi-turn Question Answering , 2020, FINDINGS.

[24]  Claire Cardie,et al.  Nested Named Entity Recognition Revisited , 2018, NAACL.

[25]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[26]  Jun Zhao,et al.  Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks , 2015, ACL.

[27]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[28]  Ralph Grishman,et al.  Joint Event Extraction via Recurrent Neural Networks , 2016, NAACL.

[29]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[30]  Yue Zhang,et al.  Chinese NER Using Lattice LSTM , 2018, ACL.

[31]  Jian Liu,et al.  Event Extraction as Machine Reading Comprehension , 2020, EMNLP.

[32]  Wei Wu,et al.  Glyce: Glyph-vectors for Chinese Character Representations , 2019, NeurIPS.

[33]  Mari Ostendorf,et al.  A general framework for information extraction using dynamic span graphs , 2019, NAACL.

[34]  Jun Zhao,et al.  Leveraging FrameNet to Improve Automatic Event Detection , 2016, ACL.

[35]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[36]  Mingxin Zhou,et al.  Entity-Relation Extraction as Multi-Turn Question Answering , 2019, ACL.

[37]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[38]  Claire Cardie,et al.  Event Extraction by Answering (Almost) Natural Questions , 2020, EMNLP.

[39]  Gina-Anne Levow,et al.  The Third International Chinese Language Processing Bakeoff: Word Segmentation and Named Entity Recognition , 2006, SIGHAN@COLING/ACL.

[40]  Ken Chen,et al.  Label-Aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition , 2018, NAACL.

[41]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[42]  Wei Lu,et al.  Neural Segmental Hypergraphs for Overlapping Mention Recognition , 2018, EMNLP.

[43]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[44]  David Ahn,et al.  The stages of event extraction , 2006 .

[45]  Yaojie Lu,et al.  Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks , 2019, ACL.