AutoTriggER: Named Entity Recognition with Auxiliary Trigger Extraction

While deep neural models for named entity recognition (NER) have shown impressive results, these models often require additional human annotation that is expensive and time-consuming to produce, especially for new or low-resource domains. Some success has been achieved by replacing conventional human annotation with distant supervision or other meta-level information (e.g., explanations). However, the cost of generating this additional information can still be prohibitive, especially in domains where the necessary resources (e.g., databases usable for distant supervision) may not exist. In this paper, we present AutoTriggER, a novel two-stage framework that improves NER performance by automatically generating and leveraging “entity triggers.” These triggers, essentially human-readable “clues” in the text that can help guide the model to better decisions, are first identified automatically using a sampling and occlusion algorithm. Next, we propose a trigger interpolation network to leverage these triggers in a transformer-based NER model. By combining these stages, AutoTriggER is able to both create and leverage auxiliary supervision by itself. Through experiments on three well-studied NER datasets, we show that our automatically extracted triggers closely match human-provided triggers, and that AutoTriggER improves performance over a standard RoBERTa-CRF architecture by nearly 0.5 F1 points on average, and by considerably more in low-resource settings.

ACM Reference Format: Dong-Ho Lee, Ravi Kiran Selvam, Sheikh Muhammad Sarwar, Bill Yuchen Lin, Fred Morstatter, Jay Pujara, Elizabeth Boschee, James Allan, and Xiang Ren. 2020. AutoTriggER: Named Entity Recognition with Auxiliary Trigger Extraction. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’20), August 23–27, 2020, Virtual Event, CA, USA. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3394486.3403153
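To make the two stages described in the abstract concrete, here is a minimal sketch of the first stage: scoring candidate phrases by how much occluding (masking) them reduces the model's confidence on the entity tokens, in the spirit of a sampling-and-occlusion attribution method. The `ner_model` interface (`predict_proba`), the `[MASK]` convention, and all function names are illustrative assumptions, not the paper's actual implementation.

```python
from typing import List, Sequence

MASK = "[MASK]"

def entity_confidence(ner_model, tokens: List[str], entity_span: Sequence[int]) -> float:
    # Assumed interface: predict_proba returns one label distribution per
    # token; we average the model's top-label confidence over the entity.
    probs = ner_model.predict_proba(tokens)
    return sum(max(probs[i]) for i in entity_span) / len(entity_span)

def occlusion_score(ner_model, tokens: List[str], entity_span, phrase_span) -> float:
    # Importance of a candidate phrase = drop in entity confidence when
    # the phrase is replaced by MASK tokens.
    occluded = [MASK if i in phrase_span else tok for i, tok in enumerate(tokens)]
    return (entity_confidence(ner_model, tokens, entity_span)
            - entity_confidence(ner_model, occluded, entity_span))

def extract_triggers(ner_model, tokens, entity_span, candidate_spans, top_k=2):
    # Rank candidate phrase spans (e.g., phrases from a syntactic parse)
    # by occlusion score and keep the top-k as entity triggers.
    ranked = sorted(candidate_spans,
                    key=lambda s: occlusion_score(ner_model, tokens, entity_span, s),
                    reverse=True)
    return ranked[:top_k]
```

The second stage, as the name “trigger interpolation network” suggests, can be pictured as a mixup-style combination of two views of the input sentence. The sketch below assumes the two views are encodings with the entity masked and with the trigger masked, respectively; this is one plausible reading rather than the paper's exact architecture.

```python
import torch

def trigger_interpolation(entity_masked_enc: torch.Tensor,
                          trigger_masked_enc: torch.Tensor,
                          lam: float = 0.5) -> torch.Tensor:
    # Convex combination of the two sentence encodings; the result would
    # then feed the tagging head (e.g., a CRF layer on top of RoBERTa).
    return lam * entity_masked_enc + (1.0 - lam) * trigger_masked_enc
```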
