LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation

Successfully training a deep neural network demands a huge corpus of labeled data. However, each label provides only limited information to learn from, and collecting the requisite number of labels involves massive human effort. In this work, we introduce LEAN-LIFE, a web-based, Label-Efficient AnnotatioN framework for sequence labeling and classification tasks, with an easy-to-use UI that not only allows an annotator to provide the needed labels for a task, but also enables LearnIng From Explanations for each labeling decision. Such explanations enable us to generate useful additional labeled data from unlabeled instances, bolstering the pool of available training data. On three popular NLP tasks (named entity recognition, relation extraction, sentiment analysis), we find that using this enhanced supervision allows our models to surpass competitive baseline F1 scores by more than 5-10 percentage points, while using 2X fewer labeled instances. Our framework is the first to utilize this enhanced supervision technique and does so for three important tasks, thus providing improved annotation recommendations to users and an ability to build datasets of (data, label, explanation) triples instead of the regular (data, label) pairs.
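
To make the idea of (data, label, explanation) triples concrete, the following is a minimal sketch of how an explanation might be reused to label unlabeled instances. It is an illustration under strong simplifying assumptions, not the method used by LEAN-LIFE: the explanation is reduced to a single trigger phrase, and matching is a plain case-insensitive substring test. The names `Triple` and `weak_label` are hypothetical and do not come from the framework.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Triple:
    """One annotated instance: (data, label, explanation)."""
    text: str
    label: str
    explanation: str  # assumption: the explanation is a single trigger phrase


def weak_label(triples: List[Triple], unlabeled: List[str]) -> List[Tuple[str, str]]:
    """Give an unlabeled text the label of the first triple whose trigger it contains."""
    labeled = []
    for text in unlabeled:
        lowered = text.lower()
        for triple in triples:
            if triple.explanation.lower() in lowered:
                labeled.append((text, triple.label))
                break  # keep only the first matching explanation
    return labeled


# Toy sentiment-analysis data, invented for this example.
triples = [
    Triple("The pasta was delicious.", "positive", "delicious"),
    Triple("Service was painfully slow.", "negative", "painfully slow"),
]
unlabeled = [
    "A delicious dessert menu.",
    "The checkout line was painfully slow today.",
    "We arrived at noon.",
]

augmented = weak_label(triples, unlabeled)
for text, label in augmented:
    print(f"{label}: {text}")
```

In this toy run, two of the three unlabeled sentences receive labels from two annotated triples, while the sentence that matches no explanation stays unlabeled. Exact substring matching is used here only for brevity and would be too brittle for most real data.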
