On the Limits of Learning to Actively Learn Semantic Representations

One of the goals of natural language understanding is to develop models that map sentences into meaning representations. However, training such models requires expensive annotation of complex structures, which hinders their adoption. Learning to actively learn (LTAL) is a recent paradigm for reducing the amount of labeled data by learning a policy that selects which samples should be labeled. In this work, we examine LTAL for learning semantic representations, such as QA-SRL. We show that even an oracle policy that is allowed to pick examples that maximize performance on the test set (and thus constitutes an upper bound on the potential of LTAL) does not substantially improve performance over a random policy. We investigate factors that could explain this finding and show that a distinguishing characteristic of successful applications of LTAL is the interaction between optimization and the oracle policy's selection process: in successful applications, the examples selected by the oracle policy do not substantially depend on the optimization procedure, while in our setup the stochastic nature of optimization strongly affects which examples the oracle selects. We conclude that the current applicability of LTAL for improving data efficiency in learning semantic meaning representations is limited.
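To make the oracle-policy setup concrete, below is a minimal sketch of a greedy oracle selection loop in Python. All names here (`train_model`, `evaluate`, the candidate-sample size, the budget) are hypothetical illustrations, not the paper's actual procedure: the point is only that the oracle retrains on each candidate addition and keeps the one that maximizes held-out performance, which is why it upper-bounds any learned policy.

```python
# A minimal sketch of an oracle active-learning loop, assuming
# hypothetical train_model(examples) -> model and
# evaluate(model, eval_set) -> float callables. The paper's exact
# candidate sampling and retraining details are not reproduced here.

import random


def oracle_select(labeled, pool, eval_set, train_model, evaluate,
                  num_candidates=10):
    """Greedily pick the pool example whose addition maximizes
    evaluation performance (an upper bound on any learned policy)."""
    candidates = random.sample(pool, min(num_candidates, len(pool)))
    best_example, best_score = None, float("-inf")
    for example in candidates:
        model = train_model(labeled + [example])  # retrain with the candidate added
        score = evaluate(model, eval_set)         # e.g., F1 on held-out data
        if score > best_score:
            best_example, best_score = example, score
    return best_example


def oracle_ltal(labeled, pool, eval_set, train_model, evaluate, budget=100):
    """Run the oracle policy for `budget` acquisition steps."""
    for _ in range(budget):
        chosen = oracle_select(labeled, pool, eval_set, train_model, evaluate)
        pool.remove(chosen)
        labeled.append(chosen)  # in simulation, the oracle's labels come for free
    return labeled
```

Note that each acquisition step retrains the model once per candidate, so the oracle is far too expensive to deploy; it serves purely as a diagnostic upper bound. The stochasticity the abstract points to enters through `train_model`: with random initialization and data ordering, the same candidate can yield different scores across runs, destabilizing the oracle's choices.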
