Exploring the Efficiency of Batch Active Learning for Human-in-the-Loop Relation Extraction

Domain-specific relation extraction requires training data for supervised learning models and, thus, significant labeling effort. Distant supervision is often leveraged to create large annotated corpora; however, these methods require handling the inherent noise. Active learning approaches, on the other hand, can reduce the annotation cost by selecting the most beneficial examples to label in order to learn a good model. The selection of examples can be performed sequentially, i.e., one example per iteration, or in batches, i.e., a set of examples per iteration. Choosing the batch size is a practical problem faced in every real-world application of active learning, yet it is often treated as a parameter fixed in advance. In this work, we study the trade-off between model performance, the number of labels requested per batch, and the time spent in each round for real-time, domain-specific relation extraction. Our results show that an appropriate batch size yields performance competitive even with a fully sequential strategy, while dramatically reducing training time.
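
To make the sequential-versus-batch distinction concrete, the following is a minimal sketch of pool-based batch active learning with least-confidence uncertainty sampling. It is illustrative only: the logistic-regression classifier, the feature matrices, and the batch_size and rounds parameters are assumptions for the sketch, not the paper's actual model or settings. Setting batch_size=1 recovers the fully sequential strategy; larger batches reduce the number of retraining rounds at some potential cost in label efficiency, which is the trade-off the abstract describes.

    # Minimal sketch of pool-based batch active learning with
    # least-confidence uncertainty sampling. Illustrative only:
    # classifier, data shapes, and parameters are assumptions,
    # not the paper's implementation.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def batch_active_learning(X_pool, y_oracle, X_seed, y_seed,
                              batch_size=20, rounds=10):
        """Each round: train on the labeled set, score the unlabeled
        pool by uncertainty, and request labels for the batch_size
        most uncertain examples."""
        X_train, y_train = X_seed.copy(), y_seed.copy()
        pool_idx = np.arange(len(X_pool))
        model = LogisticRegression(max_iter=1000)
        for _ in range(rounds):
            if len(pool_idx) == 0:
                break
            model.fit(X_train, y_train)
            probs = model.predict_proba(X_pool[pool_idx])
            # Least confidence: 1 minus the max class probability.
            uncertainty = 1.0 - probs.max(axis=1)
            # batch_size = 1 is the fully sequential strategy;
            # larger batches cut the number of retraining rounds.
            query = pool_idx[np.argsort(-uncertainty)[:batch_size]]
            X_train = np.vstack([X_train, X_pool[query]])
            y_train = np.concatenate([y_train, y_oracle[query]])
            pool_idx = np.setdiff1d(pool_idx, query)
        return model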
