Mining Relations from Unstructured Content

Extracting relations from unstructured Web content is a challenging task: for any new relation, significant effort is required to design, train and tune the extraction models. In this work, we investigate how to obtain suitable results for relation extraction with modest human effort, relying on a dynamic active learning approach. We propose a method to reliably generate high-quality training and test data for relation extraction for any generic user-demonstrated relation, starting from a few user-provided examples and extracting valuable samples from unstructured, unlabeled Web content. To this end, we propose a strategy that learns the best order in which to present data for human annotation, maximizing learning performance early in the process. We demonstrate the viability of the approach (i) on state-of-the-art datasets for relation extraction and (ii) in a real case study that identifies text expressing a causal relation between a drug and an adverse reaction in user-generated Web content.
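To make the idea of ordering data for annotation concrete, the following minimal sketch ranks an unlabeled pool by least-confidence uncertainty sampling, a standard active learning baseline. It is not the learned ordering strategy proposed in this work. The function names, the cue phrases, and the toy scorer are all hypothetical and stand in for a trained relation classifier.

```python
from typing import Callable, List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely label."""
    return 1.0 - max(probs)


def rank_for_annotation(
    pool: Sequence[str],
    predict_proba: Callable[[str], Sequence[float]],
) -> List[str]:
    """Order pool items from most to least uncertain under the given model."""
    return sorted(
        pool,
        key=lambda item: least_confidence(predict_proba(item)),
        reverse=True,
    )


def toy_predict_proba(sentence: str) -> List[float]:
    """Hypothetical stand-in for a trained classifier (assumption, not from the paper).

    Returns [P(relation), P(no relation)] based on how many cue phrases match.
    """
    cues = ("caused", "after taking")
    hits = sum(cue in sentence.lower() for cue in cues)
    p_relation = {0: 0.1, 1: 0.5, 2: 0.9}[hits]
    return [p_relation, 1.0 - p_relation]


pool = [
    "Drug X caused a rash after taking two doses.",
    "I felt dizzy after taking drug Y.",
    "Drug Z is sold in pharmacies.",
]
ranked = rank_for_annotation(pool, toy_predict_proba)
print(ranked[0])
```

In a full loop, the top-ranked items would be sent to a human annotator, the model retrained on the new labels, and the remaining pool re-ranked.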
