GraphIE: A Graph-Based Framework for Information Extraction

Most modern Information Extraction (IE) systems are implemented as sequential taggers and only model local dependencies. Non-local and non-sequential context is, however, a valuable source of information to improve predictions. In this paper, we introduce GraphIE, a framework that operates over a graph representing a broad set of dependencies between textual units (i.e. words or sentences). The algorithm propagates information between connected nodes through graph convolutions, generating a richer representation that can be exploited to improve word-level predictions. Evaluation on three different tasks — namely textual, social media and visual information extraction — shows that GraphIE consistently outperforms the state-of-the-art sequence tagging model by a significant margin.

[1]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[2]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[3]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[4]  Khalil Sima'an,et al.  Graph Convolutional Encoders for Syntax-aware Neural Machine Translation , 2017, EMNLP.

[5]  Regina Barzilay,et al.  Event Discovery in Social Media Feeds , 2011, ACL.

[6]  Diego Marcheggiani,et al.  Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling , 2017, EMNLP.

[7]  Makoto Miwa,et al.  End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures , 2016, ACL.

[8]  Eric P. Xing,et al.  Harnessing Deep Neural Networks with Logic Rules , 2016, ACL.

[9]  Alexander M. Rush,et al.  Character-Aware Neural Language Models , 2015, AAAI.

[10]  Heng Ji,et al.  Joint Event Extraction via Structured Prediction with Global Features , 2013, ACL.

[11]  Rui Zhang,et al.  Graph-based Neural Multi-Document Summarization , 2017, CoNLL.

[12]  Krishna P. Gummadi,et al.  You are who you know: inferring user profiles in online social networks , 2010, WSDM '10.

[13]  Daniel Jurafsky,et al.  Distant supervision for relation extraction without labeled data , 2009, ACL.

[14]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[15]  Sepp Hochreiter,et al.  Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.

[16]  Eduard H. Hovy,et al.  Weakly Supervised User Profile Extraction from Twitter , 2014, ACL.

[17]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[18]  Nanyun Peng,et al.  Cross-Sentence N-ary Relation Extraction with Graph LSTMs , 2017, TACL.

[19]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[20]  Christopher D. Manning,et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.

[21]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[22]  Zhen-Hua Ling,et al.  Hybrid semi-Markov CRF for Neural Sequence Labeling , 2018, ACL.

[23]  Christopher D. Manning,et al.  Graph Convolution over Pruned Dependency Trees Improves Relation Extraction , 2018, EMNLP.

[24]  Alfonso Valencia,et al.  CHEMDNER: The drugs and chemical names extraction challenge , 2015, Journal of Cheminformatics.

[25]  Regina Barzilay,et al.  Multi-Event Extraction Guided by Global Constraints , 2012, NAACL.

[26]  Angus Roberts,et al.  Extracting Clinical Relationships from Patient Narratives , 2008, BioNLP.

[27]  Yonatan Aumann,et al.  Visual information extraction , 2006, Knowledge and Information Systems.

[28]  Hoifung Poon,et al.  Distant Supervision for Relation Extraction beyond the Sentence Boundary , 2016, EACL.

[29]  Toru Hirano,et al.  Recognizing Relation Expression between Named Entities based on Inherent and Context-dependent Features of Relational words , 2010, COLING.

[30]  Stanford,et al.  Learning to Discover Social Circles in Ego Networks , 2012 .

[31]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[32]  Ruslan Salakhutdinov,et al.  Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text , 2018, EMNLP.

[33]  Zhiyong Lu,et al.  The CHEMDNER corpus of chemicals and drugs and its annotation principles , 2015, Journal of Cheminformatics.

[34]  Yue Zhang,et al.  N-ary Relation Extraction using Graph-State LSTM , 2018, EMNLP.

[35]  Gideon S. Mann,et al.  Generalized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data , 2010, J. Mach. Learn. Res..

[36]  Martijn J. Schuemie,et al.  A dictionary to identify small molecules and drugs in free text , 2009, Bioinform..

[37]  Guillaume Lample,et al.  Neural Architectures for Named Entity Recognition , 2016, NAACL.

[38]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[39]  Mark Stevenson,et al.  Extracting Relations Within and Across Sentences , 2011, RANLP.

[40]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[41]  Max Welling,et al.  Modeling Relational Data with Graph Convolutional Networks , 2017, ESWC.