A Lexicalized Tree Kernel for Open Information Extraction

In contrast with traditional relation extraction, which considers only a fixed set of relations, Open Information Extraction (Open IE) aims to extract all types of relations from text. Because of data sparseness, Open IE systems typically ignore lexical information and instead rely on parse trees and Part-of-Speech (POS) tags. However, the same syntactic structure may correspond to different relations. In this paper, we propose a lexicalized tree kernel based on word embeddings created by a neural network model. We show that the lexicalized tree kernel model surpasses the unlexicalized model. Experiments on three datasets indicate that our Open IE system performs better on the task of relation extraction than the state-of-the-art Open IE systems of Xu et al. (2013) and Mesquita et al. (2013).
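The point that one syntactic structure can correspond to different relations can be illustrated with a small sketch. The code below is not the kernel used in the paper; it is a simplified convolution-style tree kernel written only for illustration. The embedding table `EMB`, its two-dimensional values, the tree encoding, and all function names are assumptions made here, and the node similarity (POS match multiplied by the cosine similarity of the word vectors) is just one plausible way to lexicalize a tree kernel.

```python
import math

# Toy 2-d word vectors: hypothetical values, not taken from any trained model.
EMB = {
    "acquired": (0.9, 0.1),
    "bought": (0.8, 0.2),
    "visited": (0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def node_sim(n1, n2, lexicalized):
    """Nodes match only if their POS tags agree; the lexicalized variant
    additionally scales the match by word-embedding similarity."""
    if n1["pos"] != n2["pos"]:
        return 0.0
    if not lexicalized:
        return 1.0
    w1, w2 = n1.get("word"), n2.get("word")
    if w1 in EMB and w2 in EMB:
        return max(0.0, cosine(EMB[w1], EMB[w2]))
    # Fall back to exact match for words without a vector (or no word at all).
    return 1.0 if w1 == w2 else 0.0


def delta(n1, n2, lexicalized):
    """Similarity of the fragments rooted at n1 and n2 (simplified recursion)."""
    score = node_sim(n1, n2, lexicalized)
    c1, c2 = n1.get("children", []), n2.get("children", [])
    if score == 0.0 or len(c1) != len(c2):
        return score
    for a, b in zip(c1, c2):
        score *= 1.0 + delta(a, b, lexicalized)
    return score


def nodes(tree):
    """Yield every node of the tree, root first."""
    yield tree
    for child in tree.get("children", []):
        yield from nodes(child)


def tree_kernel(t1, t2, lexicalized):
    """Sum fragment similarities over all node pairs of the two trees."""
    return sum(delta(a, b, lexicalized) for a in nodes(t1) for b in nodes(t2))


def make_tree(verb):
    """A verb with two proper-noun arguments: identical syntax for any verb."""
    arg = {"pos": "NNP", "word": None}
    return {"pos": "VBD", "word": verb, "children": [dict(arg), dict(arg)]}


t_acq, t_buy, t_vis = make_tree("acquired"), make_tree("bought"), make_tree("visited")
```

Under these assumptions, `tree_kernel(t_acq, t_buy, False)` and `tree_kernel(t_acq, t_vis, False)` are equal, because the three trees share the same POS structure. With `lexicalized=True`, the pair "acquired"/"bought" scores higher than "acquired"/"visited", which is the kind of distinction an unlexicalized model cannot make.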

[1] Robert L. Mercer, et al. Class-Based n-gram Models of Natural Language, 1992, CL.

[2] Mihai Surdeanu, et al. The Stanford CoreNLP Natural Language Processing Toolkit, 2014, ACL.

[3] Daniel S. Weld, et al. Open Information Extraction Using Wikipedia, 2010, ACL.

[4] Gene H. Golub, et al. Calculating the singular values and pseudo-inverse of a matrix, 2007, Milestones in Matrix Computation.

[5] Alessandro Moschitti, et al. Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction, 2013, ACL.

[6] Oren Etzioni, et al. Open Information Extraction from the Web, 2007, CACM.

[7] Yoshua Bengio, et al. Word Representations: A Simple and General Method for Semi-Supervised Learning, 2010, ACL.

[8] Michael Collins, et al. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron, 2002, ACL.

[9] Dirk Hovy, et al. A Walk-Based Semantically Enriched Tree Kernel Over Distributed Word Representations, 2013, EMNLP.

[10] Andrew McCallum, et al. Relation Extraction with Matrix Factorization and Universal Schemas, 2013, NAACL.

[11] Denilson Barbosa, et al. Effectiveness and Efficiency of Open Relation Extraction, 2013, EMNLP.

[12] Denilson Barbosa, et al. Open Information Extraction with Tree Kernels, 2013, NAACL.

[13] Stephan Bloehdorn, et al. Structure and semantics for expressive text kernels, 2007, CIKM '07.

[14] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.

[15] Roberto Basili, et al. Structured Lexical Similarity via Convolution Kernels on Dependency Trees, 2011, EMNLP.

[16] Oren Etzioni, et al. Open Language Learning for Information Extraction, 2012, EMNLP.