A Re-examination of Dependency Path Kernels for Relation Extraction

Extracting semantic relations between entities from natural language text is an important step towards automatic knowledge extraction from large text collections and the Web. The state-of-the-art approach to relation extraction employs Support Vector Machines (SVM) and kernel methods for classification. Despite the diversity of kernels and the near-exhaustive trial and error on kernel combination, there is no clear understanding of how these kernels relate to each other and why some are superior to others. In this paper, we analyze the relative strengths and weaknesses of several kernels through systematic experimentation. We show that relation extraction can benefit from enlarging the feature space through convolution kernels and from introducing a bias towards a more syntactically meaningful feature space. Based on our analysis, we propose a new convolution dependency path kernel that combines these two benefits. Our experimental results on the standard ACE 2003 datasets demonstrate that our new kernel performs consistently and significantly better than baseline methods, obtaining results that are very competitive with state-of-the-art performance.
