Discovering Relations Between Named Entities from a Large Raw Corpus Using Tree Similarity-Based Clustering

We propose a tree-similarity-based unsupervised learning method to extract relations between Named Entities from a large raw corpus. Our method treats relation extraction as a clustering problem over shallow parse trees. First, we modify previous tree kernels for relation extraction so that the similarity between parse trees can be estimated more efficiently. Then, this similarity is used in a hierarchical clustering algorithm to group entity pairs into clusters. Finally, each cluster is labeled with an indicative word, and unreliable clusters are pruned. Evaluation on the New York Times (1995) corpus shows that our method outperforms the only previous work by 5 points in F-measure. It also shows that our method performs well on both high-frequency and less frequent entity pairs. To the best of our knowledge, this is the first work to use a tree similarity metric in relation clustering.
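The clustering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the nested-tuple tree encoding, the Jaccard overlap of productions used as a stand-in for the modified tree kernel, the average-linkage criterion, the stopping threshold, and the toy trees are all assumptions made for illustration.

```python
from itertools import combinations


def productions(tree):
    """Collect the (parent, child-labels) productions of a nested-tuple tree."""
    label, children = tree[0], tree[1:]
    prods = set()
    child_labels = []
    for child in children:
        if isinstance(child, tuple):
            child_labels.append(child[0])
            prods |= productions(child)
        else:
            child_labels.append(child)  # leaf token or entity type
    prods.add((label, tuple(child_labels)))
    return prods


def tree_similarity(t1, t2):
    """Stand-in similarity (NOT the paper's kernel): Jaccard overlap of productions."""
    p1, p2 = productions(t1), productions(t2)
    return len(p1 & p2) / len(p1 | p2)


def average_link(c1, c2, sim):
    """Mean pairwise similarity between the members of two clusters."""
    return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))


def cluster(trees, threshold):
    """Agglomerative clustering: merge the most similar pair until it falls below threshold."""
    sim = [[tree_similarity(a, b) for b in trees] for a in trees]
    clusters = [[i] for i in range(len(trees))]
    while len(clusters) > 1:
        (a, b), best = max(
            (((i, j), average_link(clusters[i], clusters[j], sim))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical shallow parse trees, one per entity-pair context.
trees = [
    ("S", ("NP", "ORG"), ("VP", "acquired"), ("NP", "ORG")),
    ("S", ("NP", "ORG"), ("VP", "acquired"), ("NP", "ORG"), ("PP", "in")),
    ("S", ("NP", "PER"), ("VP", "president"), ("PP", "of"), ("NP", "ORG")),
    ("S", ("NP", "PER"), ("VP", "president"), ("PP", "of"), ("NP", "ORG"), ("ADVP", "now")),
]

print(cluster(trees, threshold=0.3))
```

With these toy trees and a threshold of 0.3, the two acquisition contexts and the two president-of contexts end up in separate clusters. Labeling each cluster with an indicative word and pruning unreliable clusters are omitted from the sketch.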
