Exploiting Syntactic and Semantic Information for Relation Extraction from Wikipedia

The exponential growth of Wikipedia has recently attracted the attention of a large number of researchers and practitioners. One of the current challenges for Wikipedia is to make the encyclopedia processable by machines. In this paper, we address the problem of extracting relations between entities from Wikipedia's English articles; the extracted relations can be transformed straightforwardly into Semantic Web metadata. We propose a method that exploits syntactic and semantic information for relation extraction. In addition, our method can use the nature of Wikipedia to obtain training data automatically. The preliminary results of our experiments strongly support our hypothesis that using information at a higher level of description is better for relation extraction on Wikipedia, and they show that our method is promising for text understanding.
