Extract Reliable Relations from Wikipedia Texts for Practical Ontology Construction

This paper presents a feature-based approach to relation classification, aimed at extracting relation candidates from Wikipedia texts. A probabilistic feature and a semantic relatedness feature are combined with other linguistic information for this purpose. Experiments show that relation classification using the proposed relatedness features together with surface information, such as words and part-of-speech tags, is competitive with, or even outperforms, classification using deep syntactic information. We also propose an approach for distinguishing reliable relation candidates from the rest, so that the reliable results can be accepted for knowledge building without human verification. Experiments show that, with the relation classification approach presented here, more than 40% of the classification results are reliable, which means that at least 40% of the human and time costs can be saved in practice.
