Nonlinear Evidence Fusion and Propagation for Hyponymy Relation Mining

This paper focuses on mining the hyponymy (or is-a) relation from large-scale, open-domain web documents. A nonlinear probabilistic model is exploited to model the correlation between sentences in the aggregation of pattern matching results. Based on the model, we design a set of evidence combination and propagation algorithms. These significantly improve the result quality of existing approaches. Experimental results conducted on 500 million web pages and hypernym labels for 300 terms show over 20% performance improvement in terms of P@5, MAP and R-Precision.

[1]  Daniel Jurafsky,et al.  Learning Syntactic Patterns for Automatic Hypernym Discovery , 2004, NIPS.

[2]  Eneko Agirre,et al.  A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches , 2009, NAACL.

[3]  Marti A. Hearst Automatic Acquisition of Hyponyms from Large Text Corpora , 1992, COLING.

[4]  Daniel Jurafsky,et al.  Semantic Taxonomy Induction from Heterogenous Evidence , 2006, ACL.

[5]  Shuming Shi,et al.  Employing Topic Models for Pattern-based Semantic Class Discovery , 2009, ACL/IJCNLP.

[6]  Patrick Pantel,et al.  Automatically Labeling Semantic Classes , 2004, NAACL.

[7]  Partha Pratim Talukdar,et al.  Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition , 2010, ACL.

[8]  Daisy Zhe Wang,et al.  WebTables: exploring the power of tables on the web , 2008, Proc. VLDB Endow..

[9]  Marius Pasca,et al.  The Role of Queries in Ranking Labeled Instances Extracted from Text , 2010, COLING.

[10]  Zellig S. Harris,et al.  Distributional Structure , 1954 .

[11]  Oren Etzioni,et al.  Open Information Extraction from the Web , 2007, CACM.

[12]  Kentaro Torisawa,et al.  Acquiring Hyponymy Relations from Web Documents , 2004, NAACL.

[13]  Shuming Shi,et al.  Nonlinear static-rank computation , 2009, CIKM.

[14]  Ellen Riloff,et al.  Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs , 2008, ACL.

[15]  Xiaojie Yuan,et al.  Corpus-based Semantic Class Mining: Distributional vs. Pattern-Based Approaches , 2010, COLING.

[16]  Marius Pasca,et al.  Acquisition of categorized named entities for web search , 2004, CIKM '04.

[17]  Fabrizio Sebastiani,et al.  Cluster Generation and Cluster Labelling for Web Snippets: A Fast and Accurate Hierarchical Solution , 2006, SPIRE.

[18]  William W. Cohen,et al.  Automatic Set Instance Extraction using the Web , 2009, ACL/IJCNLP.

[19]  Partha Pratim Talukdar,et al.  Weakly-Supervised Acquisition of Labeled Class Instances using Graph Random Walks , 2008, EMNLP.

[20]  Eric Crestan,et al.  Web-Scale Distributional Similarity and Entity Set Expansion , 2009, EMNLP.

[21]  Benjamin Van Durme,et al.  Finding Cars, Goddesses and Enzymes: Parametrizable Acquisition of Labeled Instances for Open-Domain Information Extraction , 2008, AAAI.