Wikipedia Sets: Context-Oriented Related Entity Acquisition from Multiple Words

In this paper, we propose a method which acquires related words (entities) from multiple words by naturally disambiguating their meaning and considering their contexts. In addition, we introduce a bootstrapping method for improving the coverage of association relations. Experimental result shows that our method can acquire related words depending on the contexts of multiple words compared to the ESA-based method.

[1]  Jaana Kekäläinen,et al.  Cumulated gain-based evaluation of IR techniques , 2002, TOIS.

[2]  Graeme Hirst,et al.  Evaluating WordNet-based Measures of Lexical Semantic Relatedness , 2006, CL.

[3]  J. Curran,et al.  Minimising semantic drift with Mutual Exclusion Bootstrapping , 2007 .

[4]  Katherine A. Heller,et al.  Bayesian Sets , 2005, NIPS.

[5]  William W. Cohen,et al.  Language-Independent Set Expansion of Named Entities Using the Web , 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).

[6]  Evgeniy Gabrilovich,et al.  Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis , 2007, IJCAI.

[7]  Takahiro Hara,et al.  Wikipedia Mining for an Association Web Thesaurus Construction , 2007, WISE.

[8]  Eduardo Mena,et al.  Web-Based Measure of Semantic Relatedness , 2008, WISE.

[9]  Takahiro Hara,et al.  A Thesaurus Construction Method from Large ScaleWeb Dictionaries , 2007, 21st International Conference on Advanced Information Networking and Applications (AINA '07).

[10]  Hinrich Schütze,et al.  A Cooccurrence-Based Thesaurus and Two Applications to Information Retrieval , 1994, Inf. Process. Manag..

[11]  Yuji Matsumoto,et al.  Graph-based Analysis of Semantic Drift in Espresso-like Bootstrapping Algorithms , 2008, EMNLP.

[12]  Ellen Riloff,et al.  Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping , 1999, AAAI/IAAI.

[13]  Wei-Ying Ma,et al.  Building a web thesaurus from web link structure , 2003, SIGIR.