Paper Information - Automatic Discovery of Attribute Words from Web Documents

We propose a method for acquiring attribute words for a wide range of objects from Japanese Web documents. The method is simple and unsupervised, and it draws on word statistics, lexico-syntactic patterns, and HTML tags. To evaluate the acquired attribute words, we also establish criteria and a procedure based on the question-answerability of each candidate word.

Kentaro Torisawa | Jun'ichi Kazama | Kosuke Tokunaga
