Paper Information - Extracting Mnemonic Names of People from the Web

The web has gained much attention as a new medium that reflects real-time interest in the world. This attention is driven by the proliferation of tools such as bulletin boards and weblogs. The web is a source from which we can collect and summarize information about a particular object (e.g., a business organization, product, or person). For example, extracting reputation information is a major research topic in information extraction and knowledge extraction from the web. The ability to collect web pages about a particular object is essential to obtaining such information and extracting knowledge from it. A major problem in the web page collection process is that the same object is referred to in different ways in different web documents. For example, a person may be referred to by full name, first name, affiliation and title, or nickname. This paper proposes a method for extracting these mnemonic names of people from the web and shows experimental results using real web data.

Hiroyuki Kitagawa | Tomoko Hokama
