Paper information - A practical system of keyphrase extraction for web pages

A practical system of keyphrase extraction for web pages

Keyphrases can help Web users grasp the main topic(s) of a Web page. We present a practical system of automatic keyphrase extraction for Web pages. In this system, a regression model was first trained on a set of human-labeled documents and then used to extract keyphrases from new pages automatically. This paper makes three contributions. First, the structural information in a Web page was investigated for the keyphrase extraction task. Second, query log data associated with a Web page, collected by a search engine server, were used to aid keyphrase extraction. Third, a method was proposed for evaluating the similarity of phrases.

Mo Chen | Jian-Tao Sun | Kwok-Yan Lam | Hua-Jun Zeng

[1] Tao Tao, et al. A formal study of information retrieval heuristics, 2004, SIGIR '04.

[2] Peter D. Turney. Coherent Keyphrase Extraction via Web Mining, 2003, IJCAI.

[3] Carl Gutwin, et al. KEA: practical automatic keyphrase extraction, 1999, DL '99.