Paper information - KX: A Flexible System for Keyphrase eXtraction

KX: A Flexible System for Keyphrase eXtraction

In this paper we present KX, a system for key-phrase extraction developed at FBK-IRST, which exploits basic linguistic annotation combined with simple statistical measures to select a list of weighted keywords from a document. The system is flexible in that it lets the user set parameters such as frequency thresholds for collocation extraction and indicators for key-phrase relevance, and it allows for domain adaptation by exploiting a corpus of documents in an unsupervised way. KX is also easily adaptable to new languages in that it requires only a PoS-Tagger to derive lexical patterns. In the SemEval task 5 "Automatic Key-phrase Extraction from Scientific Articles", KX achieved satisfactory results both in finding reader-assigned keywords and in the combined keywords subtask.
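The general idea in the abstract (PoS-based lexical patterns plus a frequency threshold that yields weighted candidates) can be illustrated with a minimal sketch. This is not KX's implementation: the pattern set, the function name `extract_keyphrases`, the parameters `min_freq` and `length_boost`, and the weighting formula are all assumptions chosen for illustration, and the input is assumed to be already PoS-tagged with coarse tags such as `NOUN` and `ADJ`.

```python
from collections import Counter

# Hypothetical lexical patterns over coarse PoS tags (not KX's actual patterns).
PATTERNS = {
    ("NOUN",),
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("ADJ", "NOUN", "NOUN"),
}


def extract_keyphrases(tagged_tokens, min_freq=2, length_boost=1.5):
    """Return (phrase, weight) pairs sorted by descending weight.

    tagged_tokens: list of (word, pos_tag) pairs from any PoS tagger.
    min_freq: illustrative frequency threshold for keeping a candidate.
    length_boost: illustrative factor that favours longer phrases.
    """
    counts = Counter()
    max_len = max(len(pattern) for pattern in PATTERNS)

    # Count every n-gram whose tag sequence matches one of the patterns.
    for n in range(1, max_len + 1):
        for i in range(len(tagged_tokens) - n + 1):
            window = tagged_tokens[i:i + n]
            tags = tuple(tag for _, tag in window)
            if tags in PATTERNS:
                phrase = " ".join(word.lower() for word, _ in window)
                counts[phrase] += 1

    # Keep candidates above the threshold and weight them by frequency,
    # boosted by the number of words in the phrase (an assumed heuristic).
    scored = [
        (phrase, freq * length_boost ** (len(phrase.split()) - 1))
        for phrase, freq in counts.items()
        if freq >= min_freq
    ]
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

Calling `extract_keyphrases` on a tagged document returns candidates ranked by weight; raising `min_freq` prunes rare candidates, which mirrors the kind of user-settable threshold the abstract mentions.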

Emanuele Pianta | Sara Tonelli

[1] Hideki Mima, et al. Automatic recognition of multi-word terms: the C-value/NC-value method, 2000, International Journal on Digital Libraries.

[2] Gordon W. Paynter, et al. Interactive document summarisation using automatically extracted keyphrases, 2002, Proceedings of the 35th Annual Hawaii International Conference on System Sciences.

[3] Min-Yen Kan, et al. Keyphrase Extraction in Scientific Publications, 2007, ICADL.

[4] Paolo Tonella, et al. An empirical study on keyword-based Web site clustering, 2004, Proceedings of the 12th IEEE International Workshop on Program Comprehension.

[5] Emanuele Pianta, et al. The TextPro Tool Suite, 2008, LREC.