Paper Information - Thesaurus Based Term Ranking for Keyword Extraction

Thesaurus Based Term Ranking for Keyword Extraction

In many cases, keywords have to be assigned to texts from a restricted set of possible keywords. A common way to find the best keywords is to rank the terms occurring in the text according to their tf.idf value. This requires a corpus of texts from which document frequencies can be derived. In this paper we show that we can obtain results of the same quality without using a background corpus, by using the relations between terms provided in a thesaurus.
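To make the tf.idf baseline concrete, the following is a minimal sketch of ranking candidate terms from a restricted vocabulary. The abstract does not say which tf.idf variant is used, so the weighting `tf * log(N / df)` is an assumption, and the function name `rank_by_tfidf`, the toy corpus, and the vocabulary are hypothetical illustrations rather than the paper's setup. The sketch covers only the corpus-based baseline, not the thesaurus-based ranking the paper proposes.

```python
import math
from collections import Counter


def rank_by_tfidf(doc_tokens, corpus, vocabulary):
    """Rank vocabulary terms found in doc_tokens by tf * log(N / df).

    Assumed weighting variant; the source abstract does not specify one.
    """
    n_docs = len(corpus)
    # Document frequency: number of corpus documents that contain each term.
    df = Counter(term for doc in corpus for term in set(doc))
    # Term frequency, restricted to the allowed keyword vocabulary.
    tf = Counter(t for t in doc_tokens if t in vocabulary)
    scores = {
        term: count * math.log(n_docs / df[term])
        for term, count in tf.items()
        if df[term] > 0  # terms unseen in the corpus cannot be scored
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical toy data: three tokenized documents and a small vocabulary.
doc = ["concert", "music", "concert", "news"]
corpus = [doc, ["news", "politics"], ["news", "music"]]
vocabulary = {"concert", "music", "news", "politics"}

ranking = rank_by_tfidf(doc, corpus, vocabulary)
print(ranking)
```

The dependence on `corpus` for the document frequencies is the requirement that the abstract says the thesaurus-based approach avoids.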
