Paper information - Automatic indexing of scientific papers Presentation and results of DEFT 2016 text mining challenge

This paper presents the 2016 edition of the DEFT text mining challenge. This edition addresses keyword-based indexing of scientific papers, with the aim of simulating the work of a professional indexer. The corpus is composed of French bibliographic records from four domains: linguistics, information science, archaeology and chemistry. Results are evaluated in terms of precision, recall and F-measure, computed on stemmed texts against a reference manual indexing.
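
The evaluation described above can be illustrated with a minimal sketch for a single document. It assumes that keywords are compared as sets after stemming and that the F-measure is the balanced F1; the abstract does not say how scores are averaged across documents, so that step is left out. The `toy_stem` function is a hypothetical stand-in for a real French stemmer, and all names in the block are illustrative rather than taken from the challenge tooling.

```python
def toy_stem(term: str) -> str:
    """Hypothetical stemmer: lowercase each word and strip a final 's'
    from words longer than three characters. A real evaluation would
    use a proper French stemmer instead."""
    words = term.lower().split()
    return " ".join(w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words)


def precision_recall_f1(predicted, reference, stem=toy_stem):
    """Score one document's predicted keywords against the reference
    keywords, comparing stemmed forms as sets."""
    pred = {stem(t) for t in predicted}
    ref = {stem(t) for t in reference}
    matched = len(pred & ref)

    # Guard against empty sets to avoid division by zero.
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative keywords only; they are not taken from the DEFT 2016 corpus.
predicted = ["Corpus", "mots clés", "indexation automatique"]
reference = ["corpus", "mot clé", "fouille de textes", "indexation automatique"]
p, r, f = precision_recall_f1(predicted, reference)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```

Stemming before matching lets surface variants such as "mots clés" and "mot clé" count as the same keyword, which is why the comparison is done on stemmed forms rather than raw strings.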
