Large-Scale Evaluation of Keyphrase Extraction Models

Keyphrase extraction models are usually evaluated under different, not directly comparable, experimental setups. As a result, it remains unclear how well proposed models actually perform and how they compare to each other. In this work, we address this issue by presenting a systematic large-scale analysis of state-of-the-art keyphrase extraction models involving multiple benchmark datasets from various sources and domains. Our main results reveal that state-of-the-art models are in fact still challenged by simple baselines on some datasets. We also present new insights about the impact of using author- or reader-assigned keyphrases as a proxy for the gold standard, and give recommendations for strong baselines and reliable benchmark datasets.
