Automatic keyphrase extraction from scientific articles

This paper describes the organization and results of the automatic keyphrase extraction task held at the Workshop on Semantic Evaluation 2010 (SemEval-2010). The task was specifically geared towards scientific articles. Systems were evaluated automatically by matching their extracted keyphrases against the keyphrases assigned to the same documents by the authors and by readers. We outline the task, present the overall ranking of the submitted systems, and discuss the improvements to the state of the art in keyphrase extraction.
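The matching-based evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not specify the matching criterion or the metrics, so everything here is an assumption made for illustration: the function names `normalize` and `score_at_k`, exact matching after lowercasing and whitespace normalization, pooling author- and reader-assigned keyphrases into one gold set, and reporting precision, recall, and F-score over the top `k` candidates.

```python
from typing import Iterable, List, Tuple


def normalize(phrase: str) -> str:
    # Assumption: lowercase and collapse whitespace only (no stemming).
    return " ".join(phrase.lower().split())


def score_at_k(extracted: List[str], gold: Iterable[str], k: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the top-k extracted keyphrases.

    `extracted` is assumed to be ranked best-first; `gold` is assumed to be
    the pooled author- and reader-assigned keyphrases for one document.
    """
    # Keep the first k distinct candidates after normalization.
    top_k: List[str] = []
    seen = set()
    for phrase in extracted:
        norm = normalize(phrase)
        if norm in seen:
            continue
        seen.add(norm)
        top_k.append(norm)
        if len(top_k) == k:
            break

    gold_set = {normalize(g) for g in gold}
    matched = sum(1 for phrase in top_k if phrase in gold_set)

    precision = matched / len(top_k) if top_k else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up example data, not taken from the task.
    system_output = ["Keyphrase Extraction", "scientific articles", "neural network"]
    gold_keyphrases = ["keyphrase extraction", "scientific articles", "evaluation", "SemEval"]
    print(score_at_k(system_output, gold_keyphrases, k=3))
```

In this made-up example, two of the three candidates match the four gold keyphrases. A stricter or looser matching rule (for instance, stemming before comparison) would only require changing `normalize`.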
