TOP-Rank: A TopicalPositionRank for Extraction and Classification of Keyphrases in Text

Abstract: Keyphrase extraction is the task of extracting the most important phrases from a document. Automatic keyphrase extraction attempts to itemize a document's content as meta-information and to facilitate efficient information retrieval. In this paper we propose TOP-Rank, an approach for keyphrase extraction and keyphrase classification. For keyphrase extraction, we build on an approach based on the position of keyphrases in the document and extend it with topical ranking of keyphrases. In particular, the keyphrase extraction technique analyzes a document and extracts keyphrases from it, giving a higher rank to topical phrases. After keyphrase extraction, we classify the keyphrases as process, material, or task. Our evaluation on diverse datasets shows that TOP-Rank achieves an F1-score of 0.73 for keyphrase classification, improving substantially upon state-of-the-art methods.
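The position-based part of the idea can be illustrated with a minimal sketch. The abstract does not specify the scoring formulas, so everything below is an assumption for illustration only: it uses a generic position-biased PageRank over a word co-occurrence graph, where each occurrence of a word at 1-based position p adds 1/p to that word's bias. The function name `position_biased_rank` and its parameters (`window`, `damping`, `iterations`) are hypothetical, and the topical ranking and the classification step are not shown. This is not the authors' implementation.

```python
from collections import defaultdict


def position_biased_rank(words, window=2, damping=0.85, iterations=50):
    """Score words with a PageRank variant biased toward early positions.

    Illustrative sketch only; the weighting scheme is an assumption.
    """
    if not words:
        return {}

    # Undirected co-occurrence graph: an edge weight counts how often
    # two distinct words appear within `window` tokens of each other.
    weights = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            v = words[j]
            if v != w:
                weights[w][v] += 1.0
                weights[v][w] += 1.0

    # Position bias: every occurrence at 1-based position p adds 1/p,
    # so early and frequent words receive more restart probability.
    bias = defaultdict(float)
    for pos, w in enumerate(words, start=1):
        bias[w] += 1.0 / pos
    total = sum(bias.values())
    bias = {w: b / total for w, b in bias.items()}

    vocab = list(bias)
    scores = {w: 1.0 / len(vocab) for w in vocab}
    out_weight = {w: sum(weights[w].values()) for w in vocab}

    # Power iteration: restart according to the position bias, otherwise
    # follow co-occurrence edges in proportion to their weight.
    for _ in range(iterations):
        new_scores = {}
        for w in vocab:
            incoming = sum(
                scores[v] * weights[v][w] / out_weight[v] for v in weights[w]
            )
            new_scores[w] = (1 - damping) * bias[w] + damping * incoming
        scores = new_scores
    return scores


tokens = (
    "keyphrase extraction ranks keyphrase candidates "
    "by position in the document"
).split()
scores = position_biased_rank(tokens)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{word}: {score:.3f}")
```

In this toy input the word that appears both early and repeatedly ends up ranked first. A topical variant would presumably adjust the bias or the edge weights using topic information, but how TOP-Rank does so is not stated in the abstract.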
