Deep Keyphrase Completion

Keyphrases provide a compact, concise, and informative summary of document content and are widely used for discourse comprehension, organization, and text retrieval. Although previous studies have made substantial efforts on automated keyphrase extraction and generation, surprisingly few have addressed keyphrase completion (KPC). KPC aims to generate additional keyphrases for a document (e.g., a scientific publication) by exploiting the document content together with a very limited number of known keyphrases; it can be applied, for example, to improve text indexing systems. In this paper, we propose a novel KPC method built on an encoder-decoder framework. We name it deep keyphrase completion (DKPC) because it attempts to capture the deep semantic meaning of the document content together with the known keyphrases via a deep learning framework. Specifically, the encoder and the decoder in DKPC play different roles in making full use of the known keyphrases. The encoder considers keyphrase-guiding factors, which aggregate information from the known keyphrases into the context. The decoder, in contrast, considers a keyphrase-inhibited factor to suppress the generation of semantically repeated keyphrases. Extensive experiments on benchmark datasets demonstrate the efficacy of the proposed model.
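
To make the idea of inhibiting repeated keyphrases concrete, the following is a minimal toy sketch and not the DKPC model itself. It assumes candidate keyphrases already come with scores, and it substitutes token-level Jaccard overlap for the learned semantic similarity a neural decoder would use. The function names `jaccard` and `inhibit_repeats`, the `strength` parameter, and the example phrases and scores are all hypothetical.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two phrases."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def inhibit_repeats(
    candidates: Dict[str, float],
    known: List[str],
    strength: float = 1.0,
) -> List[str]:
    """Rank candidates after penalising overlap with known keyphrases.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    adjusted = score * (1 - strength * max similarity to any known phrase).
    """
    adjusted = {}
    for phrase, score in candidates.items():
        overlap = max((jaccard(phrase, k) for k in known), default=0.0)
        adjusted[phrase] = score * (1.0 - strength * overlap)
    # Highest adjusted score first.
    return sorted(adjusted, key=adjusted.get, reverse=True)


# Illustrative data only: one known keyphrase and three scored candidates.
known = ["keyphrase extraction"]
candidates = {
    "keyphrase extraction": 0.9,
    "keyphrase generation": 0.8,
    "encoder-decoder": 0.6,
}
ranked = inhibit_repeats(candidates, known)
print(ranked)
```

In this toy setting the exact repeat of the known keyphrase drops to the bottom of the ranking and the partially overlapping candidate is demoted below the unrelated one, which is the qualitative behaviour the keyphrase-inhibited factor is meant to encourage.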
