Co-occurrence graph based hierarchical neural networks for keyphrase generation

Abstract Automatic keyphrase generation has attracted growing attention because it supports a wide variety of downstream AI applications, such as information retrieval, text summarization and opinion mining. Although applying a sequence-to-sequence architecture with attention and copy mechanisms (CopyNet) to this task shows promising results, it still suffers from the following shortcomings: (i) it encodes a keyphrase (which usually consists of several words) only at the word level, which cannot adequately capture the overall meaning of the keyphrase; (ii) it lacks a suitable way to model the correlation among different keyphrases, which is very helpful for generating richer and more comprehensive candidate phrases. To overcome these challenges, a novel keyphrase generation model named Hierarchical CopyNet with graph attention networks (HCopy-GAT) is proposed. First, the Hierarchical Recurrent Encoder-Decoder neural network (HRED) is employed to learn expressive embeddings of keyphrases at both the word level and the phrase level. Second, a graph attention network (GAT) is applied to model the correlation among different keyphrases. Furthermore, we develop a new dataset named SOFTWARE, which can serve as a new testbed for keyphrase generation tasks. In empirical experiments on several real datasets (including our newly built dataset), the proposed HCopy-GAT model outperforms state-of-the-art keyphrase generation models.
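
To make the idea of modeling correlation among keyphrases concrete, the following is a minimal sketch of a keyphrase co-occurrence graph followed by one single-head graph attention step. It is an illustration under stated assumptions, not the HCopy-GAT implementation: the abstract does not specify how the graph is built or how the GAT is configured, so the function names (`build_cooccurrence_graph`, `gat_layer`), the document-level co-occurrence rule, the self-loops, and the omission of the learned projection matrix and multiple heads are all assumptions made for brevity.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Link two keyphrases if they are assigned to the same document.

    Assumption: document-level co-occurrence; the paper may use another rule.
    Returns a dict mapping each keyphrase to its neighbor set (self included).
    """
    neighbors = defaultdict(set)
    for phrases in docs:
        for p in phrases:
            neighbors[p].add(p)  # self-loop so a node also attends to itself
        for a, b in combinations(set(phrases), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(features, neighbors, attn_vec):
    """One single-head graph attention step (no learned projection W).

    features: keyphrase -> list of d floats (e.g. a phrase-level embedding).
    attn_vec: list of 2*d floats scoring the concatenation [h_i || h_j].
    """
    out = {}
    for i, nbrs in neighbors.items():
        nbrs = sorted(nbrs)
        # Unnormalized attention score for every neighbor j of node i.
        scores = [
            leaky_relu(sum(a * x for a, x in zip(attn_vec, features[i] + features[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # New representation: attention-weighted sum of neighbor features.
        dim = len(features[i])
        out[i] = [
            sum(alpha * features[j][k] for alpha, j in zip(alphas, nbrs))
            for k in range(dim)
        ]
    return out
```

In this sketch, a keyphrase's updated vector mixes in the vectors of phrases it co-occurs with, which is one way related keyphrases could inform each other; in a trained model the attention vector and a projection matrix would be learned rather than supplied by hand.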
