Abstract

Automatic keyphrase generation has attracted growing attention because it supports a wide variety of downstream AI applications, such as information retrieval, text summarization, and opinion mining. Although applying the sequence-to-sequence architecture with attention and copy mechanisms (CopyNet) to this task has shown promising results, it still suffers from two shortcomings: (i) it encodes a keyphrase, which usually consists of several words, only at the word level, which cannot adequately capture the overall meaning of the keyphrase; (ii) it lacks a suitable way to model the correlation among different keyphrases, which is very helpful for generating richer and more comprehensive candidate phrases. To overcome these challenges, we propose a novel keyphrase generation model named Hierarchical CopyNet with graph attention networks (HCopy-GAT). First, the Hierarchical Recurrent Encoder-Decoder neural network (HRED) is employed to learn expressive embeddings of keyphrases at both the word level and the phrase level. Second, a graph attention network (GAT) is applied to model the correlation among different keyphrases. Furthermore, we developed a new dataset named SOFTWARE, which can serve as a new testbed for keyphrase generation tasks. In empirical experiments on several real datasets (including our newly built dataset), the proposed HCopy-GAT model outperforms state-of-the-art keyphrase generation models.
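To make the second component more concrete, here is a minimal sketch of a single-head graph attention layer applied to phrase-level embeddings. It follows the generic GAT formulation (LeakyReLU-scored, softmax-normalized attention over neighbors) and is not the authors' implementation. The abstract does not say how the keyphrase graph is built, so the adjacency matrix `adj`, the toy embeddings `h`, and the names `project` and `gat_layer` are all assumptions made for illustration.

```python
import math

def project(h, W):
    """Multiply each node feature vector (length F) by W (F x F_out)."""
    return [[sum(x * W[k][j] for k, x in enumerate(row)) for j in range(len(W[0]))]
            for row in h]

def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention over hypothetical keyphrase nodes.

    h:   list of N phrase embeddings, each of length F
    adj: N x N 0/1 adjacency matrix; assumed to include self-loops
    W:   F x F_out projection matrix
    a:   attention vector of length 2 * F_out
    Returns (updated embeddings, attention weights).
    """
    z = project(h, W)
    f_out = len(W[0])
    # a^T [z_i || z_j] splits into a source term and a target term.
    src = [sum(zi[k] * a[k] for k in range(f_out)) for zi in z]
    dst = [sum(zi[k] * a[f_out + k] for k in range(f_out)) for zi in z]
    out, alpha = [], []
    for i in range(len(z)):
        nbrs = [j for j in range(len(z)) if adj[i][j]]
        scores = []
        for j in nbrs:
            e = src[i] + dst[j]
            scores.append(e if e > 0 else slope * e)  # LeakyReLU
        m = max(scores)                               # stabilize the softmax
        w = [math.exp(s - m) for s in scores]
        total = sum(w)
        row = [0.0] * len(z)
        for j, wj in zip(nbrs, w):
            row[j] = wj / total
        alpha.append(row)
        # New embedding of node i: attention-weighted sum of its neighbors.
        out.append([sum(row[j] * z[j][k] for j in nbrs) for k in range(f_out)])
    return out, alpha

# Toy example: phrases 0 and 1 are linked; phrase 2 is isolated (self-loop only).
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 0.25, 0.25]
out, alpha = gat_layer(h, adj, W, a)
```

Each row of `alpha` sums to one over that phrase's neighbors, so an isolated phrase keeps its own projected embedding while linked phrases mix information from each other. A full model would typically use several attention heads and learned parameters.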