Knowledge graph based natural language generation with adapted pointer-generator networks

Abstract Pointer-generator networks have recently shown superior performance in Natural Language Generation (NLG) tasks, such as automatically generating descriptions for entities in a Knowledge Graph (KG). When no introductory description of an entity exists, natural language text generated automatically by a neural model can greatly help people understand that entity. Entities in a KG typically have multiple property fields with corresponding values, and the generated description should express these slot types and slot values consistently and with high coverage. To cover the facts in an input KG, pointer-generator networks copy segments from the input sequence via softmax pointing while producing novel words through the generator. However, deciding when and where to integrate copied information with generated text, and preventing duplicate generation and information loss, remain difficult challenges. In this paper, the KG2TEXT model, based on adapted pointer-generator networks, is proposed. First, a modified coverage loss function is devised to cover as many attribute-value pairs as possible when generating natural language descriptions for entities in a KG. Second, a supervised attention mechanism is added to the model to guide the soft switch between generating and copying. In empirical experiments on two real-world datasets, the KG2TEXT model achieves promising results and outperforms state-of-the-art approaches.
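The core mechanics the abstract refers to, the soft switch that mixes a generation distribution with a copy distribution and the coverage penalty that discourages re-attending to already-covered source facts, can be illustrated with a minimal NumPy sketch. This follows the standard pointer-generator formulation (See et al., 2017) and the min-based coverage loss; it is not the KG2TEXT model's exact variant, and all function and variable names here are illustrative.

```python
import numpy as np

def pointer_generator_step(p_gen, vocab_dist, attn_dist, src_ids):
    """One decoding step of a pointer-generator soft switch.

    p_gen      : scalar in [0, 1], probability of generating vs. copying
    vocab_dist : (vocab_size,) softmax distribution over the output vocabulary
    attn_dist  : (src_len,)  attention distribution over source tokens
    src_ids    : (src_len,)  vocabulary ids of the source tokens
    """
    final = p_gen * vocab_dist
    # Scatter-add copy probabilities onto the vocab ids of the source tokens
    # (np.add.at handles repeated source tokens correctly).
    np.add.at(final, src_ids, (1.0 - p_gen) * attn_dist)
    return final

def coverage_loss(attn_dist, coverage):
    """Standard coverage penalty: sum_i min(a_i, c_i), where coverage c is
    the running sum of attention over previous decoding steps."""
    return np.minimum(attn_dist, coverage).sum()
```

Because `vocab_dist` and `attn_dist` each sum to 1, the mixed distribution also sums to 1 for any `p_gen`; the coverage term is zero at the first step (coverage starts at zero) and grows only when attention revisits a source position, which is what penalizes duplicate copying of the same attribute-value pair.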
