Describing a Knowledge Base

We aim to automatically generate natural language descriptions of an input structured knowledge base (KB). Our generation framework builds on a pointer network that can copy facts from the input KB, and adds two attention mechanisms: (i) slot-aware attention, which captures the association between a slot type and its corresponding slot value; and (ii) a new table position self-attention, which captures the inter-dependencies among related slots. For evaluation, in addition to the standard metrics BLEU, METEOR, and ROUGE, we propose a KB reconstruction based metric that extracts a KB from the generated output and compares it with the input KB. We also create a new data set of 106,216 pairs of structured KBs and their corresponding natural language descriptions for two distinct entity types. Experiments show that our approach significantly outperforms state-of-the-art methods. The reconstructed KB achieves an F-score of 68.8% to 72.6%.
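The comparison step of a KB reconstruction based metric can be illustrated with a minimal sketch. It assumes that both the input KB and the KB extracted from the generated text are already available as collections of (slot type, slot value) pairs, and that two facts match when they are equal after lowercasing and trimming whitespace. The extraction step itself is not shown, and the function name `kb_f_score`, the matching rule, and the example entity are illustrative assumptions rather than the exact procedure used in this work.

```python
from typing import Dict, Iterable, Set, Tuple

# A fact is a (slot type, slot value) pair; this representation is an assumption.
Fact = Tuple[str, str]


def _normalize(kb: Iterable[Fact]) -> Set[Fact]:
    """Lowercase and strip each slot type and value so trivial variants match."""
    return {(slot.strip().lower(), value.strip().lower()) for slot, value in kb}


def kb_f_score(input_kb: Iterable[Fact], reconstructed_kb: Iterable[Fact]) -> Dict[str, float]:
    """Compare a reconstructed KB with the input KB by exact match on normalized facts."""
    gold = _normalize(input_kb)
    pred = _normalize(reconstructed_kb)
    overlap = len(gold & pred)

    # Guard against empty sets to avoid division by zero.
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: one input fact is missed and one spurious fact is extracted.
input_kb = [
    ("name", "Jane Doe"),
    ("occupation", "physicist"),
    ("country", "Canada"),
]
reconstructed_kb = [
    ("name", "jane doe"),
    ("occupation", "physicist"),
    ("award", "Example Prize"),
]
scores = kb_f_score(input_kb, reconstructed_kb)
print(scores)
```

Exact matching is the simplest choice here; a softer matching rule (for example, partial string overlap on slot values) would change the scores and is a separate design decision.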
