Variational Template Machine for Data-to-Text Generation

How can we generate descriptions from structured data organized in tables? Existing approaches based on neural encoder-decoder models often lack diversity. We argue that an open set of templates is crucial for enriching phrase constructions and realizing varied generations. Learning such templates is prohibitive, since it typically requires a large paired corpus, which is seldom available. This paper explores the problem of automatically learning reusable "templates" from both paired and non-paired data. We propose the variational template machine (VTM), a novel method for generating text descriptions from data tables. Our contributions are: a) we carefully devise a model architecture and losses that explicitly disentangle template and semantic content information in the latent spaces, and b) we utilize both small parallel data and large raw text corpora without aligned tables to enrich template learning. Experiments on datasets from a variety of domains show that VTM generates more diverse outputs while maintaining fluency and quality.
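The disentanglement described in contribution (a) can be pictured as a variational autoencoder with two separate latent vectors: a template latent sampled from a prior at generation time, and a content latent derived from the table. The sketch below is purely illustrative and is not the paper's implementation; the dimensions, the stand-in table encoding, and the concatenation into a decoder input are all assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    # Reparameterization trick (Kingma & Welling): z = mu + sigma * eps,
    # which keeps sampling differentiable with respect to mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Hypothetical latent sizes; the abstract does not specify dimensions.
d_template, d_content = 4, 4

# Template latent z: drawn from a standard-normal prior at generation
# time, so resampling z varies the surface form of the output.
z = reparameterize(np.zeros(d_template), np.zeros(d_template))

# Content latent c: in a VTM-style model this would be computed
# deterministically from the input table, fixing the factual content
# across different template samples. A constant vector stands in here.
c = np.ones(d_content)

# The decoder would condition on both latents, e.g. via concatenation.
decoder_input = np.concatenate([z, c])
print(decoder_input.shape)
```

Resampling only `z` while holding `c` fixed is what lets such a model produce multiple phrasings of the same table, which is the diversity behavior the abstract claims.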
