Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence

The neural attention model has achieved great success in data-to-text generation tasks. Though it usually excels at producing fluent text, it suffers from the problems of missing information, repetition, and "hallucination". Due to the black-box nature of the neural attention architecture, avoiding these problems in a systematic way is non-trivial. To address this concern, we propose to explicitly segment the target text into fragment units and align each fragment with its corresponding data record. The segmentation and correspondence are jointly learned as latent variables, without any human annotation. We further impose a soft statistical constraint to regularize the segmental granularity. The resulting architecture maintains the same expressive power as neural attention models, while generating fully interpretable outputs at several times lower computational cost. On both the E2E and WebNLG benchmarks, we show that the proposed model consistently outperforms its neural attention counterparts.
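For intuition, the marginalization the abstract alludes to can be sketched as a semi-Markov-style dynamic program: sum over all ways of cutting the target into segments and aligning each segment to a data record. The sketch below is illustrative only, not the paper's exact formulation; the tensor layout, the uniform prior over records, and the independence of segments given their aligned records are all simplifying assumptions made here for brevity.

```python
import torch

def segmental_marginal(seg_logp: torch.Tensor, max_len: int) -> torch.Tensor:
    """Log-marginal likelihood over all segmentations and record alignments.

    seg_logp[j, t, k] = log p(target tokens y[j:t] form a single segment
    generated from data record k).  Shape (T + 1, T + 1, K) for a target of
    T tokens and K records; only entries with 0 < t - j <= max_len are read.
    This layout is an assumption for illustration, not the paper's actual
    parameterization.
    """
    T = seg_logp.size(0) - 1
    alpha = seg_logp.new_full((T + 1,), float("-inf"))
    alpha[0] = 0.0  # log-probability of the empty prefix
    for t in range(1, T + 1):
        starts = range(max(0, t - max_len), t)
        # For each candidate segment y[j:t], marginalize over the record it
        # aligns to (uniform record prior assumed), then over the start j.
        scores = torch.stack([
            alpha[j] + torch.logsumexp(seg_logp[j, t], dim=-1) for j in starts
        ])
        alpha[t] = torch.logsumexp(scores, dim=0)
    return alpha[T]  # log p(y | data), summed over latent structure

# Toy usage with random segment scores (T = 5 tokens, K = 3 records).
seg_logp = torch.randn(6, 6, 3)
print(segmental_marginal(seg_logp, max_len=4))
```

Because every operation above is differentiable, maximizing this log-marginal with a gradient-based optimizer trains the segment scorer end to end, without supervising the latent cuts or alignments; the paper's soft granularity constraint can then be read as an additional penalty on the posterior distribution over segment lengths, discouraging degenerate single-token or whole-sentence segmentations.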
