Neural Text Generation from Structured Data with Application to the Biography Domain

This paper introduces a neural model for concept-to-text generation that scales to large, rich domains. We experiment with a new dataset of biographies from Wikipedia that is an order of magnitude larger than existing resources, comprising over 700k samples. The dataset is also vastly more diverse, with a 400k-word vocabulary, compared to a few hundred words for Weathergov or Robocup. Our model builds upon recent work on conditional neural language models for text generation. To deal with the large vocabulary, we extend these models to mix a fixed vocabulary with copy actions that transfer sample-specific words from the input database to the generated output sentence. Our neural model significantly outperforms a classical Kneser-Ney language model adapted to this task by nearly 15 BLEU.
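The copy-action idea above can be illustrated with a minimal sketch: the model scores both fixed-vocabulary words and copy actions over the input table's fields, then takes one softmax over the concatenated scores, so rare sample-specific words (e.g. a person's name) remain reachable even when absent from the vocabulary. All names and the scoring setup below are illustrative assumptions, not the paper's actual implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def next_word(vocab, vocab_scores, table, copy_scores):
    """Pick the next output token from a joint distribution over
    fixed-vocabulary words and copy actions on input table fields."""
    # One softmax covers both action types: indices < len(vocab)
    # emit a vocabulary word, the rest copy a table field's value.
    probs = softmax(vocab_scores + copy_scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if best < len(vocab):
        return vocab[best]           # emit a fixed-vocabulary word
    field, value = table[best - len(vocab)]
    return value                     # copy the field value verbatim

# Toy example: the rare name "Ginsberg" is absent from the vocabulary
# but reachable via a copy action on the hypothetical (name, value) field.
vocab = ["<unk>", "was", "a", "poet"]
table = [("name", "Ginsberg"), ("occupation", "poet")]
word = next_word(vocab, [0.1, 0.2, 0.1, 0.3], table, [2.5, 0.4])
# word == "Ginsberg": the copy action scores highest here.
```

In the paper's setting the scores would come from the conditional neural language model's output layer; the sketch only shows how a single softmax can span both word emission and copying.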
