Few-shot NLG with Pre-trained Language Model

Neural end-to-end approaches to natural language generation (NLG) from structured data or knowledge are data-hungry, which makes them difficult to adopt in real-world applications where training data is limited. In this work, we propose the new task of \textit{few-shot natural language generation}. Motivated by how humans summarize tabular data, we propose a simple yet effective approach and show that it not only achieves strong performance but also generalizes well across domains. The design of the model architecture is based on two aspects: content selection from the input data, and language modeling to compose coherent sentences, the latter of which can be acquired from prior knowledge in a pre-trained language model. We show that, with just 200 training examples across multiple domains, our approach achieves strong results and outperforms the strongest baseline by an average of more than 8.0 BLEU points. Our code and data can be found at \url{this https URL}.
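The two-aspect design can be illustrated with a minimal sketch: a pre-trained language model supplies fluent language modeling, while a small learned gate mixes in a copy distribution over the linearized input table for content selection. The sketch below assumes a GPT-2 backbone from the `transformers` library; the class name `FewShotTableNLG`, the scalar copy gate, and the attention-based copy distribution are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel


class FewShotTableNLG(nn.Module):
    """Pre-trained LM for fluency + a copy gate for content selection (sketch)."""

    def __init__(self, model_name: str = "gpt2"):
        super().__init__()
        self.lm = GPT2LMHeadModel.from_pretrained(model_name)
        hidden = self.lm.config.n_embd
        # Per-step scalar gate: copy a token from the linearized table
        # (content selection) vs. generate from the LM vocabulary (fluency).
        self.copy_gate = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask, table_token_mask):
        # input_ids: (B, T) linearized table followed by the target text;
        # table_token_mask: (B, T), 1 on positions belonging to the table.
        out = self.lm(input_ids=input_ids,
                      attention_mask=attention_mask,
                      output_hidden_states=True)
        h = out.hidden_states[-1]                   # (B, T, H)
        gen_probs = out.logits.softmax(dim=-1)      # LM distribution, (B, T, V)
        p_copy = torch.sigmoid(self.copy_gate(h))   # (B, T, 1)

        # Copy distribution: each step attends back to table positions only.
        scores = h @ h.transpose(1, 2)              # (B, T, T)
        scores = scores.masked_fill(table_token_mask.unsqueeze(1) == 0, -1e9)
        copy_attn = scores.softmax(dim=-1)          # weights over table tokens
        copy_probs = torch.zeros_like(gen_probs).scatter_add(
            -1, input_ids.unsqueeze(1).expand_as(copy_attn), copy_attn)

        # Mixture: copying grounds the output in the input table,
        # generation supplies fluent connective language.
        return p_copy * copy_probs + (1.0 - p_copy) * gen_probs
```

Under this division of labor, only the small gate is learned from scratch while fluency is inherited from pre-training, which is consistent with the abstract's claim that as few as 200 examples suffice for fine-tuning.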
