A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

Story generation, namely generating a reasonable story from a leading context, is an important but challenging task. Despite their success in modeling fluency and local coherence, existing neural language generation models (e.g., GPT-2) still suffer from repetition, logical conflicts, and a lack of long-range coherence in generated stories. We conjecture that this is because of the difficulty of associating relevant commonsense knowledge, understanding causal relationships, and planning entities and events in the proper temporal order. In this paper, we devise a knowledge-enhanced pretraining model for commonsense story generation. We propose to utilize commonsense knowledge from external knowledge bases to generate reasonable stories. To further capture the causal and temporal dependencies between the sentences of a reasonable story, we employ multi-task learning, which combines a discriminative objective to distinguish true and fake stories during fine-tuning. Automatic and manual evaluation shows that our model can generate more reasonable stories than state-of-the-art baselines, particularly in terms of logic and global coherence.
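
To make the two ideas in the abstract more concrete, the sketch below illustrates (a) turning a commonsense triple from an external knowledge base such as ConceptNet into a natural-language sentence for knowledge post-training, and (b) a multi-task fine-tuning loss that adds a true-vs-fake story classifier on top of the standard language-modeling objective. This is a minimal PyTorch/Transformers sketch under assumed details (the templates, the linear classifier head, the last-token pooling, and the loss weight `alpha` are illustrative choices), not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the two components described above.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# (a) Hypothetical templates for converting knowledge-base triples into
# plain sentences, which can then be used for LM post-training of GPT-2.
def triple_to_sentence(head, relation, tail):
    templates = {
        "IsA": "{h} is a {t}.",
        "Causes": "{h} causes {t}.",
        "xIntent": "{h}. The intention was {t}.",
    }
    return templates.get(relation, "{h} {r} {t}.").format(h=head, r=relation, t=tail)

# (b) Assumed linear head for the auxiliary true/fake story classifier.
clf_head = nn.Linear(model.config.n_embd, 2)

def multitask_loss(story_ids, attention_mask, is_true_story, alpha=1.0):
    """Combined objective: causal LM loss on true stories plus a discriminative
    loss separating true stories from fake ones (e.g., stories with shuffled,
    repeated, or replaced sentences).

    story_ids:      (batch, seq_len) token ids
    attention_mask: (batch, seq_len) long tensor, 1 for real tokens, 0 for padding
    is_true_story:  (batch,) long tensor, 1 = human-written, 0 = auto-constructed fake
    alpha:          weight of the classification term (assumed hyperparameter)
    """
    outputs = model(
        input_ids=story_ids,
        attention_mask=attention_mask,
        output_hidden_states=True,
    )

    # Language-modeling loss, restricted to true stories.
    shift_logits = outputs.logits[:, :-1, :]
    shift_labels = story_ids[:, 1:]
    shift_mask = attention_mask[:, 1:].float()
    token_loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        reduction="none",
    ).view(shift_labels.size())
    per_story_lm = (token_loss * shift_mask).sum(dim=1) / shift_mask.sum(dim=1)
    true_mask = is_true_story.float()
    lm_loss = (per_story_lm * true_mask).sum() / true_mask.sum().clamp(min=1)

    # Discriminative true-vs-fake loss on every story, pooled from the
    # hidden state of the final non-padding token.
    last_hidden = outputs.hidden_states[-1]            # (batch, seq_len, n_embd)
    last_pos = attention_mask.sum(dim=1) - 1           # index of final real token
    pooled = last_hidden[torch.arange(story_ids.size(0)), last_pos]
    clf_loss = F.cross_entropy(clf_head(pooled), is_true_story)

    return lm_loss + alpha * clf_loss
```

In this sketch the generation objective is applied only to human-written stories, while the classifier sees both true and perturbed stories; the relative weight of the two terms is left as a tunable assumption.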
