Consistency and Coherency Enhanced Story Generation

Story generation is a challenging task that requires maintaining the consistency of plots and characters throughout a story. Previous work has shown that GPT-2, a large-scale language model, achieves good performance on story generation. However, we observe that the stories generated by GPT-2 still suffer from serious issues, which fall into two categories: consistency and coherency. In terms of consistency, GPT-2 cannot explicitly guarantee the consistency of the plots on the one hand, and the generated stories usually contain coreference errors on the other. In terms of coherency, GPT-2 does not directly take into account the discourse relations between the sentences of a story. To enhance the consistency and coherency of the generated stories, we propose a two-stage generation framework: the first stage organizes a story outline that depicts the plots and events, and the second stage expands the outline into a complete story. Plot consistency can therefore be controlled and guaranteed explicitly. In addition, coreference supervision signals are incorporated to reduce coreference errors and improve coreference consistency. Moreover, we design an auxiliary task of discourse relation modeling to improve the coherency of the generated stories. Experimental results on a story dataset show that our model outperforms baseline approaches in terms of both automatic metrics and human evaluation.
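
A minimal sketch of the two-stage pipeline follows, using the off-the-shelf HuggingFace `transformers` GPT-2 API. The prompt formats, the `generate` helper, the example title, and the decoding settings are all illustrative assumptions, not the authors' exact setup; the paper's framework presumably fine-tunes models for the outline and expansion stages rather than prompting a stock GPT-2.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def generate(prompt: str, max_new_tokens: int) -> str:
    """Sample a continuation of `prompt` with nucleus sampling."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Stage 1: generate a short outline (plot keywords and events) from the title.
title = "The Lost Key"
outline = generate(f"Title: {title}\nOutline:", max_new_tokens=40)

# Stage 2: expand the outline into a full story. Because the story is
# conditioned on an explicit outline, the plot can be inspected and
# controlled before the second stage runs.
story = generate(f"{outline}\nStory:", max_new_tokens=200)
print(story)
```

The coreference supervision and discourse relation modeling described above would enter at training time as auxiliary losses alongside the language-modeling objective, e.g. L_total = L_LM + a * L_coref + b * L_disc with tunable weights a and b; the exact formulation is not specified in the abstract.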
