Hierarchical Neural Story Generation

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise and then transforms it into a passage of text. We gain further improvements from a novel form of model fusion that improves the relevance of the story to the prompt, and from a new gated multi-scale self-attention mechanism that models long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach over those from a strong non-hierarchical model by a factor of two to one.
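
The gated multi-scale self-attention mechanism is only named above, not specified. As a rough illustration, the sketch below implements one plausible reading in PyTorch: causal self-attention computed at several temporal scales, with keys and values average-pooled per scale and GLU gates applied to the projected inputs. The `scales=(1, 2, 4)` setting, the pooling operator, and the gate placement are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMultiScaleSelfAttention(nn.Module):
    """Causal self-attention over the sequence at several temporal scales
    (illustrative sketch, not the paper's exact configuration)."""

    def __init__(self, dim, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # Each projection is twice its target width so a GLU can gate it back down.
        self.q_proj = nn.Linear(dim, 2 * dim)
        self.kv_projs = nn.ModuleList(nn.Linear(dim, 4 * dim) for _ in scales)
        self.out_proj = nn.Linear(dim * len(scales), dim)

    def forward(self, x):
        # x: (batch, time, dim)
        _, t, d = x.shape
        pos = torch.arange(t, device=x.device)
        q = F.glu(self.q_proj(x), dim=-1)  # gated queries: (batch, time, dim)
        heads = []
        for scale, kv_proj in zip(self.scales, self.kv_projs):
            # Left-pad so every pooled window ends on a real timestep, then
            # average-pool the inputs down by `scale` before projecting k, v.
            # (The pad contributes zeros to the first window's average.)
            xs = F.avg_pool1d(F.pad(x.transpose(1, 2), (scale - 1, 0)),
                              kernel_size=scale, stride=scale).transpose(1, 2)
            k, v = F.glu(kv_proj(xs), dim=-1).chunk(2, dim=-1)
            scores = torch.einsum('btd,bsd->bts', q, k) / d ** 0.5
            # Pooled window s summarizes timesteps ending at s * scale; mask
            # windows that end in the future to keep the attention causal.
            ends = torch.arange(k.size(1), device=x.device) * scale
            scores = scores.masked_fill(
                ends[None, None, :] > pos[None, :, None], float('-inf'))
            heads.append(torch.einsum('bts,bsd->btd', scores.softmax(-1), v))
        return self.out_proj(torch.cat(heads, dim=-1))
```

For example, `GatedMultiScaleSelfAttention(dim=512)` applied to a `(batch, time, 512)` tensor returns a tensor of the same shape, with each query position attending only to pooled windows that end at or before it; coarser scales give each position a cheap summary of distant context while the `scale=1` head preserves token-level detail.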
