Hierarchical Neural Story Generation

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise and then transforms it into a passage of text. We gain further improvements from a novel form of model fusion that improves the relevance of the story to the prompt, and from a new gated multi-scale self-attention mechanism that models long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach over those from a strong non-hierarchical model by a factor of two to one.
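
The gated multi-scale self-attention mechanism is only named above, not specified. As a rough illustration, the sketch below implements one plausible reading in PyTorch: causal self-attention computed at several temporal scales, with keys and values average-pooled per scale and GLU gates applied to the projected inputs. The `scales=(1, 2, 4)` setting, the pooling operator, and the gate placement are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMultiScaleSelfAttention(nn.Module):
    """Causal self-attention over the sequence at several temporal scales
    (illustrative sketch, not the paper's exact configuration)."""

    def __init__(self, dim, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # Each projection is twice its target width so a GLU can gate it back down.
        self.q_proj = nn.Linear(dim, 2 * dim)
        self.kv_projs = nn.ModuleList(nn.Linear(dim, 4 * dim) for _ in scales)
        self.out_proj = nn.Linear(dim * len(scales), dim)

    def forward(self, x):
        # x: (batch, time, dim)
        _, t, d = x.shape
        pos = torch.arange(t, device=x.device)
        q = F.glu(self.q_proj(x), dim=-1)  # gated queries: (batch, time, dim)
        heads = []
        for scale, kv_proj in zip(self.scales, self.kv_projs):
            # Left-pad so every pooled window ends on a real timestep, then
            # average-pool the inputs down by `scale` before projecting k, v.
            # (The pad contributes zeros to the first window's average.)
            xs = F.avg_pool1d(F.pad(x.transpose(1, 2), (scale - 1, 0)),
                              kernel_size=scale, stride=scale).transpose(1, 2)
            k, v = F.glu(kv_proj(xs), dim=-1).chunk(2, dim=-1)
            scores = torch.einsum('btd,bsd->bts', q, k) / d ** 0.5
            # Pooled window s summarizes timesteps ending at s * scale; mask
            # windows that end in the future to keep the attention causal.
            ends = torch.arange(k.size(1), device=x.device) * scale
            scores = scores.masked_fill(
                ends[None, None, :] > pos[None, :, None], float('-inf'))
            heads.append(torch.einsum('bts,bsd->btd', scores.softmax(-1), v))
        return self.out_proj(torch.cat(heads, dim=-1))
```

For example, `GatedMultiScaleSelfAttention(dim=512)` applied to a `(batch, time, 512)` tensor returns a tensor of the same shape, with each query position attending only to pooled windows that end at or before it; coarser scales give each position a cheap summary of distant context while the `scale=1` head preserves token-level detail.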
