TopNet: Learning from Neural Topic Model to Generate Long Stories

Long story generation (LSG) is a long-standing goal in natural language processing. Unlike most text generation tasks, LSG requires producing a long, content-rich story from a much shorter text input, and it therefore often suffers from information sparsity. In this paper, we propose TopNet to alleviate this problem by leveraging recent advances in neural topic modeling to obtain high-quality skeleton words that complement the short input. In particular, instead of directly generating a story, we first learn to map the short text input to a low-dimensional topic distribution (pre-assigned by a topic model). Given this latent topic distribution, we then use the reconstruction decoder of the topic model to sample a sequence of inter-related words as a skeleton for the story. Experiments on two benchmark datasets show that the proposed framework is highly effective at skeleton-word selection and significantly outperforms state-of-the-art models in both automatic and human evaluation.
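To make the two-step idea concrete, below is a minimal sketch of how a neural-topic-model decoder could yield skeleton words. It assumes an NVDM-style topic model (variational encoder over a bag-of-words input, a single linear reconstruction decoder over the vocabulary) and uses top-k selection as a stand-in for the sampling procedure the abstract describes. All names (NeuralTopicModel, skeleton_words, the layer sizes) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class NeuralTopicModel(nn.Module):
    """NVDM-style topic model sketch: bag-of-words -> latent topics -> words."""

    def __init__(self, vocab_size: int, n_topics: int = 50, hidden: int = 256):
        super().__init__()
        # Encoder: bag-of-words counts -> Gaussian parameters of the latent code.
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, n_topics)
        self.logvar = nn.Linear(hidden, n_topics)
        # Reconstruction decoder: topic vector -> logits over the vocabulary.
        self.decoder = nn.Linear(n_topics, vocab_size)

    def encode(self, bow: torch.Tensor) -> torch.Tensor:
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick (Kingma & Welling, 2013).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Normalize into a latent topic distribution.
        return torch.softmax(z, dim=-1)

    def skeleton_words(self, bow: torch.Tensor, k: int = 8) -> torch.Tensor:
        # Decode the topic distribution into word logits, then keep the k most
        # probable words as inter-related skeleton words for the story.
        theta = self.encode(bow)
        word_logits = self.decoder(theta)
        return torch.topk(word_logits, k, dim=-1).indices
```

In the framework described above, a separate network would be trained to map the short text input to this latent topic distribution; the reconstruction decoder then supplies the skeleton words on which the story generator is conditioned.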
