Toward Diverse Precondition Generation

A typical goal for language understanding is to logically connect the events of a discourse, but connecting events are often left unstated because of their commonsense nature. To address this gap, we focus on generating precondition events. Precondition generation can be framed as a sequence-to-sequence problem: given a target event, generate a possible precondition. However, in most real-world scenarios an event can have several preconditions, a setting that standard seq2seq frameworks do not handle well. We propose DiP, the Diverse Precondition generation system, which can generate unique and diverse preconditions. DiP consists of three stages of the generative process: an event sampler, a candidate generator, and a post-processor. The event sampler provides control codes (precondition triggers) that the candidate generator uses to focus its generation. Post-processing further improves the results through re-ranking and filtering. Unlike other conditional generation systems, DiP generates control codes automatically, without training on diverse examples. Analysis reveals that DiP improves the diversity of preconditions significantly compared to a beam search baseline, and manual evaluation shows that DiP generates more preconditions than a strong nucleus sampling baseline.
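To make the three-stage pipeline concrete, below is a minimal, hypothetical sketch of how a DiP-style system could be wired together. This is not the authors' implementation: the fixed trigger list standing in for the learned event sampler, the prompt format, the off-the-shelf GPT-2 backbone, and the length-based re-ranking in the post-processor are all illustrative assumptions.

```python
# Hypothetical sketch of a DiP-style three-stage pipeline (not the authors' code).
# Stage 1: an "event sampler" proposes precondition triggers (control codes).
# Stage 2: a "candidate generator" (here, off-the-shelf GPT-2 via HuggingFace
#          Transformers) generates precondition candidates for each trigger.
# Stage 3: a "post-processor" filters duplicates and re-ranks the candidates.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()


def sample_triggers(target_event, k=3):
    """Stage 1 (placeholder): return k precondition trigger verbs.

    In DiP this is a learned sampler; a fixed list is used here for illustration.
    """
    candidate_triggers = ["buy", "enter", "learn", "open", "travel"]
    return candidate_triggers[:k]


def generate_candidates(target_event, trigger, n=3):
    """Stage 2: condition generation on the target event and a trigger control code."""
    prompt = f"Event: {target_event} Precondition ({trigger}):"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            do_sample=True,
            top_p=0.9,
            max_new_tokens=20,
            num_return_sequences=n,
            pad_token_id=tokenizer.eos_token_id,
        )
    prompt_len = inputs["input_ids"].shape[1]
    return [
        tokenizer.decode(out[prompt_len:], skip_special_tokens=True).strip()
        for out in outputs
    ]


def post_process(candidates):
    """Stage 3: drop empty or duplicate candidates and re-rank.

    Length is used as a stand-in for a learned re-ranking score.
    """
    unique = list(dict.fromkeys(c for c in candidates if c))
    return sorted(unique, key=len)


if __name__ == "__main__":
    event = "She drank the coffee."
    all_candidates = []
    for trig in sample_triggers(event):
        all_candidates.extend(generate_candidates(event, trig))
    for precondition in post_process(all_candidates):
        print(precondition)
```

The point of the sketch is the control flow: each trigger steers a separate round of candidate generation, so diversity comes from varying the control code rather than from the decoder alone, and the post-processor only prunes and orders what the generator produced.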
