Generating texts under constraint through discriminator-guided MCTS

Large pre-trained language models (LMs) based on Transformers can generate very plausible long texts. In this paper, we explore how this generation can be further controlled to satisfy certain constraints (e.g., being non-toxic, conveying positive or negative polarity, expressing certain emotions, etc.) without fine-tuning the LM. Specifically, we formalize constrained generation as a tree exploration process guided by a discriminator that scores how well the associated sequence respects the constraint. Guiding generation with a discriminator, rather than fine-tuning the LM, is easier and cheaper to train and also allows the constraint to be applied more finely and dynamically. We propose several original methods to search this generation tree, notably Monte Carlo Tree Search (MCTS), which provides theoretical guarantees on search efficiency, as well as simpler methods based on re-ranking a pool of diverse sequences using the discriminator scores. We evaluate these methods on two types of constraints and two languages: review polarity and emotion control, in French and English. We show that MCTS achieves state-of-the-art results in constrained generation, without having to tune the language model, on both tasks and in both languages. We also demonstrate that our other proposed methods based on re-ranking can be very effective when diversity among the generated propositions is encouraged.
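To make the discriminator-guided tree search concrete, the sketch below shows one MCTS decoding step in minimal, self-contained Python. It is only an illustration of the general idea summarized above, not the paper's implementation: `lm_next_token_probs`, `discriminator_score`, the toy vocabulary, and the hyperparameter values are hypothetical placeholders standing in for a real pre-trained LM and constraint discriminator.

```python
# Illustrative sketch of discriminator-guided MCTS decoding (not the paper's code).
import math
import random

VOCAB = ["good", "bad", "great", "awful", "movie", "<eos>"]

def lm_next_token_probs(seq):
    # Placeholder LM: uniform next-token distribution over a toy vocabulary.
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}

def discriminator_score(seq):
    # Placeholder discriminator: probability that `seq` satisfies the
    # constraint (here, positive review polarity).
    positives = sum(tok in ("good", "great") for tok in seq)
    return positives / max(len(seq), 1)

class Node:
    def __init__(self, seq, prior):
        self.seq = seq
        self.prior = prior          # LM probability of the last token
        self.children = {}          # token -> Node
        self.visits = 0
        self.value = 0.0            # sum of discriminator rewards

def puct(parent, child, c_puct=1.0):
    # PUCT selection rule: exploit the mean discriminator reward,
    # explore in proportion to the LM prior.
    q = child.value / child.visits if child.visits else 0.0
    u = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return q + u

def mcts_decode_step(prefix, n_simulations=50, rollout_len=5):
    root = Node(list(prefix), prior=1.0)
    for _ in range(n_simulations):
        # Selection: descend to a leaf following the PUCT rule.
        node, path = root, [root]
        while node.children:
            node = max(node.children.values(), key=lambda ch: puct(path[-1], ch))
            path.append(node)
        # Expansion: one child per token, weighted by the LM prior.
        for tok, p in lm_next_token_probs(node.seq).items():
            node.children[tok] = Node(node.seq + [tok], prior=p)
        # Rollout: sample a short LM continuation, then score it with the
        # discriminator; this score is the reward for the whole path.
        rollout = list(node.seq)
        for _ in range(rollout_len):
            probs = lm_next_token_probs(rollout)
            rollout.append(random.choices(list(probs), weights=list(probs.values()))[0])
        reward = discriminator_score(rollout)
        # Backpropagation: update visit counts and values along the path.
        for visited in path:
            visited.visits += 1
            visited.value += reward
    # Commit to the most visited child of the root and emit its token.
    best_token, _ = max(root.children.items(), key=lambda kv: kv[1].visits)
    return list(prefix) + [best_token]

print(mcts_decode_step(["this", "movie", "is"]))
```

At each generation step the search commits to the most visited child of the root, so the discriminator's reward, propagated back through the tree, steers which LM continuation is kept without ever modifying the LM's weights.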
