Controllable Generation from Pre-trained Language Models via Inverse Prompting

Large-scale pre-trained language models have demonstrated a strong ability to generate realistic text. However, it remains challenging to control what they generate. Previous approaches such as prompting are far from sufficient, and this lack of controllability limits the practical use of language models. To tackle the challenge, we propose an innovative method, inverse prompting, to better control text generation. The core idea of inverse prompting is to use the generated text to inversely predict the prompt during beam search, which enhances the relevance between the prompt and the generated text and thus improves controllability. Empirically, we pre-train a large-scale Chinese language model and conduct a systematic study with human evaluation on the tasks of open-domain poem generation and open-domain long-form question answering. The results demonstrate that our proposed method substantially outperforms the baselines and that our generation quality approaches human performance on some of the tasks.
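To make the re-ranking idea concrete, the Python sketch below shows one plausible way to fold an inverse score, log p(prompt | generated text), into a beam-search step alongside the usual forward score. The helper callables candidates_per_beam and log_prob, the inverse_weight parameter, and the function name are illustrative assumptions, not the paper's actual implementation.

# Minimal sketch of one beam-search step re-ranked with inverse prompting.
# `log_prob(context, target)` is a hypothetical callable returning the
# log-likelihood of `target` given `context` under a language model; it is
# an assumption, not the paper's actual API.

from typing import Callable, List, Tuple

def inverse_prompting_beam_step(
    prompt: str,
    beams: List[Tuple[str, float]],  # (generated text so far, forward score)
    candidates_per_beam: Callable[[str], List[Tuple[str, float]]],
    log_prob: Callable[[str, str], float],
    beam_width: int = 5,
    inverse_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Expand each beam, then re-rank candidates by the forward score plus
    a weighted inverse score log p(prompt | generated text)."""
    scored = []
    for text, forward_score in beams:
        for continuation, cont_score in candidates_per_beam(prompt + text):
            new_text = text + continuation
            # Inverse prompting: how well does the generated text predict the prompt?
            inverse_score = log_prob(new_text, prompt)
            total = forward_score + cont_score + inverse_weight * inverse_score
            scored.append((new_text, total))
    # Keep only the top `beam_width` candidates under the combined score.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:beam_width]

In this sketch, a larger inverse_weight steers the search toward continuations from which the original prompt is easy to recover, which is the intuition behind the improved prompt relevance described in the abstract.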
