Tailor: A Soft-Prompt-Based Approach to Attribute-Based Controlled Text Generation

Attribute-based Controlled Text Generation (CTG) refers to generating sentences that satisfy desirable attributes (e.g., emotions and topics). Existing work usually relies on fine-tuning or extra attribute classifiers, and thus suffers from increased storage and inference time. To address these concerns, we explore attribute-based CTG in a parameter-efficient manner. In short, the proposed Tailor represents each attribute as a pre-trained continuous vector (i.e., a single-attribute prompt), which guides the generation of a fixed pre-trained language model (PLM) to satisfy a pre-specified attribute. These prompts can simply be concatenated for multi-attribute CTG without any re-training. However, naive concatenation can degrade fluency and is sensitive to prompt order. To address this, Tailor provides two solutions to enhance the combination. The first introduces a multi-attribute prompt mask and a re-indexed position sequence to bridge the gap between training (one single-attribute prompt per task) and testing (concatenating two prompts). The second introduces a trainable prompt connector to further strengthen the combination. Experiments demonstrate that, requiring only 0.08% extra training parameters relative to GPT-2, Tailor achieves effective and general improvements on eleven attribute-specific generation tasks.
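To make the idea concrete, the sketch below (a minimal illustration, not the authors' released code) shows the basic mechanism with GPT-2 via HuggingFace Transformers: each attribute corresponds to a trainable soft-prompt matrix prepended to the frozen LM's input embeddings, two single-attribute prompts are concatenated for multi-attribute generation, and position ids are re-indexed so that each prompt sees the same positions it saw during single-prompt training. The attribute names, prompt length, and helper function are illustrative assumptions.

```python
# Minimal sketch of soft-prompt-based CTG with a frozen GPT-2 (assumptions noted above).
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
for p in model.parameters():          # the PLM stays fixed; only prompts would be trained
    p.requires_grad = False

PROMPT_LEN, EMB = 20, model.config.n_embd
# one trainable continuous prompt per attribute (hypothetical attributes)
positive_prompt = nn.Parameter(torch.randn(PROMPT_LEN, EMB) * 0.02)
sports_prompt = nn.Parameter(torch.randn(PROMPT_LEN, EMB) * 0.02)

def build_inputs(text, prompts):
    """Prepend one or more soft prompts to the token embeddings of `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids            # (1, T)
    tok_emb = model.get_input_embeddings()(ids)                     # (1, T, EMB)
    prompt_emb = torch.cat([p.unsqueeze(0) for p in prompts], dim=1)  # (1, n*P, EMB)
    inputs_embeds = torch.cat([prompt_emb, tok_emb], dim=1)
    # Re-indexed positions: every prompt restarts at 0 (as in single-prompt
    # training), and the text continues as if only one prompt preceded it.
    position_ids = torch.cat(
        [torch.arange(PROMPT_LEN) for _ in prompts]
        + [torch.arange(PROMPT_LEN, PROMPT_LEN + ids.size(1))]
    ).unsqueeze(0)
    return inputs_embeds, position_ids

# Multi-attribute CTG: simply concatenate two single-attribute prompts.
embeds, position_ids = build_inputs("The match last night",
                                    [positive_prompt, sports_prompt])
out = model(inputs_embeds=embeds, position_ids=position_ids)
```

In this sketch the re-indexed position ids stand in for the paper's non-training combination strategy; the trainable prompt connector would add a further learned segment between the two prompts.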
