Unsupervised Text Style Transfer with Masked Language Models

We propose Masker, an unsupervised text-editing method for style transfer. To tackle cases where no parallel source-target pairs are available, we train masked language models (MLMs) for both the source and the target domain. We then find the text spans where the two models disagree the most in terms of likelihood, which allows us to identify the source tokens to delete in order to transform the source text to match the style of the target domain. The deleted tokens are replaced using the target MLM, and by using a padded MLM variant, we avoid having to predetermine the number of inserted tokens. Our experiments on sentence fusion and sentiment transfer demonstrate that Masker performs competitively in a fully unsupervised setting. Moreover, in low-resource settings, it improves supervised methods' accuracy by over 10 percentage points when pre-training them on silver training data generated by Masker.
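
As a rough illustration of this pipeline (not the authors' implementation), the Python sketch below uses two Hugging Face BERT masked language models: it scores candidate spans by pseudo-log-likelihood under a source-domain and a target-domain MLM, deletes the span where the two models disagree most, and fills the gap greedily with the target MLM. The checkpoint paths, the span-length limit, and the fixed number of insertion slots are placeholder assumptions; the paper's padded MLM variant would instead be trained to predict padding tokens for unneeded slots.

    # Minimal sketch of the Masker idea: two domain MLMs score candidate spans,
    # the most disagreed-upon span is deleted, and a target-domain MLM fills
    # the gap. Checkpoint paths below are placeholders, not real models.
    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    src_mlm = BertForMaskedLM.from_pretrained("path/to/source-domain-mlm")  # assumption
    tgt_mlm = BertForMaskedLM.from_pretrained("path/to/target-domain-mlm")  # assumption
    src_mlm.eval(); tgt_mlm.eval()

    @torch.no_grad()
    def span_log_likelihood(model, token_ids, start, end):
        # Pseudo-log-likelihood of tokens[start:end]: mask each position in turn
        # and sum the log-probabilities the MLM assigns to the true tokens.
        total = 0.0
        for i in range(start, end):
            masked = token_ids.clone()
            masked[0, i] = tokenizer.mask_token_id
            log_probs = torch.log_softmax(model(masked).logits[0, i], dim=-1)
            total += log_probs[token_ids[0, i]].item()
        return total

    @torch.no_grad()
    def most_disagreed_span(text, max_span_len=3):
        # Return (token_ids, (start, end)) for the span on which the source MLM
        # is much more confident than the target MLM, i.e. the tokens to delete.
        ids = tokenizer(text, return_tensors="pt").input_ids
        n = ids.size(1)
        best, best_gap = None, float("-inf")
        for start in range(1, n - 1):                       # skip [CLS] / [SEP]
            for end in range(start + 1, min(start + max_span_len, n - 1) + 1):
                gap = (span_log_likelihood(src_mlm, ids, start, end)
                       - span_log_likelihood(tgt_mlm, ids, start, end))
                if gap > best_gap:
                    best, best_gap = (start, end), gap
        return ids, best

    @torch.no_grad()
    def fill_with_target_mlm(ids, span, num_slots=4):
        # Replace the span with num_slots [MASK] tokens and fill them with a
        # single greedy pass of the target MLM (each slot predicted independently).
        # A padded-MLM variant would emit padding for unneeded slots instead.
        start, end = span
        masks = torch.full((1, num_slots), tokenizer.mask_token_id, dtype=ids.dtype)
        new_ids = torch.cat([ids[:, :start], masks, ids[:, end:]], dim=1)
        logits = tgt_mlm(new_ids).logits
        for i in range(start, start + num_slots):
            new_ids[0, i] = logits[0, i].argmax()
        return tokenizer.decode(new_ids[0], skip_special_tokens=True)

A real implementation would refine the greedy fill (e.g. iterative re-masking or beam search over slots) and restrict candidate spans to contiguous n-grams below a length budget, but the sketch captures the delete-then-infill structure described above.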
