MOVER: Mask, Over-generate and Rank for Hyperbole Generation

Despite being a common figure of speech, hyperbole is under-researched, with only a few studies addressing its identification. In this paper, we introduce the new task of hyperbole generation: transferring a literal sentence into its hyperbolic paraphrase. To address the scarcity of available hyperbolic sentences, we construct HYPO-XL, the first large-scale hyperbole corpus, containing 17,862 hyperbolic sentences collected in a non-trivial way. Based on this corpus, we propose an unsupervised method for hyperbole generation that requires no parallel literal-hyperbole pairs. During training, we fine-tune BART to infill masked hyperbolic spans of sentences from HYPO-XL. During inference, we mask part of an input literal sentence and over-generate multiple candidate hyperbolic versions. A BERT-based ranker then selects the best candidate by hyperbolicity and paraphrase quality. Human evaluation shows that our model generates high-quality hyperbolic paraphrases and outperforms several baseline systems.
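The pipeline described above (mask, over-generate with BART infilling, rank with BERT-based scorers) can be illustrated with a minimal sketch. The checkpoint paths, the span-masking heuristic, the use of a sentence encoder as the paraphrase-quality score, and the unweighted combination of the two scores are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of the mask -> over-generate -> rank pipeline, assuming
# hypothetical fine-tuned checkpoints and a simple score combination.
import torch
from transformers import (
    BartTokenizer, BartForConditionalGeneration,
    AutoTokenizer, AutoModelForSequenceClassification,
)
from sentence_transformers import SentenceTransformer, util

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) BART fine-tuned to infill masked hyperbolic spans (path is hypothetical).
bart_tok = BartTokenizer.from_pretrained("facebook/bart-large")
bart = BartForConditionalGeneration.from_pretrained("path/to/mover-bart").to(device)

# 2) BERT-based hyperbolicity classifier (hypothetical fine-tuned checkpoint).
clf_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
clf = AutoModelForSequenceClassification.from_pretrained("path/to/hypo-classifier").to(device)

# 3) Sentence encoder used here as a stand-in for the paraphrase-quality score.
sbert = SentenceTransformer("all-MiniLM-L6-v2")

def over_generate(literal: str, span: str, n: int = 10) -> list[str]:
    """Mask one span of the literal sentence and sample n candidate infillings."""
    masked = literal.replace(span, "<mask>", 1)
    inputs = bart_tok(masked, return_tensors="pt").to(device)
    outputs = bart.generate(
        **inputs, do_sample=True, top_p=0.9,
        num_return_sequences=n, max_length=64,
    )
    return [bart_tok.decode(o, skip_special_tokens=True) for o in outputs]

def rank(literal: str, candidates: list[str]) -> str:
    """Score candidates by hyperbolicity and semantic similarity; return the best."""
    src_emb = sbert.encode(literal, convert_to_tensor=True)
    best, best_score = None, float("-inf")
    for cand in candidates:
        enc = clf_tok(cand, return_tensors="pt", truncation=True).to(device)
        with torch.no_grad():
            hyper = torch.softmax(clf(**enc).logits, dim=-1)[0, 1].item()
        sim = util.cos_sim(src_emb, sbert.encode(cand, convert_to_tensor=True)).item()
        score = hyper + sim  # unweighted combination (an assumption, not the paper's)
        if score > best_score:
            best, best_score = cand, score
    return best
```

In use, one would call `over_generate` on a literal sentence with a chosen span to mask, then pass the candidates to `rank`; the over-generation step deliberately produces more candidates than needed so the ranker can trade off hyperbolicity against meaning preservation.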
