MOVER: Mask, Over-generate and Rank for Hyperbole Generation

Despite being a common figure of speech, hyperbole is under-researched, with only a few studies addressing its identification. In this paper, we introduce the new task of hyperbole generation: transferring a literal sentence into its hyperbolic paraphrase. To address the scarcity of available hyperbolic sentences, we construct HYPO-XL, the first large-scale hyperbole corpus, containing 17,862 hyperbolic sentences collected in a non-trivial way. Based on this corpus, we propose an unsupervised method for hyperbole generation that requires no parallel literal-hyperbole pairs. During training, we fine-tune BART to infill masked hyperbolic spans of sentences from HYPO-XL. During inference, we mask part of an input literal sentence and over-generate multiple candidate hyperbolic versions. A BERT-based ranker then selects the best candidate by hyperbolicity and paraphrase quality. Human evaluation shows that our model generates high-quality hyperbolic paraphrases and outperforms several baseline systems.
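The pipeline described above (mask, over-generate with BART infilling, rank with BERT-based scorers) can be illustrated with a minimal sketch. The checkpoint paths, the span-masking heuristic, the use of a sentence encoder as the paraphrase-quality score, and the unweighted combination of the two scores are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of the mask -> over-generate -> rank pipeline, assuming
# hypothetical fine-tuned checkpoints and a simple score combination.
import torch
from transformers import (
    BartTokenizer, BartForConditionalGeneration,
    AutoTokenizer, AutoModelForSequenceClassification,
)
from sentence_transformers import SentenceTransformer, util

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) BART fine-tuned to infill masked hyperbolic spans (path is hypothetical).
bart_tok = BartTokenizer.from_pretrained("facebook/bart-large")
bart = BartForConditionalGeneration.from_pretrained("path/to/mover-bart").to(device)

# 2) BERT-based hyperbolicity classifier (hypothetical fine-tuned checkpoint).
clf_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
clf = AutoModelForSequenceClassification.from_pretrained("path/to/hypo-classifier").to(device)

# 3) Sentence encoder used here as a stand-in for the paraphrase-quality score.
sbert = SentenceTransformer("all-MiniLM-L6-v2")

def over_generate(literal: str, span: str, n: int = 10) -> list[str]:
    """Mask one span of the literal sentence and sample n candidate infillings."""
    masked = literal.replace(span, "<mask>", 1)
    inputs = bart_tok(masked, return_tensors="pt").to(device)
    outputs = bart.generate(
        **inputs, do_sample=True, top_p=0.9,
        num_return_sequences=n, max_length=64,
    )
    return [bart_tok.decode(o, skip_special_tokens=True) for o in outputs]

def rank(literal: str, candidates: list[str]) -> str:
    """Score candidates by hyperbolicity and semantic similarity; return the best."""
    src_emb = sbert.encode(literal, convert_to_tensor=True)
    best, best_score = None, float("-inf")
    for cand in candidates:
        enc = clf_tok(cand, return_tensors="pt", truncation=True).to(device)
        with torch.no_grad():
            hyper = torch.softmax(clf(**enc).logits, dim=-1)[0, 1].item()
        sim = util.cos_sim(src_emb, sbert.encode(cand, convert_to_tensor=True)).item()
        score = hyper + sim  # unweighted combination (an assumption, not the paper's)
        if score > best_score:
            best, best_score = cand, score
    return best
```

In use, one would call `over_generate` on a literal sentence with a chosen span to mask, then pass the candidates to `rank`; the over-generation step deliberately produces more candidates than needed so the ranker can trade off hyperbolicity against meaning preservation.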
