Controllable Sentence Simplification with a Unified Text-to-Text Transfer Transformer

Recently, a large pre-trained language model called T5 (a Unified Text-to-Text Transfer Transformer) has achieved state-of-the-art performance on many NLP tasks. However, to our knowledge, no prior study has applied this pre-trained model to Text Simplification. In this paper, we therefore explore fine-tuning T5 for Text Simplification, combined with a controllable mechanism that regulates the system outputs and can help generate text adapted to different target audiences. Our experiments show that our model achieves strong results, with gains of between +0.69 and +1.41 over the current state of the art (BART+ACCESS). We argue that using a pre-trained model such as T5, trained on several tasks with large amounts of data, can help improve Text Simplification.
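The abstract does not spell out how the controllable mechanism works. One common approach in controllable simplification (the one used by ACCESS) is to prepend control tokens to the source sentence, so that the fine-tuned model learns to condition its output on them. The sketch below illustrates that general idea only. The token format `<LEN_…>`, the task prefix `simplify: `, and both function names are hypothetical and are not taken from the paper.

```python
def length_ratio(source: str, target: str) -> float:
    """Character-length ratio target/source, rounded to the nearest 0.05.

    Hypothetical helper: during training, a ratio like this could be
    computed from each (complex, simple) pair and encoded as a token.
    """
    if not source:
        raise ValueError("source must be non-empty")
    ratio = len(target) / len(source)
    return round(ratio * 20) / 20


def add_control_token(source: str, ratio: float,
                      task_prefix: str = "simplify: ") -> str:
    """Build a text-to-text input with a length-control token prepended.

    The prefix and the <LEN_x.xx> token format are assumptions made
    for illustration, not the paper's actual format.
    """
    return f"{task_prefix}<LEN_{ratio:.2f}> {source}"


if __name__ == "__main__":
    complex_sentence = "The feline reposed upon the rug."
    simple_sentence = "The cat sat on the rug."
    r = length_ratio(complex_sentence, simple_sentence)
    print(add_control_token(complex_sentence, r))
```

At inference time, the ratio would be chosen by the user rather than computed from a reference, which is what would let the same model produce shorter or longer outputs for different audiences.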