Controlling Pre-trained Language Models for Grade-Specific Text Simplification

Text simplification (TS) systems rewrite text to make it more readable while preserving its content. However, what makes a text easy to read depends on the intended readers. Recent work has shown that pre-trained language models can simplify text using a wealth of techniques to control output simplicity, ranging from specifying only the desired reading grade level to directly specifying low-level edit operations. Yet it remains unclear how to set these control parameters in practice. Existing approaches set them at the corpus level, disregarding the complexity of individual inputs and considering only one level of output complexity. In this work, we conduct an empirical study of how different control mechanisms affect the adequacy and simplicity of the outputs of text simplification systems. Based on these insights, we introduce a simple method that predicts the edit operations required to simplify a text for a specific grade level on an instance-by-instance basis. This approach improves the quality of the simplified outputs over corpus-level search-based heuristics.
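The core idea of per-instance control can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the feature set, the toy linear rules standing in for a learned predictor, and the control-token format (in the spirit of ACCESS/MUSS-style tokens such as character-length ratio and Levenshtein similarity) are all assumptions made for the example.

```python
# Hypothetical sketch of instance-level control prediction for
# grade-specific simplification. Feature names, weights, and the
# control-token format are illustrative, not the paper's actual method.

def extract_features(sentence: str) -> dict:
    """Cheap surface features of the input sentence."""
    words = sentence.split()
    return {
        "n_words": len(words),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
    }

def predict_controls(sentence: str, target_grade: int) -> dict:
    """Predict control values for THIS input and target grade.

    Toy linear rules stand in for a learned per-instance predictor:
    lower target grades call for shorter outputs (NbChars ratio)
    and heavier editing (lower Levenshtein similarity to the source).
    """
    feats = extract_features(sentence)
    char_ratio = max(0.2, min(1.0, 0.3 + 0.06 * target_grade
                              - 0.005 * feats["n_words"]))
    lev_sim = max(0.2, min(1.0, 0.4 + 0.05 * target_grade))
    return {"NbChars": round(char_ratio, 2), "LevSim": round(lev_sim, 2)}

def format_input(sentence: str, target_grade: int) -> str:
    """Prepend predicted control tokens, as a seq2seq model would see them."""
    controls = predict_controls(sentence, target_grade)
    prefix = " ".join(f"<{k}_{v}>" for k, v in sorted(controls.items()))
    return f"{prefix} {sentence}"

print(format_input("The committee deliberated at considerable length.", 4))
```

The conditioned input would then be fed to a pre-trained simplification model; the key contrast with corpus-level tuning is that the control values vary with each source sentence rather than being fixed once for the whole test set.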
