Recursive Context-Aware Lexical Simplification

This paper presents REC-LS, a novel architecture for recursive, context-aware lexical simplification that (1) uses the wider context both when detecting words in need of simplification and when suggesting alternatives, and (2) takes previous simplification steps into account. We show that the system produces lexical simplifications that are grammatically correct and semantically appropriate, and that it outperforms current state-of-the-art lexical simplification systems.
