A Constrained Sequence-to-Sequence Neural Model for Sentence Simplification

Sentence simplification reduces semantic complexity to benefit people with language impairments. Previous studies at both the sentence level and the word level have achieved promising results but still face significant challenges. Sentence-level methods produce fluent output that is sometimes not actually simpler than the input, while word-level methods simplify individual words but can introduce grammatical errors, because a word and its simpler substitute may differ in usage. In this paper, we propose a two-step simplification framework that combines word-level and sentence-level simplification, exploiting the complementary advantages of each. Building on this framework, we implement a novel constrained neural generation model that simplifies a sentence given its simplified words. Results on aligned Wikipedia and Simple Wikipedia datasets show that our method outperforms a range of baselines.
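
To make the two-step framework concrete, here is a minimal sketch of the pipeline interface it implies: a word-level step substitutes simpler words and records which positions the generator must keep, and a sentence-level step rewrites around those constraints. The lexicon, function names, and placeholder generator below are hypothetical illustrations; the paper's actual second step is a constrained neural sequence-to-sequence model, which this sketch does not implement.

```python
# Hypothetical sketch of the two-step simplification pipeline described in
# the abstract. The lexicon and logic are illustrative stand-ins, not the
# paper's model.

# Toy word-level simplifier: complex word -> simpler substitute (made-up data).
SIMPLE_LEXICON = {
    "utilize": "use",
    "commence": "start",
    "terminate": "end",
}

def simplify_words(tokens):
    """Step 1: replace complex words, recording which positions are constrained."""
    out, constraints = [], set()
    for i, tok in enumerate(tokens):
        sub = SIMPLE_LEXICON.get(tok.lower())
        if sub is not None:
            out.append(sub)
            constraints.add(i)  # the generator must keep this simplified word
        else:
            out.append(tok)
    return out, constraints

def constrained_generate(tokens, constraints):
    """Step 2 (placeholder): a real system would run a seq2seq decoder that is
    constrained to emit the simplified words; here we just return the token
    sequence unchanged to show the interface."""
    assert all(0 <= i < len(tokens) for i in constraints)
    return " ".join(tokens)

sentence = "They will commence the project and utilize new tools".split()
simplified, kept = simplify_words(sentence)
print(constrained_generate(simplified, kept))
# -> "They will start the project and use new tools"
```

This decomposition reflects the division of labor the abstract describes: the word-level step guarantees that simplification actually happens, while the sentence-level step is responsible for producing a grammatical sentence that incorporates the simplified words.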
