On the Helpfulness of Document Context to Sentence Simplification

Most research on text simplification is currently limited to the sentence level. In this paper, we are the first to investigate the helpfulness of document context for sentence simplification and to incorporate it into a sequence-to-sequence model. We first construct a sentence simplification dataset in which the context for each original sentence is drawn from the Wikipedia corpus. The new dataset contains approximately 116K sentence pairs with context. We then propose a new model that makes full use of this context information: it uses neural networks to learn the different effects of the preceding and following sentences on the current sentence, and applies them within an improved Transformer model. Evaluated on the newly constructed dataset, our model achieves a SARI score of 36.52, outperforming the best baseline model by 2.46 (7.22%) and indicating that context indeed helps improve sentence simplification. In an ablation experiment, we show that using either the preceding sentences or the following sentences alone as context still significantly improves simplification.
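To make the idea concrete, below is a minimal PyTorch sketch of one way a Transformer encoder layer could weight preceding and following context differently, as the abstract describes. This is an illustrative assumption, not the paper's published architecture: the class name ContextGatedEncoderLayer, the separate prev_attn/next_attn attention modules, and the learned gate are all hypothetical choices made for this sketch.

```python
import torch
import torch.nn as nn


class ContextGatedEncoderLayer(nn.Module):
    """Hypothetical sketch of a context-aware Transformer encoder layer.

    The abstract only states that preceding and following sentences are
    given different learned effects; the specific mechanism here (two
    separate cross-attentions plus a learned gate) is an assumption.
    """

    def __init__(self, d_model: int = 512, nhead: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        # Separate attention modules let the model treat preceding and
        # following context asymmetrically.
        self.prev_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.next_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        # A learned gate mixes the two context signals per token.
        self.gate = nn.Linear(3 * d_model, 2)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x, prev_ctx, next_ctx):
        # Standard self-attention over the current sentence.
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        # Attend separately to the preceding and following sentences.
        p, _ = self.prev_attn(x, prev_ctx, prev_ctx)
        n, _ = self.next_attn(x, next_ctx, next_ctx)
        # The gate decides, per token, how much each direction contributes.
        g = torch.softmax(self.gate(torch.cat([x, p, n], dim=-1)), dim=-1)
        ctx = g[..., :1] * p + g[..., 1:] * n
        return self.norm2(x + ctx + self.ffn(x + ctx))


# Usage sketch: prev_ctx/next_ctx would come from encoding the neighboring
# sentences (e.g., with a shared sentence encoder).
layer = ContextGatedEncoderLayer()
x = torch.randn(2, 20, 512)         # current-sentence token states
prev_ctx = torch.randn(2, 40, 512)  # encoded preceding sentences
next_ctx = torch.randn(2, 40, 512)  # encoded following sentences
out = layer(x, prev_ctx, next_ctx)  # shape (2, 20, 512)
```

One note on the reported metric: SARI scores a simplification against both the source sentence and reference simplifications, rewarding n-grams that are correctly added, kept, and deleted, which is why it is preferred over BLEU for this task.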
